polkadot-fellows / RFCs

Proposals for change to standards administered by the Fellowship.
https://polkadot-fellows.github.io/RFCs/
Creative Commons Zero v1.0 Universal
109 stars 47 forks source link

Support allocator inside of runtime #61

Closed yjhmelody closed 4 months ago

yjhmelody commented 6 months ago

rendered

Related PR https://github.com/paritytech/polkadot-sdk/pull/1658

BTW, I'm not a member of polkadot fellows, does this have any impact on RFC implementation?

yjhmelody commented 6 months ago

I mentioned it in there already, it's basically orthogonal @bkchr

tomaka commented 6 months ago

I don't see how https://github.com/polkadot-fellows/RFCs/pull/4 is orthogonal. To me, it's one or the other, as they each render the other PR pointless. The only exception is we could do #61 plus some parts of #4 (in other words not all of #4).

From a pure logical point of view, I think that https://github.com/polkadot-fellows/RFCs/pull/4 is fundamentally a better idea because it doesn't complexify the flow of execution. With what this PR proposes, the host will call a runtime function, then the runtime can call a host function, which can call a runtime function (alloc, realloc, or dealloc). If alloc, realloc or dealloc lock a mutex that was already locked before the allocation/free, then you've got a deadlock. If alloc, realloc or dealloc call a host function, then you might have a stack overflow. You can counter-argue by saying that this is just a matter of being careful and that it won't happen in practice, but I think that it's still a point in favor of https://github.com/polkadot-fellows/RFCs/pull/4.

When it comes to performances, I have no idea which one of the two is better and I don't think it's possible to determine that without implementing them and comparing.

In my opinion, if either #61 or #4 (or #61 plus some parts of #4) is significantly faster than the other, then we should go for it, but if they compare more or less equal then we should go for #4.

yjhmelody commented 6 months ago

@tomaka

From a pure logical point of view, I think that #4 is fundamentally a better idea because it doesn't complexify the flow of execution.

I do not think so, it will never complexify the flow of execution, see the draft PR, it's very clear.

If alloc, realloc or dealloc lock a mutex that was already locked before the allocation/free, then you've got a deadlock. Why there is a lock ? it's single thread. Even if there is a lock in wasm, this is a runtime logic bug, which will be encountered in runtime_apis, too. You can't prevent an exported function having a bug or falling into an infinite loop.

If alloc, realloc or dealloc call a host function, then you might have a stack overflow.

runtime_apis also wil encounter this problem.

You can counter-argue by saying that this is just a matter of being careful and that it won't happen in practice, but I think that it's still a point in favor of https://github.com/polkadot-fellows/RFCs/pull/4.

This is a matter of code practice, why don't you worry about runtime_api encountering such problems?

I don't see how https://github.com/polkadot-fellows/RFCs/pull/4 is orthogonal. To me, it's one or the other, as they each render the other PR pointless. The only exception is we could do https://github.com/polkadot-fellows/RFCs/pull/61 plus some parts of https://github.com/polkadot-fellows/RFCs/pull/4 (in other words not all of https://github.com/polkadot-fellows/RFCs/pull/4).

The reason why I think it is orthogonal is that

If you care about performance, you have to compare before deciding whether to change. If you care about ease of use/versatility, then you should probably consider whether both can be supported at the same time

The goals of this RFC are completely different from #4. I don't mind #4 moving forward. This RFC just provides more possibilities for #4 BTW.

yjhmelody commented 6 months ago

What I am most concerned about with this PR is the versatility of the runtime, which helps the substrate runtime expand. Haven't you discovered that by combining the runtime_api.reimplmentation and allow missing host functions configurations and removing the imported allocator, the substrate runtime can be a pure wasm (actually no need to call any host function). Whether it is integrating into other ecosystems or integrating other ecosystems into substrate, it is convenient.

yjhmelody commented 6 months ago

Hi @radkomih I think you will be concerned about this RFC

burdges commented 6 months ago

Around this and #4 ..

We'd ideally have the approximate goal that host calls should almost never pass owned data, only borrowed data like &'s [u8] and &'a mut [u8] buffers.

In principle, this requires all current host calls be replaced, not unlike #4 does, but realistically some hostcalls would still pass owned data if they really needed the owned data. We'd often permit -> Vec<u8> instead of passing an allocator callback for example, but we'd replace some by -> &'static [u8] host calls, and avoid -> Vec<u8> host calls in other ways.

We'd gradually gain more from this RFC as we replaced more host calls by borrowing flavors. Rust handles borrowed data nicest of course. Yet, other langauges could avoid inefficencies like call wrappers that take owned being the runtime allocator, copy it to owned data behind the system allocator, make the host call, and then allocate & copy in reverse. Instead, they'd make the hostcall directly on borrows of data managed by the runtime, either owned or borrowed.

In other words, I'd think this and changes like #4 should benefit each other.

yjhmelody commented 5 months ago

@bkchr Hi, want to know your thinking. I could continue to implement this RFC code if it's merged.

yjhmelody commented 4 months ago

This allows the verification block to use pure wasm (just replace the host functions with wasm functions), which can extend the runtime verification to a lot of scenarios

@tomaka @bkchr

koute commented 4 months ago

Just my two cents: I'm in favor of https://github.com/polkadot-fellows/RFCs/pull/4 over this RFC, and I also strongly dislike the the fact that this would require us to call back into the runtime from a host function, which is a can of worms that once opened exposes us to fun things like the possibility of a malicious program of overflowing the host's stack, etc. I think that if possible we should keep the model of "host calls don't call back into the runtime".

As far as performance is concerned, https://github.com/polkadot-fellows/RFCs/pull/4 should be as fast or faster when optimally implemented (that is, if we guarantee that every host call will only have to cross the guest-host boundary twice at worst, and only once at the best).

yjhmelody commented 4 months ago

and I also strongly dislike the the fact that this would require us to call back into the runtime from a host function, which is a can of worms that once opened exposes us to fun things like the possibility of a malicious program of overflowing the host's stack, etc.

@koute I still have the same point of view, this does not conflict with RFC #4, and all current runtime apis rely on the method you say you don't like. Therefore, exporting alloc/dealloc will not increase this risk. The risk they face is the same as that of the common runtime API.

I would like to emphasize again that the focus of this RFC is to extend the substrate runtime capabilities, not simply to improve performance. It does not prevent the implementation of RFC#4, it just provides another possibility

koute commented 4 months ago

and all current runtime apis rely on the method you say you don't like.

Huh? But they don't call back into the guest? So can you clarify what you mean here?

yjhmelody commented 4 months ago

Huh? But they don't call back into the guest? So can you clarify what you mean here?

I mean the api defined by sp_api::decl_runtime_apis!. Alloc/dealloc is not different with them except the call conversion.

koute commented 4 months ago

Huh? But they don't call back into the guest? So can you clarify what you mean here?

I mean the api defined by sp_api::decl_runtime_apis!. Alloc/dealloc is not different with them except the call conversion.

Huh? But they don't call back into the guest? So can you clarify what you mean here?

I means sp_api::decl_runtime_apis!. Alloc/dealloc is not different with them except the call conversion.

Sorry, I still don't know what you mean. (:

The way the host calls currently work is that they don't call back into the runtime. The runtime calls the host function, the host function does its thing, and the control returns back to the host. Yes, the host does call into the runtime initially so technically you have a chain of "host -> guest -> host", but that only happens once.

So for the sake of argument let's assume we implement this RFC. It will allow us to call the guest's allocator from the host. What is the use case here?

1) Allocate the initial input payload to the runtime API. But if we implement #4 then this is unnecessary. 2) Allocate the dynamically sized return values of the host calls. Ideally we do not want to do this anyway (because it will allow a recursive "host -> guest -> host -> guest -> host -> guest -> ..." call loop) and if we implement #4 then this is unnecessary (because no such host functions will exist anymore).

So I agree with @tomaka and I don't see how this RFC is orthogonal to #4; if we implement #4 then it pretty much makes this RFC pointless, unless you actually have an use case that could make use of this?

yjhmelody commented 4 months ago

Supports a pure wasm environment to verify blocks. Currently, if you want pure wasm to verify blocks, you need to replace all host functions, but alloc itself cannot be replaced. This reduces the requirements on the host environment and requires only a wasm executor, so the verification logic can be executed on many chains.

koute commented 4 months ago

Supports a pure wasm environment to verify blocks. Currently, if you want pure wasm to verify blocks, you need to replace all host functions, but alloc itself cannot be replaced. This reduces the requirements on the host environment and requires only a wasm executor, so the verification logic can be executed on many chains.

But this is also true after implementing #4 as far as I can see, because the host-side allocator will not be used at all anymore.

yjhmelody commented 4 months ago

This doesn't seem to hold true in the case of legacy host functions. But rfc #4 is indeed a good improvement

koute commented 4 months ago

This doesn't seem to hold true in the case of legacy host functions.

...which is exactly the same as this RFC, no? (:

Not that it matters, because the legacy host functions won't be used anymore for new blocks and will only be kept for sync compatibility.

yjhmelody commented 4 months ago

Maybe you're right. ext_input_size_version_1/ext_input_read_version_1 are more like a smart contract style design. But my idea is to set up stubs(panics) for all import functions, but not actually call them, like parachain style. I'm a bit convinced by you

yjhmelody commented 4 months ago

Closed since RFC #4 will be implmented.