koute / polkavm

A fast and secure RISC-V based virtual machine
Apache License 2.0

How to invoke guest function from host function implementation? #114

Closed xlc closed 2 months ago

xlc commented 2 months ago

I am trying to figure out how to pass arbitrary bytes between the host and the guest. To return some bytes, we need to allocate memory in the guest program. I don't want to require the guest to preallocate a fixed-size buffer or to make it a two-stage call. So I figured I could implement an alloca function in the guest and have the host function call it to allocate memory and write into it.

But I can't figure out how to invoke a guest function from the provided Caller object. I am using Linker::func_wrap to implement host functions.

koute commented 2 months ago

Recursive function calls between the host and the guest are not supported. If the guest calls a host function then you can't call back into the guest until the host function returns.

There are many ways you could work around this issue. At the program sizes you're aiming at it doesn't make much sense to include a full blown memory allocator in the guest anyway. Some of the possibilities are:

xlc commented 2 months ago

Thanks for the answer. I will do sbrk for the PoC and spend more time on this later to figure out the best strategy.

indirection42 commented 2 months ago

Recursive function calls between the host and the guest are not supported. If the guest calls a host function then you can't call back into the guest until the host function returns.

There are many ways you could work around this issue. At the program sizes you're aiming at it doesn't make much sense to include a full blown memory allocator in the guest anyway. Some of the possibilities are:

  • Have the host write the data to the bottom of guest's stack. (Simple, but subsequent calls will overwrite this data.)
  • Have the host write the data to a pointer given by the guest and return the length written, and in the guest first give the host an address starting at the bottom of the stack, then for the subsequent calls either keep the pointer after the previously written region (if you want to keep the data) or decrement it first (to "deallocate" the previously returned data). (This is a lot more flexible.)
  • Use sbrk to allocate guest memory from the host and copy the data there. (Simple, but will leak memory.)
  • Pass a pointer to the host; if it's not null then have the host copy the data there, and if it is null then use sbrk.
  • Write the data to the guest's stack and decrement the guest's stack pointer. (Essentially alloca. This is the most complex as Rust doesn't really support this, so you'd have to write the whole guest program in assembly.)
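
The second option in the list above can be sketched in plain Rust without touching the VM at all. This is only a simulation under stated assumptions: `Guest`, `host_write`, `call_keep`, and `release` are hypothetical names invented for illustration, guest memory is modelled as a `Vec<u8>`, and nothing here is polkavm's API. The guest owns a cursor that starts at the bottom of a scratch region, hands it to the host, and advances it by the returned length to keep the data or moves it back to "deallocate".

```rust
// Illustrative stand-ins only: none of these names come from polkavm.

/// Simulated guest: a flat memory image plus a guest-owned bump cursor.
pub struct Guest {
    pub memory: Vec<u8>,
    /// Lowest address of the scratch region (the "bottom of the stack" in the thread).
    scratch_base: usize,
    /// Next free address inside the scratch region.
    cursor: usize,
}

/// Hypothetical host function: copies `data` to `ptr` if it fits within `cap`
/// bytes and returns the number of bytes written, or `None` if it does not fit.
pub fn host_write(memory: &mut [u8], ptr: usize, cap: usize, data: &[u8]) -> Option<usize> {
    if data.len() > cap {
        return None;
    }
    let end = ptr.checked_add(data.len())?;
    memory.get_mut(ptr..end)?.copy_from_slice(data);
    Some(data.len())
}

impl Guest {
    pub fn new(size: usize, scratch_base: usize) -> Self {
        assert!(scratch_base <= size);
        Guest { memory: vec![0; size], scratch_base, cursor: scratch_base }
    }

    /// Guest side of a call: give the host the current cursor, then advance
    /// the cursor past the written region so the data is kept.
    /// Returns the (address, length) of the data the host wrote.
    pub fn call_keep(&mut self, data: &[u8]) -> Option<(usize, usize)> {
        let ptr = self.cursor;
        let cap = self.memory.len() - ptr;
        let len = host_write(&mut self.memory, ptr, cap, data)?;
        self.cursor = ptr + len;
        Some((ptr, len))
    }

    /// "Deallocate" the most recent `len` bytes by moving the cursor back,
    /// never below the start of the scratch region.
    pub fn release(&mut self, len: usize) {
        self.cursor = self.cursor.saturating_sub(len).max(self.scratch_base);
    }
}
```

In this sketch the host never needs to call back into the guest: all allocation state lives in the guest's cursor, and the host only reports how many bytes it wrote.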

Really appreciate your work. I have some questions about the second, more flexible approach you mentioned. I assume the discussion applies to all cases where the host writes data to the guest's stack, including passing arguments at the guest's entry point and returning values when the guest calls a host function. My questions are as follows:

  1. Do "subsequent calls" mean calls from the guest to the host?
  2. In the guest, I was thinking that the pointer given to the host would be obtained by constructing a zeroed object on the stack. Is that right?

koute commented 2 months ago

  1. Do "subsequent calls" mean calls from the guest to the host?

Yes.

  2. In the guest, I was thinking that the pointer given to the host would be obtained by constructing a zeroed object on the stack. Is that right?

Well, yes, it can be a pointer to anywhere. To the guest's stack, to the heap, to a statically reserved piece of guest's memory, etc. And it doesn't necessarily need to be filled with zeros - the guest's memory is always deterministic, even if not explicitly initialized. (Of course you still need to properly handle uninitialized memory on Rust's side to avoid undefined behavior since Rust always assumes uninitialized memory is actually uninitialized.)
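
As a rough illustration of the last point, here is one way the guest side could hand the host an uninitialized buffer without zeroing it first. It is a sketch under assumptions: `host_fill` is a hypothetical stand-in for a host import (here it is mocked in ordinary Rust so the example runs on its own), `read_from_host` is an invented helper, and a real guest would likely be `no_std` and avoid `Vec`.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for a host import: writes at most `cap` bytes to
/// `ptr` and returns how many bytes were written. Mocked here so the
/// example is self-contained; it is not part of any real API.
///
/// # Safety
/// `ptr` must be valid for writes of `cap` bytes.
pub unsafe fn host_fill(ptr: *mut u8, cap: usize) -> usize {
    let payload = b"from host";
    let len = payload.len().min(cap);
    // SAFETY: the caller guarantees `ptr` is writable for `cap >= len` bytes.
    unsafe { std::ptr::copy_nonoverlapping(payload.as_ptr(), ptr, len) };
    len
}

/// Guest-side helper: passes an uninitialized stack buffer to the host and
/// only ever reads the prefix the host reported as written.
pub fn read_from_host() -> Vec<u8> {
    // No zeroing: the buffer stays `MaybeUninit` until the host fills it.
    let mut buf = [MaybeUninit::<u8>::uninit(); 32];
    // SAFETY: `buf` is writable for `buf.len()` bytes.
    let len = unsafe { host_fill(buf.as_mut_ptr().cast::<u8>(), buf.len()) };
    // SAFETY: the first `len` bytes were initialized by `host_fill`,
    // and `len <= buf.len()`.
    let written = unsafe { std::slice::from_raw_parts(buf.as_ptr().cast::<u8>(), len) };
    written.to_vec()
}
```

The key design choice is that the bytes beyond the returned length are never read as `u8`, so Rust's rules about uninitialized memory are respected even though the underlying guest memory is deterministic.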