Consider splitting SDK into separately publishable layers

matklad commented 3 years ago

Our current SDK is pretty hight-level and opinionated – is presupposes borsh serialization, strongly-typed wrappers for types, etc. While it is a great fit for the average use-case (or at least aspires to be one), one size doesn't fit all!

It might be useful to publish unopinionated subsets of the SDK, which just give bindings to the system APIs, but don't prescribe any particular style of contracts.

There's not huge specific benefits here, but this unlocks the ecosystem for serendipitous experimentation. If we publish low-level sdk bindings, someone from the ecosystem might decide to publish a better hight-level SDK than ours, and that can have a huge positive effect.

I can see three possible layers we might want to split out:

near-sys our [sys.rs](https://github.com/near/near-sdk-rs/blob/master/near-sdk/src/environment/sys.rs) published as a crate.

near-api thin, no_std, no_alloc, safe wrapper over near-sys. [nesdie](https://github.com/austinabell/nesdie) is roughly this layer, but I think we also should get rid of alloc here, and use the APIs like:

 pub fn register_len(register_id: u64) -> Option<u64> {
 let len = unsafe 

 { sys::register_len(register_id) } 

;
 if len == core::u64::MAX 

 { None } 

 else 

 { Some(len) } 

}

pub fn read_register(register_id: u64, buf: &mut <span class="error">[u8]</span>) -> Result<usize, ()>{
 let len = register_len(register_id).ok_or(())?;
 if buf.len() < len 

 { return Err(()); } 

 unsafe 

 { sys::read_register(register_id, buf.as_ptr() as u64) } 

;
 Ok(len)
 }

near-sdk opinionated, alloc wrapper around near-api.

This depends on specifics, but, if it is actually possible to provide zero-overhead fully safe wrapper around sys, we might merge near-sys and near-api crates into one.

matklad commented 3 years ago

cc https://github.com/near/nearcore/issues/4485 <- this is the kind of problem which might prevent us from shipping 100% unopinonated near-api layer.

austinabell commented 3 years ago

cc near/nearcore#4485 <- this is the kind of problem which might prevent us from shipping 100% unopinonated near-api layer.

Honestly, I'm still a little confused about how this issue benefits the SDK other than just being able to move the validation of the buffer length from inside the wasm contract to the actual host function implementation. Is this what you mean as unopinionated, that this validation isn't part of that layer?

In the future, the new function signature would have to come under a new method name, correct? So if this is the case we can still ship this API layer and if this is changed on the runtime then we can just update the internals without changing the API as it's just a validation within the function?

Also, seems there are host functions that are not reflected in our syscalls, is there any reason not to include all if creating this common crate? (looking at https://github.com/near/nearcore/blob/d95a1edf9f4a3f751876907f8487e6eefa6594ec/runtime/near-vm-runner/src/imports.rs#L189)

matklad commented 3 years ago

If we had

extern "C" {
  fn read_register(registed_id: u64, buf: *mut u8, buf_len: usize) -> usize;
}

we'd be able to provide safe, zero cost, no-std wrapper:

pub fn read(id: u64, buf: &mut [u8]) -> Result<usize>

The read function would do just one sys call.

With fn read_register(registed_id: u64, buf: *mut u8) -> usize;, we can't have this. We have three suboptimal choices

A no-std safe function that does an extra syscal to do bounds check

pub fn read(id: u64, buf: &mut [u8]) -> Result<usize>

A safe function which doesn't duplicate systcalls, but requires an allocation

pub fn read(id: u64) -> Result<Vec<u8>>

An unsafe function without bounds checking:

pub unsafe fn read(id: u64, but: &mut [usize]) -> Result<usize>

In the future, the new function signature would have to come under a new method name, correct? So if this is the case we can still ship this API layer and if this is changed on the runtime then we can just update the internals without changing the API as it's just a validation within the function?

Yeah, we can go with "extra syscall" version now, and optimize it later.

Also, seems there are host functions that are not reflected in our syscalls, is there any reason not to include all if creating this common crate?

It makes sense to double check for each sepcific function why it isn't included. At least one (gas) shouldn't be exposed -- its a private impl detail of the runtime, which really shouldn't have been exposed at the first place, but that ship has sailed.

austinabell commented 3 years ago

If we had
extern "C" {
  fn read_register(registed_id: u64, buf: *mut u8, buf_len: usize) -> usize;
}
we'd be able to provide safe, zero cost, no-std wrapper:
pub fn read(id: u64, buf: &mut [u8]) -> Result<usize>
The read function would do just one sys call.

With fn read_register(registed_id: u64, buf: *mut u8) -> usize;, we can't have this. We have three suboptimal choices

A no-std safe function that does an extra syscal to do bounds check
pub fn read(id: u64, buf: &mut [u8]) -> Result<usize>
In the future, the new function signature would have to come under a new method name, correct? So if this is the case we can still ship this API layer and if this is changed on the runtime then we can just update the internals without changing the API as it's just a validation within the function?

Yeah, we can go with "extra syscall" version now, and optimize it later.

Yes, I suppose I meant to just indicate that even if the syscalls are "merged" in the future to include the bounds check (len checked against register_len in wasm) with the read_register call, the function signature could be:

pub fn read_register(register_id: u64, buf: &mut [u8]) -> usize

in both cases, where the implementation would just perform more syscalls to get length and panic until this change comes in, which would mean we wouldn't have an API breaking change on this new near-api. Is this reasoning sound?

I also noticed you wrapped the result in Result<usize> even when talking about when the change has been made, is this assuming there should always be a check before calling the syscall, or the error of insufficient buffer len being indicated in the u64 return as u64::MAX or something like this?

near / near-sdk-rs

Consider splitting SDK into separately publishable layers #463