SVM Code Reuse between Templates
Overview
This SMIP specifies the required changes in SVM to support code reuse between Templates.
It describes an experimental idea that might be put to the test.
That being said, it's advised to implement the Storage Layout Section Index, detailed later in this document, as soon as possible.
Even if this SMIP isn't implemented at the Wasm level, most of its ideas can be implemented in higher-level layers such as the SVM SDK.
The Storage Layout Section Index is low-hanging fruit and will make the current SVM codebase forward compatible with implementing this SMIP (or something similar within the SVM SDK).
The underlying assumption is that the Wasm code we deal with has no polymorphism.
In other words, usage of the call_indirect opcode is forbidden here.
Fixed-Gas Wasm satisfies this restriction, but the solution covered in this document could also apply to non-Fixed-Gas Wasm programs as long as the above rule is enforced.
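The no-polymorphism rule could be checked before any relocation is attempted. The following is a minimal sketch, not SVM code: it assumes a function body has already been decoded into a flat list of (opcode, operand) tuples, and the helper name uses_call_indirect is hypothetical.

```python
# Illustrative only: a function body is modeled as a list of
# (opcode, operand) tuples; real code would decode the Wasm binary.

def uses_call_indirect(body):
    """Return True if any instruction in the body is `call_indirect`."""
    return any(opcode == "call_indirect" for opcode, _operand in body)


direct_only = [("local.get", 0), ("call", 3), ("return", None)]
polymorphic = [("local.get", 0), ("call_indirect", 1)]

print(uses_call_indirect(direct_only))
print(uses_call_indirect(polymorphic))
```

A Template whose reused functions fail this check would simply be rejected for reuse under the assumption stated above.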
Goals and Motivation
The SVM Templates were introduced to avoid deploying the same code when one would like to launch a new Account of an application. The main reasons for introducing Templates were for code reuse and saving on Storage (which is an expensive resource).
This SMIP attempts to take it one step further and proposes a mechanism for reusing specific functions between different Templates. The motivation is to reuse popular pieces of code that would otherwise have to be reimplemented each time a different Template requires them.
Another angle is composability - given "black boxes" we know to work well, we'd like to reuse them (see the "Open-Closed" Principle).
This SMIP stays on the Wasm level of a Template. Other ideas revolving around specific Template Compilers (such as the SVM SDK in Rust) could probably spring to mind - but these are out of scope for this document.
For example, say we have a Template denoted as Template A in its Wasm form. The Template contains Wasm Functions f1_wasm, f2_wasm, and f3_wasm.
While working on a new Template implementation named Template B (let's settle on using Rust for coding it, but it doesn't matter), we have programmed (in Rust) functions f4_rust and f5_rust, and now we're about to start coding f6_rust.
The thing is that the f6_rust logic is already implemented somewhere else, but not in Rust (and even if it has been coded in Rust, we don't have the source code).
The f3_wasm function seems to have the same logic we'd like to have for our f6_rust. As things stand, there's nothing we can do about it. That's where this SMIP comes into play - we'd like to be able to reuse f3_wasm as our f6_rust (at a high level).
The idea is very similar to using a Linker, and the proposed solution resembles Static Linking.
In other words, we'll reuse the f3_wasm code, but it'll exist twice (once for each Template) as opposed to once (which would make it more like Dynamic Linking).
Good real-life candidates for reuse are functions related to Signature Schemes.
Say we have a Template with a verify method implementing 2-3 MultiSig. This Signature Scheme could probably be reused in many other Templates.
If we could take the verify code in its Wasm format and somehow inject it into other Templates, we would avoid reimplementing the 2-3 MultiSig scheme.
High-level design
Naming:
Template Origin - the Template from which we would like to reuse part of its code in other newly written Templates.
Template New - the Template we're currently coding and would like to have some of its code taken from another Template Origin.
The implementation consists of three phases:
Create a Temporary Stub
Storage Layouts Relocation
Functions Indexes Relocation
Create a Temporary Stub
First, we need to decide what functions of Template Origin we'd like to reuse within Template New.
For simplicity, let's assume there is only one function we'd like to reuse, denoted as f1_origin, and that f1_origin doesn't call any other Wasm functions (it could call imported functions, though).
We'll create a corresponding Wasm function; let's name it f1_new under Template New.
The function signature of f1_new will be the same as that of f1_origin, and its body will only return a placeholder value so that the Wasm stays valid (for example, if the function returns i32, the body could consist of an opcode pushing zero).
The compiler (SVM SDK or similar) will be in charge of emitting these Stubs.
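As a rough illustration of what emitting a Stub could look like, the sketch below builds the Wasm text format (WAT) for a function with a given signature and a placeholder body. The helper make_stub and its parameters are hypothetical; an actual compiler such as the SVM SDK would more likely emit the binary form directly.

```python
def make_stub(name, params, result=None):
    """Build WAT text for a function with the given signature and a placeholder body.

    `params` is a list of Wasm value types, `result` is a single value type or None.
    """
    parts = [f"(func ${name}"]
    parts += [f"(param {p})" for p in params]
    if result is not None:
        parts.append(f"(result {result})")
        # A zero constant of the result type keeps the body valid.
        parts.append(f"({result}.const 0)")
    return " ".join(parts) + ")"


print(make_stub("f1_new", ["i32", "i64"], "i32"))
# (func $f1_new (param i32) (param i64) (result i32) (i32.const 0))
```

The placeholder body is later replaced by the relocated body of f1_origin, so its value never matters at run time.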
Storage Layouts Relocation
So now we have a function named f1_new with the same signature as f1_origin and an empty body (besides returning something to make the Wasm code valid). If each Wasm function were completely stateless, then we'd be done at this point.
Unfortunately, that's not enough!
A Wasm function of a Template might interact with the running Account's Storage. For example, it could read from or write to its Storage Variables. If f1_origin contains a read operation such as svm_get32(5), it doesn't imply that running it intact in f1_new will work as expected.
A reminder: The svm_get32(5) asks to read the storage variable indexed 5.
To have f1_new working properly, we'll have to extend the Storage specification of Template New.
Say that Template Origin had a single Storage Layout Section containing 10 variables and that Template New also had one Storage Layout Section with 20 variables defined. The adjusted Layout of Template New will now contain two Storage Layout Sections: the old one with the 20 variables and a new one, taken from Template Origin, with the 10 variables.
For simplicity, the new Storage Layout Section will be positioned second. The old code of Template New will continue working the same, and we'll need to relocate each Storage call in the code reused from Template Origin (the code we clone out of f1_origin).
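The layout extension described above can be sketched as follows. This is an assumption-laden model: each Storage Layout Section is reduced to its variable count, and merge_layouts is a hypothetical helper, not an SVM API.

```python
def merge_layouts(new_sections, origin_sections):
    """Append Origin's sections after New's.

    Returns the merged layout and the offset by which Origin's
    section indexes are shifted inside Template New.
    """
    offset = len(new_sections)
    return new_sections + origin_sections, offset


new_layout = [20]      # Template New: one section with 20 variables
origin_layout = [10]   # Template Origin: one section with 10 variables

merged, offset = merge_layouts(new_layout, origin_layout)
print(merged, offset)  # [20, 10] 1
```

Appending (rather than prepending) the sections of Template Origin is what lets the existing code of Template New keep its section indexes unchanged.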
Right now each Storage-related host function is of the form svm_getXXX(var_id) or svm_setXXX(var_id, val). The notion of different Sections isn't reflected in the current design.
Things get more complicated when each Template has multiple Storage Sections - and not only a single one.
It seems that the most straightforward tactic for doing that relocation will be by introducing another dimension - the Section Index.
The SMIP proposes to attach a Section Index alongside each variable. So svm_getXXX(var_id) becomes svm_getXXX(var_id, section_idx) and svm_setXXX(var_id, val) becomes svm_setXXX(var_id, val, section_idx).
This new layer of indirection adds more flexibility since we could now have multiple variables with the same index, each associated with a different Section Index. We turn the variable id from a Global unique identifier to a local one within each Storage Section.
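To make the local-identifier idea concrete, here is a minimal in-memory model in which a variable is addressed by the pair (section_idx, var_id). The class SectionedStorage and its method names are hypothetical, and returning 0 for an unset variable is an assumption of this sketch rather than documented SVM behavior.

```python
class SectionedStorage:
    """Toy storage where a variable is identified by (section_idx, var_id)."""

    def __init__(self):
        self._vars = {}

    def set32(self, var_id, val, section_idx):
        # Keep only the low 32 bits, mimicking a 32-bit variable.
        self._vars[(section_idx, var_id)] = val & 0xFFFFFFFF

    def get32(self, var_id, section_idx):
        # Assumption: unset variables read as zero.
        return self._vars.get((section_idx, var_id), 0)


storage = SectionedStorage()
storage.set32(5, 111, 0)   # variable 5 of section 0
storage.set32(5, 222, 1)   # variable 5 of section 1, a different variable
print(storage.get32(5, 0), storage.get32(5, 1))  # 111 222
```

Both writes use var_id 5, yet they do not collide because the Section Index is part of the key.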
Let's now return to our example. Template New will contain two sections: the original one with 20 variables and the new one, taken from Template Origin, with 10 variables.
Each svm_getXXX call in f1_origin (the function we'd like to reuse within f1_new) will be of the pattern svm_getXXX(var_id, 0) since there is only a single section. Similarly, each svm_setXXX call in f1_origin will be of the form svm_setXXX(var_id, val, 0).
Under f1_new, each such call should become svm_getXXX(var_id, 1) or svm_setXXX(var_id, val, 1).
If Template New had 3 Storage Layout Sections, then the f1_new calls would have to be adjusted to svm_getXXX(var_id, 3) and svm_setXXX(var_id, val, 3).
In general, if Template New had N Storage Layout Sections, then each call would have to be translated:
svm_getXXX(var_id, section_idx) ⇒ svm_getXXX(var_id, section_idx + N)
svm_setXXX(var_id, val, section_idx) ⇒ svm_setXXX(var_id, val, section_idx + N)
The Storage Layout Section #0 under Template Origin will have index N under Template New.
The Storage Layout Section #1 under Template Origin will have index N + 1 under Template New
and so on...
The remaining question is how to implement the relocation in code - at the Wasm level.
It can be a bit tricky; for example, the Wasm code could, in theory, have: svm_getXXX(V, S) where V or S (or both) are not known at compile-time.
We need to be able to apply the relocation to any Wasm code.
Wasm is a Stack-Machine; each parameter is pushed onto the Stack when calling a function.
We need to detect calls to functions that interact with the Storage and then increment the last call parameter (the one standing for the Section Index). Right before the Wasm call opcode executes, the top of the Stack holds the Section Index.
The transformation we want to apply is:
Push N on top of the Stack (N is the number of Storage Layout Sections Template New had before the Sections of Template Origin were added, as explained above)
Pop the two top values of the Stack; let's denote them as a and b
Compute a + b and push that value back onto the Stack
In Wasm opcodes, it should look like this:
;; Before
call svm_get32
;; After
i32.const N ;; pushes `N` (it's a constant number)
i32.add ;; pops the two top values of the Stack, adds them, and pushes the result back
call svm_get32
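The rewrite above could be applied mechanically to a whole function body. The sketch below assumes a body decoded into (opcode, operand) tuples, refers to call targets by name for readability (real Wasm uses function indexes), and assumes the Section Index is an i32; relocate_storage_calls is a hypothetical helper.

```python
def relocate_storage_calls(body, storage_funcs, n):
    """Insert `i32.const n; i32.add` before each call to a storage host function.

    The Section Index is the last parameter, so it sits on top of the
    Stack right before the call and is the value that gets incremented.
    """
    out = []
    for opcode, operand in body:
        if opcode == "call" and operand in storage_funcs:
            out.append(("i32.const", n))
            out.append(("i32.add", None))
        out.append((opcode, operand))
    return out


body = [
    ("i32.const", 5),        # var_id
    ("local.get", 0),        # section_idx, only known at run time
    ("call", "svm_get32"),
    ("drop", None),
]
relocated = relocate_storage_calls(body, {"svm_get32", "svm_set32"}, n=1)
for instruction in relocated:
    print(instruction)
```

Because the addition happens on the Stack at run time, the rewrite works even when the Section Index is not a compile-time constant, as in the local.get above.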
Functions Indexes Relocation
Relocation of the Storage Layouts isn't the whole story.
Calling svm_get32 could look like call 0 under one Template and like call 1 under another.
The code taken from Template Origin must use the Function Indexes of Template New to play nicely there.
It can be done by scanning the Function Indexes of each Template and then swapping each call in the reused code to use the index from Template New.
The assumption here is that both Templates have the same function imports, or that the imports used by Template Origin are a subset of those of Template New.
If f1_new calls other inner functions, each one will have to be added to the Function Indexes of Template New (see Reusing multiple functions later).
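A sketch of the index swap for imported functions follows. It relies on the fact that in Wasm imported functions occupy the lowest function indexes, in import order; the helpers build_index_map and relocate_calls, the svm_log import, and the tuple-based body are assumptions for illustration. The reused body is assumed to call imports only.

```python
def build_index_map(origin_imports, new_imports):
    """Map each Origin import index to the index of the same import in New."""
    new_index = {name: idx for idx, name in enumerate(new_imports)}
    missing = [name for name in origin_imports if name not in new_index]
    if missing:
        # The subset assumption does not hold, so reuse is not possible.
        raise ValueError(f"imports missing in Template New: {missing}")
    return {idx: new_index[name] for idx, name in enumerate(origin_imports)}


def relocate_calls(body, index_map):
    """Rewrite the target of every `call` using the given index map."""
    return [
        (op, index_map[arg]) if op == "call" else (op, arg)
        for op, arg in body
    ]


origin_imports = ["svm_get32", "svm_set32"]
new_imports = ["svm_log", "svm_get32", "svm_set32"]   # svm_log is made up

index_map = build_index_map(origin_imports, new_imports)
body = [("i32.const", 5), ("i32.const", 0), ("call", 0)]
print(index_map)
print(relocate_calls(body, index_map))
```

Matching imports by name rather than by position is what makes the subset case work: Template New may import more functions, in a different order.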
Other
Global Variables
On top of the above, the Wasm code of a Template will probably have a couple of Global variables.
These variables are likely to be associated with Memory Management (pointers to the Stack and Heap).
In general - these should stay intact. So, for example, if both Template Origin and Template New have been compiled from LLVM bytecode, things will likely work as expected. If this isn't the case then the whole reuse attempt would not work.
Reusing multiple functions
We said that the code of f1_origin didn't call other Wasm functions (only imported host functions).
If f1_origin does call other Wasm functions, then we'll have to relocate these as well.
Of course, the Storage Layout Sections will have to be relocated only once.
However, we'll also have to relocate the function indexes of these functions (and have these indexes added to the Function Indexes of Template New).
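Finding every function that has to be cloned alongside f1_origin amounts to a reachability walk over the call graph. The sketch below assumes the call graph is already available as a dictionary (real code would derive it from the call opcodes of each body) and that cloned helpers are appended after the existing functions of Template New; both helper names are hypothetical.

```python
from collections import deque


def collect_callees(call_graph, root):
    """Return every Wasm function reachable from `root`, excluding `root` itself."""
    order, seen, queue = [], {root}, deque([root])
    while queue:
        func = queue.popleft()
        for callee in call_graph.get(func, []):
            if callee not in seen:
                seen.add(callee)
                order.append(callee)
                queue.append(callee)
    return order


def assign_new_indexes(helpers, first_free_index):
    """Give each cloned helper the next free function index in Template New."""
    return {func: first_free_index + i for i, func in enumerate(helpers)}


call_graph = {
    "f1_origin": ["helper_a", "helper_b"],
    "helper_a": ["helper_b"],
}
helpers = collect_callees(call_graph, "f1_origin")
print(helpers)                          # ['helper_a', 'helper_b']
print(assign_new_indexes(helpers, 7))   # {'helper_a': 7, 'helper_b': 8}
```

The root is excluded because its body replaces the Stub f1_new, which already owns a function index in Template New.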
Questions/concerns
As said under the Overview, this SMIP outlines an experimental idea, and it might be incomplete.
The primary motivation is to have the capability to reuse verify implementations across different Templates.
Dependencies and interactions
Immutable Storage
As stated at the beginning of this document, it's recommended to implement the Storage Layout Section Index ASAP - even before executing the Immutable Storage SMIP.
Even if this SMIP isn't executed - the same concepts raised could be applied on higher levels such as SVM SDK.
Stakeholders and reviewers
@noamnelke @lrettig @neysofu @avive @moshababo