Closed bobbinth closed 2 years ago
Should standard library live in assembly crate or its own crate? I'm inclined to think that in its own crate (e.g., stdlib).
I agree with you that standard library should be placed somewhere, which explicitly defines the intention
What extensions should we use for assembly files? I'm thinking
.ma
(Miden assembly) - but open to other suggestions.
I was thinking about .masm
which clearly shows it's assembly file, but I'm okay to live with .ma
too :+1:
Should we have something similar to use statements in Rust? Or should standard library procedures be always referred to by fully qualified names? e.g., exec.crypto:hashes:blake3 vs use std:crypto:hashes:blake3 and then in the code exec.blake3.
I think we need something like this otherwise soon it'll be too much of cognitive load while just adding/ fixing these import paths. Also to handle name conflicts we can bring in some ideas from Rust, so that we can do name aliasing.
Would it build one giant string of all the code in the library and parse it in one go?
I think we should not go this way, instead we can do it in hierarchical fashion. Meaning say in some random Miden Assembly code
import std:crypto:hashes:blake3
proc.set_consts
...
end
begin
exec.set_consts
exec.blake3
...
end
when parser works on this input source string, it should note which standard library modules are being used and then it can only bring those portions ( say simply inlines it ) of standard library which are required for using exec.blake3
. For that we have to define one hierarchical module structure which is known to parser and when parser prepares final source string for compilation, it should only have those portions ( say nodes of dependency graph ) of standard library which are required for compiling current assembly file.
I think this is decrease the amount of work that parser and its helper components need to do when source processing.
I think we need something like this otherwise soon it'll be too much of cognitive load while just adding/ fixing these import paths. Also to handle name conflicts we can bring in some ideas from Rust, so that we can do name aliasing.
Agreed!
I think we should not go this way, instead we can do it in hierarchical fashion.
Agreed here as well. We need to figure out what's the simplest way to support such a structure and how flexible we want to make it. For example, do we want to support something like this as well (in addition to your example):
import std:crypto:hashes:*
proc.set_consts
...
end
begin
exec.set_consts
exec.blake3
...
end
I agree with the approach that’s been discussed here so far. Just to throw in my 2 cents, I also prefer .masm
as suggested by @itzmeanjan (although both are fine).
.masm
for extension it is!
A few more thoughts from my side:
I'd prefer we use the same general format for import statements as we do for all other instructions: period-separated keywords. Also, my slight preference is that we use use
rather than import
but I'm fine with ether. So, an import statement would look like:
use.std:crypto:hashes:blake3
Also, I'm thinking that for simplicity of parsing, we require that all import statements are grouped at the top of a file. What do you think?
Next, I'm wondering if we should limit imports to modules (and not to procedures within modules). Basically, in the above example blake3
would be a separate .masm
file. The semantics of this would be that the contents of a referenced module will be parsed in its entirety - and we won't need to do anything more sophisticated than that.
We could also allow importing multiple modules in the same statement (similar to how Rust does it). For example:
use.std:crypto:hashes:{blake3,sha256}
Though, the above might be a bit confusing given that we can't have spaces in the list of imports (at least with our current strategy).
Next, we probably need to have some kind of visibility modifiers. The simplest form is to have public and private procedures within a module (each module would have at least one public procedure). Not sure what would be the best way to label public procedures yet. Another alternative is to have a single public procedure per module. It's a bit limiting, but it might simplify a few things.
Lastly, I wondering if we should support "inline imports". Something like:
being
exec.std:crypto:hashes:blake3
end
My initial thinking is that this adds too much complexity. Although, if we go with single public procedure per module, this should be fairly straightforward.
One other thing, should we allow aliasing? For example, something like this:
use.std:crypto:hashes:blake3.as.blake
My main thoughts here are:
my slight preference is that we use use rather than import but I'm fine with ether.
I agree with use
.
I'm thinking that for simplicity of parsing, we require that all import statements are grouped at the top of a file.
I've never felt much benefit from allowing imports farther down in the file, so I prefer to keep them at the top too. Similarly, I would skip supporting inline imports like the example below for an initial implementation of imports:
being
exec.std:crypto:hashes:blake3
end
I think this is less ergonomic than use.std:crypto:hashes:blake3
and exec.blake3
and doesn't add enough value for the amount of complexity to justify be included in an initial implementation.
I'd prefer we use the same general format for import statements as we do for all other instructions: period-separated keywords. use.std:crypto:hashes:{blake3,sha256} use.std:crypto:hashes:blake3.as.blake
I think these both feel a bit strange. I know sticking to period-separated keywords makes sense for keeping the parsers simple and consistent, but I'd probably be inclined not to support these options initially and to revisit them both in a future iteration, at which point it might be worth considering using spaces in the import statements instead of period separators.
use.std:crypto:hashes:blake3
So to summarize - I think we should initially stick to having a series of simple use statements like this one at the top of the file. I would probably also use a double colon in the path use.std::crypto::hashes::blake3
, but that's my personal preference to minimize semantic differences between languages & maximize familiarity, especially if we're taking a lead from Rust (or thinking about doing so in the future).
The simplest form is to have public and private procedures within a module (each module would have at least one public procedure)
I don't think we would need more than this, so I think going in this direction makes sense.
Another alternative is to have a single public procedure per module. It's a bit limiting, but it might simplify a few things.
This might be a good starting place. It'll be simpler for now, but if we do it right, we should be able to relax the restriction in the future fairly cleanly to get to public/private procedures as mentioned above if we want to.
Agree with pretty much everything above. To sum up:
Library modules will be individual file with .masm
extension. Each module may export multiple procedures, and may have private (un-exported) procedures. For example, let's say we have a module for handling u256 arithmetic. It could be located in stadlib/math/u256.masm
file and could look like so:
proc.some_helper
<instructions>
end
export.add
<instructions>
end
export.mul
<instructions>
end
To use procedures from a module in a script we'd need to do the following:
use.std::math::u256
begin
<instructions>
u256::add
<instructions>
end
For now, we don't allow module aliasing and combining multiple imports into a single line. We also won't allow circular module imports. Mean, if module A uses module B, module B cannot import module A.
One open questions: should we allow "root" modules?
For example, should it be possible to have a module at std::math
. My initial thinking is: probably not. But I haven't thought through the limitations this would entail.
Closed by #121
Given that there are already things on the horizon which can go into standard library (e.g., #77), I think it might be a good time to think about how to organize standard library.
I'm inclined to keep things as simple as possible (at least at this point) - so, a few assumptions:
The way I'm currently thinking about is that the assembler will load the full standard library at instantiation time. For this, we'll need to build a procedure map (similar to procedure map for local procedures) in the assembler constructor. Then, when a script is being parsed, the assembler can look up a given procedure in this map and inline it as it does with local procedures.
A few preliminary questions to consider:
assembly
crate or its own crate? I'm inclined to think that in its own crate (e.g.,stdlib
)..ma
(Miden assembly) - but open to other suggestions.use
statements in Rust? Or should standard library procedures be always referred to by fully qualified names? e.g.,exec.crypto:hashes:blake3
vsuse std:crypto:hashes:blake3
and then in the codeexec.blake3
.