schungx / rhai

Rhai - An embedded scripting language for Rust [dev repo, may regularly force-push, pull from https://github.com/rhaiscript/rhai for a stable build]

https://github.com/rhaiscript/rhai

Apache License 2.0


Modules #7

Closed jhwgh1968 closed 4 years ago

jhwgh1968 commented 4 years ago

Proposal: namespaces

This proposal creates a new concept in Rhai called a namespace. The basics are:

Namespaces live in a Scope as a third ScopeType. They do not conflict with variables.
Every namespace has an alias, which is how Rhai code can access its contents.
A namespace alias is created by the import statement in Rhai.
Items within a namespace are accessed with a new "namespace access" operator, ::.
Namespaces may contain only functions and constants.
What defines the contents of a namespace is not defined by this proposal. It could be modules, packages, plugins, or custom modifications by downstream library users.

This proposal is designed to be minimal, and forward compatible with future decisions about namespace contents and access.

Syntax

In order to access a namespace, it must be bound to an alias with an import statement. There are two forms:

import "My New Namespace";
import "My Other Namespace" as ns2;

The primary argument to import is a UTF-8 string. This is not restricted to the characters in an identifier to allow greater flexibility in what can be keyed to a namespace, and to match import statements in more recent versions of JavaScript. (The "as" syntax was borrowed from Python.)

The second form explicitly gives the alias after the as token. The first form generates the alias by removing all characters that are invalid in an identifier. The first example is equivalent to:

import "My New Namespace" as MyNewNamespace;
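As a rough illustration of the alias rule above, here is a minimal Rust sketch. `derive_alias` is a hypothetical helper, not part of Rhai, and it assumes that "valid in an identifier" means ASCII letters, digits and `_`, with no leading digit.

```rust
/// Derive a default alias from an import string by keeping only the
/// characters that may appear in an identifier (assumed here to be
/// ASCII letters, digits and `_`).
/// Returns `None` when nothing usable remains or when the result would
/// start with a digit. Hypothetical helper, not part of Rhai.
fn derive_alias(name: &str) -> Option<String> {
    let alias: String = name
        .chars()
        .filter(|c| c.is_ascii_alphanumeric() || *c == '_')
        .collect();
    match alias.chars().next() {
        Some(first) if !first.is_ascii_digit() => Some(alias),
        _ => None,
    }
}
```

Under these assumptions, `derive_alias("My New Namespace")` yields `MyNewNamespace`, while a name with no ASCII identifier characters yields no alias at all, so the explicit `as` form would be required for it.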

An import statement enables access to the contents of the namespace using the new "namespace access" operator, ::.

import "my namespace" as s;
let s = "abc";
let t = s[0]; // variables are not shadowed
let u = s::my_function();
let v = s::MY_CONST;

Final Thoughts

This is necessary for #3, but is turning out to be significant enough I'd like to nail down the syntax independently.

This has been talked about before, but it's not clear what decisions were made firmly about it, as many seemed tied to one particular implementation or another that was never finished.

I am also hoping, @schungx, that the code to parse the new syntax can be merged to master by itself. Even though no namespaces exist by default, it would still allow users to create them through interpreter modifications or custom scopes. I think that is a feature worth having, as it would partially help my use case for Rhai (a custom DSL) by itself.

I have the implementation about 20% done. If you approve, and can finish it faster, please do.

schungx commented 4 years ago

Namespaces live in a Scope as a third ScopeType. They do not conflict with variables.

Do you mean each namespace is a single entry inside a Scope? So essentially we can treat it like an object map, except that the fields are accessed via :: instead of .

This also means we can put import statements anywhere we can have let statements and all the shadowing etc. will simply just work.

Which also means that any access to constants within the module needs to be a lookup by name instead of a compiled offset like it is right now. Which will make namespaced variables a bit slower to access.

On the other hand, there doesn't seem to be any need to restrict namespace variables to constants; they can be updated and there shouldn't be any problem with it...

schungx commented 4 years ago

import "My New Namespace";
import "My Other Namespace" as ns2;

I'd assume that it will be much clearer if the first line imports the module in the script file "My New Namespace.rhai" (or a registered Rust module keyed to that name), but without any way to refer to it. Auto-generating a namespace name may be prone to name collisions plus the possibility that the filename contains no ASCII letters (e.g. all-Japanese or something). Therefore, the module is limited only to providing override functions that'll be available as soon as it is imported.

A path-resolution mechanism will be needed to resolve namespace names to actual files on disk, with perhaps an overriding mechanism available for no-std builds.

The second line imports the module under a name, so the user can actually refer to variables and functions defined there. And I assume there'll be some sort of export statement that can export these variables/functions, plus probably a way to "re-export" an inner namespace (i.e. namespaces imported inside another namespace). Or, we can have chained :: operators to refer to sub-namespaces.

In a perfect implementation, we'd want to have namespaces being private and needing an explicit export statement to re-export a sub-namespace for the greatest flexibility.

I'll create a new branch called namespace. Would you like to do the implementation or you want me to do it?

schungx commented 4 years ago

Branch https://github.com/schungx/rhai/tree/namespace created.

jhwgh1968 commented 4 years ago

Do you mean each namespace is a single entry inside a Scope? So essentially we can treat it like an object map, except that the fields are accessed via :: instead of .

Sort of.

The way I started implementing this was to create a third ScopeType, and then a couple of accessor methods to push and set them in a Scope. I am hoping this will make them behave the way I expect.

In particular, I imagine them following all the rules of variables, just separately from variables themselves. Example:

import "Module A" as a;
let a = a::foo(); // function in module A
let b = {
    import "Module B" as a;
    a::bar(); // function in module B
};
let c = a::baz(); // function in module A

Using the Scope type would make this behavior automatic.
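The shadowing behavior in the example can be sketched in Rust as follows. This is only an illustration under stated assumptions: `Scope`, `EntryKind`, `find` and `rewind` are made-up names that mimic the idea of a third entry type, not Rhai's actual Scope API, and the payload is a plain string.

```rust
/// The kind of entry stored in a scope (illustrative names only).
#[derive(Debug, Clone, Copy, PartialEq)]
enum EntryKind {
    Variable,
    Constant,
    Namespace,
}

/// A flat scope: entries are pushed in order and searched from the end,
/// so a later entry shadows an earlier one of the same name.
#[derive(Default)]
struct Scope {
    entries: Vec<(String, EntryKind, String)>, // (name, kind, payload)
}

impl Scope {
    fn push(&mut self, name: &str, kind: EntryKind, payload: &str) {
        self.entries.push((name.to_string(), kind, payload.to_string()));
    }

    /// Find the most recent entry with this name, looking only at
    /// namespaces or only at variables/constants. Keeping the two
    /// searches separate is what stops them from conflicting.
    fn find(&self, name: &str, want_namespace: bool) -> Option<&str> {
        self.entries
            .iter()
            .rev()
            .find(|(n, k, _)| {
                n.as_str() == name && (*k == EntryKind::Namespace) == want_namespace
            })
            .map(|(_, _, payload)| payload.as_str())
    }

    /// Current length, recorded when a block is entered.
    fn len(&self) -> usize {
        self.entries.len()
    }

    /// Drop everything pushed since `len`, as when a block ends.
    fn rewind(&mut self, len: usize) {
        self.entries.truncate(len);
    }
}
```

With this shape, an import inside a block shadows the outer alias only until the block's entries are rewound, and a variable named `a` never hides a namespace named `a`.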

Which also means that any access to constants within the module needs to be a lookup by name instead of a compiled offset like it is right now. Which will make namespaced variables a bit slower to access.

Perhaps. However, I could also imagine the engine (or the AST optimizer) doing a pass to resolve namespace members during Rhai compile time. With the way I have currently written plugins, that is possible.

On the other hand, there doesn't seem to be any need to restrict namespace variables to constants; they can be updated and there shouldn't be any problem with it...

You are correct. However, I did that on purpose to keep the interface from encouraging bad design.

I have seen programs in other languages which were procedural "balls of mud", where every function would read and write to some global state, and it became very difficult to reason about, or follow chains of events.

As far as I can tell, Rhai's scoping rules prevent that kind of design, and I rather like that. I'd rather not let namespaces be an escape hatch. In my opinion, Rust should be the keeper of all "global" state, and Rhai should take what it's given.

If you disagree, we can ease that restriction.

A path-resolution mechanism will be needed to resolve namespace names to actual files on disk

I consider that part of the "back end" that is undefined. Personally, I was planning to hard-code a couple of namespaces into the Rhai environment of my scripts, by manually editing scopes and putting in names for Rust code inside the binary.

And I assume there'll be some sort of export statement that can export these variables/functions

I also consider that an implementation detail of the "back end". I was actually not thinking of a full "module system" for Rhai, as it would not personally benefit me.

I'll create a new branch called namespace. Would you like to do the implementation or you want me to do it?

You seem much faster than me at this sort of thing, and I appreciate that. So, here is what I'll do:

I'll open a PR with what I have so far. If you think it's a good start, merge it. If you don't, close it. Then, you can write the rest.

If you want to add more than just what I have written in this proposal, I am fine with that. Just keep one thing in mind about my current Rust plugin implementation:

As noted in #3, I currently have plugins doing a runtime lookup of their functions. This is currently done with a trait I call PluginDispatcher:

/// Represents a runtime lookup for a plugin.
///
/// This trait should not be used directly. Use the `#[plugin]` attribute for modules instead.
pub trait PluginDispatcher {
    fn call(
        &self,
        fn_name: &str,
        args: Box<dyn Iterator<Item = Dynamic>>,
    ) -> Option<Result<Dynamic, EvalAltResult>>;
}

(I know this does not take into account some of our discussions. It will before I open a PR with this code in it.)

It should be easy for a plugin to say: "I want to create a namespace called X, and when one of its members is accessed, call this dispatcher function to look it up." If that is not the default mechanism, and I have to override something or implement a trait, that's fine.
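To make the request concrete, here is a small self-contained Rust sketch of a namespace whose members are looked up through a dispatcher. Everything in it is an assumption for illustration: `Value` stands in for Rhai's `Dynamic`, `Dispatcher` is a simplified analogue of `PluginDispatcher` (slice arguments, `String` errors), and `Namespaces` is a made-up registry rather than anything in the engine.

```rust
use std::collections::HashMap;

/// Stand-in for Rhai's `Dynamic`; integers only, to keep the sketch small.
type Value = i64;

/// Simplified analogue of the `PluginDispatcher` trait shown above.
trait Dispatcher {
    /// `None` means "no such function"; `Some(Err(..))` means the call failed.
    fn call(&self, fn_name: &str, args: &[Value]) -> Option<Result<Value, String>>;
}

/// A hypothetical plugin exposing two functions by name.
struct MathPlugin;

impl Dispatcher for MathPlugin {
    fn call(&self, fn_name: &str, args: &[Value]) -> Option<Result<Value, String>> {
        match (fn_name, args) {
            ("double", [x]) => Some(Ok(x * 2)),
            ("add", [x, y]) => Some(Ok(x + y)),
            ("double", _) | ("add", _) => {
                Some(Err(format!("wrong number of arguments for {}", fn_name)))
            }
            _ => None,
        }
    }
}

/// Namespaces keyed by alias; `alias::name(args)` is routed to the dispatcher.
#[derive(Default)]
struct Namespaces {
    by_alias: HashMap<String, Box<dyn Dispatcher>>,
}

impl Namespaces {
    fn register(&mut self, alias: &str, dispatcher: Box<dyn Dispatcher>) {
        self.by_alias.insert(alias.to_string(), dispatcher);
    }

    fn call(&self, alias: &str, fn_name: &str, args: &[Value]) -> Result<Value, String> {
        let dispatcher = self
            .by_alias
            .get(alias)
            .ok_or_else(|| format!("unknown namespace {}", alias))?;
        dispatcher
            .call(fn_name, args)
            .unwrap_or_else(|| Err(format!("{}::{} not found", alias, fn_name)))
    }
}
```

The `Option<Result<..>>` return type keeps "member does not exist" distinct from "member exists but failed", which is the distinction the original trait signature also draws.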

schungx commented 4 years ago

I looked at your code and it is fine. I'll merge it and start from there.

I have given it some thought. It seems that the simplest implementation is to add a Function type to Dynamic. It shouldn't be used by external users at this point, but it opens up the possibility of supporting closures or function pointers in the future.

Then, it is dead simple to support a subscope - just treat it as a Map which is essentially HashMap<String, Dynamic>. Your idea of forcing subscope variables to be constants helps a great deal because then we don't need to decide whether a field is mutable or not - all of them are immutable. It simplifies things greatly.

Functions are stored under the Map as Dynamic::Function and variables stored as standard Dynamic fields. Then treat it almost exactly the same as treating an object map.

By reusing object map code, we can trivially implement renaming of functions by simply renaming the property name in the map, and implement selection of functions by omitting unimported ones.
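A minimal Rust sketch of that idea, under loud assumptions: `Value` is a stand-in for `Dynamic` with a hypothetical function variant, functions take and return a single integer, and `import_selected` is an invented helper showing how renaming and selection fall out of plain map operations.

```rust
use std::collections::HashMap;

/// Stand-in for `Dynamic` with the extra function variant discussed above.
#[derive(Clone)]
enum Value {
    Int(i64),
    Function(fn(i64) -> i64),
}

/// A module treated like an object map: member name -> value.
type ModuleMap = HashMap<String, Value>;

/// Build the map seen by the importing script: keep only the requested
/// members, each under its chosen name. `picks` holds (original, alias)
/// pairs; members not listed are simply omitted.
fn import_selected(module: &ModuleMap, picks: &[(&str, &str)]) -> ModuleMap {
    picks
        .iter()
        .filter_map(|(orig, alias)| {
            module.get(*orig).map(|v| (alias.to_string(), v.clone()))
        })
        .collect()
}

/// Call a member by name, if it exists and is a function.
fn call(module: &ModuleMap, name: &str, arg: i64) -> Option<i64> {
    match module.get(name)? {
        Value::Function(f) => Some(f(arg)),
        Value::Int(_) => None,
    }
}
```

Because every member is immutable, the imported map can hold cheap clones of the originals without any question of which copy is authoritative.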

gmorenz commented 4 years ago

Which also means that any access to constants within the module needs to be a lookup by name instead of a compiled offset like it is right now. Which will make namespaced variables a bit slower to access.

I don't see why this is the case. At AST build time we can see whether a variable refers to a namespace-imported one and, if so, put in the offset of wherever we are storing that namespace variable, provided we don't allow adding or removing variables from namespaces at runtime. Edit: And if we restrict ourselves to functions and other constants, we could even inline the value of that variable and not reserve a spot for it in the first place.
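A small Rust sketch of what such build-time resolution could look like. It is not Rhai's AST: `Expr`, `Namespace` and `resolve` are hypothetical, constants are plain integers, and the namespace contents are assumed to be fixed before the AST is built.

```rust
use std::collections::HashMap;

/// A reference to `alias::NAME` in a freshly parsed AST, plus the forms
/// it can take after resolution.
#[derive(Debug, PartialEq)]
enum Expr {
    /// Unresolved: would need a lookup by name at run time.
    NamespaceConst { alias: String, name: String },
    /// Resolved to an offset into the namespace's constant table.
    Slot(usize),
    /// Resolved and inlined; no slot is needed at all.
    Literal(i64),
}

/// Constants of one namespace, fixed before AST building starts.
struct Namespace {
    names: Vec<String>,
    values: Vec<i64>,
}

/// Resolve a namespace reference while building the AST. Unknown aliases
/// or names are left unresolved so they can still be looked up later.
fn resolve(expr: Expr, namespaces: &HashMap<String, Namespace>, inline: bool) -> Expr {
    let (alias, name) = match expr {
        Expr::NamespaceConst { alias, name } => (alias, name),
        other => return other,
    };
    let slot = namespaces
        .get(&alias)
        .and_then(|ns| ns.names.iter().position(|n| *n == name).map(|i| (ns, i)));
    match slot {
        Some((ns, i)) if inline => Expr::Literal(ns.values[i]),
        Some((_, i)) => Expr::Slot(i),
        None => Expr::NamespaceConst { alias, name },
    }
}
```

The fallback arm matters for the case where the imported script is not available at build time: the reference simply stays a by-name lookup.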

On the other hand, there doesn't seem to be any need to restrict namespace variables to constants; they can be updated and there shouldn't be any problem with it...

Allowing mutation of namespaces will have significant implications on the rest of the language. This would be, I believe, the first case of mutable global state. Interaction with anything like threading proposals will be very substantial.

I would prefer to not allow it.

A path-resolution mechanism will be needed to resolve namespace names to actual files on disk, with perhaps an overriding mechanism available for no-std builds.

I don't think we should be assuming any sort of disk access, since this language is intended to be embedded. I would suggest something like a trait NamespaceResolver { fn get_namespace(&self) -> &str } passed to the step that builds the ast. We can implement a FilesystemNamespaceResolver that works for getting namespaces from disk.
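A possible shape for this, as a hedged Rust sketch. The inline signature above takes no argument, so the `name` parameter, the `Result<String, String>` return type, the `InMemoryResolver` type, the `load_import` function and the `<root>/<name>.rhai` file layout are all assumptions made for the example.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Maps the string in an `import` statement to script source.
trait NamespaceResolver {
    fn get_namespace(&self, name: &str) -> Result<String, String>;
}

/// Serves sources from memory, so the embedder needs no file access.
#[derive(Default)]
struct InMemoryResolver {
    sources: HashMap<String, String>,
}

impl NamespaceResolver for InMemoryResolver {
    fn get_namespace(&self, name: &str) -> Result<String, String> {
        self.sources
            .get(name)
            .cloned()
            .ok_or_else(|| format!("namespace '{}' not found", name))
    }
}

/// Reads `<root>/<name>.rhai` from disk (assumed layout).
struct FilesystemNamespaceResolver {
    root: PathBuf,
}

impl NamespaceResolver for FilesystemNamespaceResolver {
    fn get_namespace(&self, name: &str) -> Result<String, String> {
        let path = self.root.join(format!("{}.rhai", name));
        std::fs::read_to_string(&path).map_err(|e| format!("{}: {}", path.display(), e))
    }
}

/// The AST-building step only sees the trait, never the file system.
fn load_import(resolver: &dyn NamespaceResolver, name: &str) -> Result<String, String> {
    resolver.get_namespace(name)
}
```

Passing the resolver as a trait object keeps disk access an opt-in choice of the embedding application rather than a built-in assumption of the language.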

I know it's bikeshedding, but the rest of the language looks a lot like rust, maybe use mod and use as the keywords instead of namespace and use?

How does this interact with method syntax? Is foo.name::bar() allowed?

Functions are stored under the Map as Dynamic::Function and variables stored as standard Dynamic fields. Then treat it almost exactly the same as treating an object map.

Would this imply adding functions as first class objects?

schungx commented 4 years ago

I don't see why this is the case.

Because when parsing to AST, you're probably not going to load the imported scripts - they may be parsed separately.

I would prefer to not allow it.

I agree with you and @jhwgh1968 on this. Better provide globals on the Rust side. Everything constants then.

FilesystemNamespaceResolver

Yeah, I agree. A trait is probably the best way to handle this.

I know it's bikeshedding, but the rest of the language looks a lot like rust, maybe use mod and use as the keywords instead of namespace and use?

Well, it is different enough from Rust so that we can call it whatever we want... But to me Rhai seems closer to JS than to Rust actually...

I'm fine with either use or import syntax. Maybe we should have a poll to see what others think?

@jhwgh1968 what do you think? I can start renaming all "namespace" to "module" right now...

Would this imply adding functions as first class objects?

Yes, in the future we can do that!

gmorenz commented 4 years ago

Because when parsing to AST, you're probably not going to load the imported scripts - they may be parsed separately.

Why not?

I mean I can see parsing them beforehand, but it seems like a perfectly reasonable requirement that we need access to the imported scripts (in AST or source form) while we build the AST. Apart from letting us index things correctly, this also means that we can run optimization steps on the full AST, that we don't need to run a parse step every time the AST is executed (or do something like mutate the AST struct when it is executed), and that we could have an executor program that runs ASTs and derivatives without having to keep the parser around (provided eval is disabled).

But to me Rhai seems closer to JS than to Rust actually...

Semantically it's somewhat close to JS; the syntax is practically identical to Rust, though :P

schungx commented 4 years ago

Hey guys, I have pushed a version into the namespaces branch. This one has import statements and access to module-qualified variables working. It turns out not to be particularly difficult, simply leveraging all the scoping code.

There is no way to load a module at this point, so when the engine sees an import statement, it creates a dummy module with some dummy values, just for testing purposes.

You can compile that branch, and run the repl example to test it out.

Try:

import "abc" as hello;
print(hello::kitty);
print(hello::path);

schungx commented 4 years ago

Why not?

I guess so, why not?

When compiling to AST, we can potentially take a collection of ASTs mapped to module names. The compiler can then merge all the necessary functions into the resulting AST so it can be kept around and evaluated as a stand-alone unit (no longer needing the module ASTs).

During the compiling process, Engine can reach out to load and compile module script files if the ASTs are not provided. The result is the same: a stand-alone AST that contains all the modules' code.

Question: what about the following use case?

I (the user) have a script that uses a bunch of modules. I don't have implementations of those modules yet - they are to be provided by other implementors. I only have their templates (such as function names, variable names etc.). I can still compile my code and then, when eval time comes, I take external modules and run them together with my code.

This is the "dynamic linking" analog to the "static linking" above.

I suppose I can handle this use case simply by not loading external module scripts when the module ASTs are not provided, and just emitting a module-qualified function call to be resolved at eval time.

jhwgh1968 commented 4 years ago

I'm fine with either use or import syntax. Maybe we should have a poll to see what others think?

@jhwgh1968 what do you think? I can start renaming all "namespace" to "module" right now...

In my case, I am writing a custom DSL as a balance between two objectives:

I want to draw a broad pool of contributors and users, including those who know scripting languages like Javascript and Python but have never touched Rust.
I want a scripting engine that is easy to maintain and hack new features into (unlike the full Lua, Python and Javascript engines I have looked at so far).

Thus, I am of the opinion that Rhai syntax should resemble those languages (dynamic typing, flexible scoping, their keywords, etc), while the semantics should be bound by Rust-like rules where helpful (no variables in modules, no built-in type punning/casting, easier exporting things from Rust, etc).

So I'm in favor of import due to its familiarity to both novice Python and advanced Javascript users.

That said, I will not push hard on most syntax things. I am interested in getting the sheds built. When it comes time to start using it, I fully expect to fork Rhai in order to repaint several sheds to my favorite color. 😄

schungx commented 4 years ago

I fully expect to fork Rhai in order to repaint several sheds to my favorite color.

👍👍👍 - I hope you like purple.

So judging from the comments, I'm settling on the syntax:

import "xxxxxx" as abc;
import "abc";        // allow to directly create a module name if the filename is ASCII?

We'll call these "modules" to be consistent with Rhai and some other languages.

If need be in the future, we can easily change import to use without touching any code at all.

A question is: do we allow users to pick 'n choose the functions they want to import? For example:

// Only import two functions, skip all the rest
import { func_a, func_b } from "xxxxx" as abc;

abc::func_a();
abc::func_b(42);

This is where the use syntax may diverge from import syntax:

// This seems a bit weird
use { func_a, func_b } from "xxxx" as abc;

// This seems more natural for a 'use'
use {func_a as f1, func_b as f2} from "xxx";

f1();
f2(42);

Of course then we'll have to handle the massive amount of potential function name conflicts with local script code, so it may not be the best idea.

schungx commented 4 years ago

OK, I've pushed a new version to namespaces (maybe I should rename the branch to modules?). This version implements module-qualified function calls.

There is now a rudimentary API for the Module type. You can create a Module and then use scope.push_module("my_module", module); to add it into the scope.

There is an API to add Rust functions into the Module. See the test_sub_module test in tests/modules.rs.

@jhwgh1968 I would anticipate that your macros will actually generate the necessary set_fn calls to register functions into a module. I expect the current functionality is enough for what you're trying to do. We're only lacking creation of a module from an AST, plus handling of calls to script-defined functions.

schungx commented 4 years ago

The latest version implements the ModuleResolver trait, as suggested by @gmorenz.

There are three built-in module resolvers:

FileModuleResolver - default, loads from script file
NullModuleResolver - always returns error, default for no-std
StaticModuleResolver - resolves only modules that were added to it beforehand

Right now, all functions defined in a script file are automatically exported. I haven't implemented the export or pub or public etc. keyword yet.

All variables defined at global level are exported as well.

jhwgh1968 commented 4 years ago

Thanks, @schungx. I will continue my commentary in regards to plugins on the other issue.

schungx commented 4 years ago

The modules work is pretty much done with all features implemented.

I did an overhaul for efficiency and pre-computed all function call hashes wherever I could. It doesn't change the Module API much, so it shouldn't affect you a lot. Beware, though, that set_fn, which adds a function into a module, now takes an additional access parameter specifying whether that function should be exported or kept private. In most cases I suppose you'd want to export, which is the default.
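For readers unfamiliar with the two ideas in this comment, here is a hedged, std-only Rust sketch of a function table keyed by a pre-computed call hash, with an access flag per entry. It is not the real Module API: the `Module` struct, the `Access` enum, the `call_hash` scheme (name plus arity) and this `set_fn` signature are all assumptions chosen for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Whether a function is visible outside its module.
#[derive(Clone, Copy)]
enum Access {
    Public,
    Private,
}

/// Hash a call signature once (here: name and argument count), so that
/// later calls can be looked up by number instead of by string.
fn call_hash(name: &str, arity: usize) -> u64 {
    let mut hasher = DefaultHasher::new();
    name.hash(&mut hasher);
    arity.hash(&mut hasher);
    hasher.finish()
}

/// A module's function table, keyed by pre-computed call hash.
#[derive(Default)]
struct Module {
    functions: HashMap<u64, (Access, fn(&[i64]) -> i64)>,
}

impl Module {
    /// Register a function and return its hash, which a compiler could
    /// store directly in the call site.
    fn set_fn(&mut self, name: &str, arity: usize, access: Access, f: fn(&[i64]) -> i64) -> u64 {
        let hash = call_hash(name, arity);
        self.functions.insert(hash, (access, f));
        hash
    }

    /// Call from outside the module: private functions are invisible.
    fn call_exported(&self, hash: u64, args: &[i64]) -> Option<i64> {
        match self.functions.get(&hash) {
            Some((Access::Public, f)) => Some(f(args)),
            _ => None,
        }
    }
}
```

The point of hashing at registration and compile time is that a module-qualified call at run time becomes a single integer lookup.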