udem-dlteam / libs

Repo to develop new libraries for Gambit

gambit vs scheme libs #23

Open lassik opened 4 years ago

lassik commented 4 years ago

Here's the current set of libraries planned for R7RS-large (in addition to the R7RS-small ones):

@feeley You mentioned Gambit should get a _list module that is a superset of (scheme list) also known as SRFI 1. Are there others on the list you would like to extend with Gambit-only procedures?

lassik commented 4 years ago

If new Gambit-specific libraries are started, it would be easiest to understand if they are named (gambit foo) and are an extension of (scheme foo) where possible. I guess the equivalent namespace in Gambit's native system would be _foo. Thoughts?

feeley commented 4 years ago

I've been wavering on this issue and still do not know what is the best approach. There is a tension between what is convenient for "Gambit users" and "Scheme community users" (that would like to share code with Gambit users bidirectionally). Here are some of my concerns:

1) All modules including builtin modules should be versioned. Currently builtin modules are not versioned (in a sense the Gambit release version is the builtin module's version). Having versioned builtin modules would allow upgrading different parts of the system independently.

2) Gambit users should have a concise import declaration so that code is not overly "heavy". Always having to write (import (gambit xxx)) gets tiring fast. I've been thinking of using module aliases and predefining # (or _ or / or +) as an alias for github.com/gambit/standard so that (import (# list)) and (import (# list @2.0)) are possible. It also makes it easy to map a specific builtin module to a different implementation by adding an alias (for experimentation, etc). Using # has the advantage that it will (very probably) give a parsing error in other Scheme implementations, which conveys that the module is only usable with Gambit. On the other hand, it prevents non-Gambit users from writing (cond-expand (gambit (import (# list))) (else (import (srfi 1)))), so maybe one of the other choices is better.
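The alias idea in point 2 can be sketched in portable R7RS Scheme. This is only an illustration under stated assumptions, not Gambit's actual module resolver: the alias table and the procedure name expand-alias are hypothetical, and _ stands in for # because # on its own will probably not read as a symbol.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical alias table: each entry maps an alias symbol to the
;; library-name prefix it stands for.
(define aliases
  '((_ . (github.com/gambit/standard))))

;; If the first component of NAME is an alias, replace it with the
;; prefix from the table; otherwise return NAME unchanged.
(define (expand-alias name)
  (let ((entry (assq (car name) aliases)))
    (if entry
        (append (cdr entry) (cdr name))
        name)))

(write (expand-alias '(_ list)))   ; => (github.com/gambit/standard list)
(newline)
(write (expand-alias '(srfi 1)))   ; => (srfi 1)
(newline)
```

Remapping a builtin module to an experimental implementation would then amount to adding or changing one entry in the table.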

I guess it is a question of managing the namespace of R7RS libraries. Currently (scheme ...) and (srfi ...) are reserved, but otherwise it is uncharted territory which is up for grabs. This has to be resolved soon at the "Scheme community" level, otherwise we'll end up with R7RS libraries that can't be shared because of name clashes at the library identifier level. I suggested reserving (scheme-system-name ...) and (URL ...) to the people or organisations that control those systems or domains. Maybe (_ ...) could be reserved to mean "this Scheme system".

So for the moment Gambit is using builtin modules with names containing a _ prefix. But that is just a placeholder. It would be great if Gambit could switch to a more permanent naming scheme for the next release so that module names containing a _ prefix would never have been released.

lassik commented 4 years ago

There is already a strong convention that (scheme-system-name ...) belongs to that Scheme implementation. I think about 5 or more implementations have their native libraries in such a namespace.

Is (import (gambit ...)) really too long to type? All the other Schemes do it as far as I can remember. I guess an (import (_ ...)) alias would work. However, _ has a strong connotation of "internal use only - I hope you know what you are doing" to people with a C background.

lassik commented 4 years ago

All of these currently work in the respective Schemes:

I stopped looking at this point - there are probably more :)

I'm encouraging Arthur to start a formal registry of "Scheme IDs" - identifiers naming implementations. Here's the collection so far, based on cond-expand feature identifiers: https://github.com/srfi-explorations/identifiers

lassik commented 4 years ago

So the most promising approach right now would be to extend the standard library namespace by formally recognizing those implementation IDs in addition to scheme and srfi.

feeley commented 4 years ago

OK... let's give it a try with a (gambit list) module. I will create it later today and we can experiment with it.

lassik commented 4 years ago

Great, thanks!

lassik commented 4 years ago

If you're versioning Gambit's native libraries, you could make the (gambit ...) library names be aliases for the current versions.

Perhaps there should be a standard alias like _ (but preferably something with less of an internals connotation) that would point to the native library namespace in each implementation.

lassik commented 4 years ago

Incidentally, the Gambit REPL auto-completes (import (g into (import (gambit.

feeley commented 4 years ago

Now I remember another important concern with the longer names that include the system name... they make it awkward to use the builtin module names on the command line. Currently it is possible to run all the unit tests in the program mytests like this:

gsi _test/all . mytests

Note that it works by loading the builtin module _test/all whose only operation is to change a state variable of the _test module that causes the tests to not stop at the first failure.

It is much more convenient than:

gsi gambit/test/all . mytests

The gambit part is just noise because we know this is Gambit...

If '_' when used as a prefix of the module name meant "the current Scheme system" then the two command-lines would be equivalent.

lassik commented 4 years ago

This could also be viewed as a load path thing. If both gambit and gambit/test are contained under a directory on the load path (perhaps a virtual directory) then (import (gambit test)) and (import (test)) would import the same library. gsi gambit/test/all and gsi test/all would both work by the same principle. Users who think test is too much namespace pollution could change the load path so the (import (gambit ...)) or gsi gambit/ prefix is required.

However, perhaps it's too surprising if (import (string)) is an alias for (import (gambit string)) and things like that.

feeley commented 4 years ago

The problem is deeper than that, because the name (foo bar) might refer to (gambit foo bar) or to (foo bar) which is asking for trouble. An important design objective is to make library names unique, and using load path tricks ruins that.
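A small sketch of the ambiguity, in portable R7RS Scheme. The list of installed libraries and the search prefixes are made up for illustration, and the lookup is a simplification rather than how Gambit actually searches for modules.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical set of libraries that exist somewhere on the load path.
(define installed
  '((gambit foo bar) (foo bar) (gambit list)))

;; Hypothetical load-path trick: try the name as written, then under (gambit ...).
(define search-prefixes
  '(() (gambit)))

;; Return every installed library that NAME could refer to.
(define (candidates name)
  (let loop ((prefixes search-prefixes) (found '()))
    (if (null? prefixes)
        (reverse found)
        (let ((full (append (car prefixes) name)))
          (loop (cdr prefixes)
                (if (member full installed)
                    (cons full found)
                    found))))))

(write (candidates '(foo bar)))  ; => ((foo bar) (gambit foo bar))
(newline)
(write (candidates '(list)))     ; => ((gambit list))
(newline)
```

With more than one candidate, the library that gets loaded depends on the search order rather than on the name, which is the loss of uniqueness described above.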

lassik commented 4 years ago

We can already use load path tricks by setting the load path to anything we like. But I agree that it can be very confusing and probably isn't wise for implementations to do it by default.

lassik commented 4 years ago

ASCII doesn't offer too many promising single-character choices for a name.

What's left is % + - / : = @ ^ _

lassik commented 4 years ago

And : may be used for keywords in Scheme. IIRC Kawa has special sugar using @. Slash / is the pathname separator. I'm starting to see how you ended up with _ :)

feeley commented 4 years ago

Load path tricks (although possible) should not be the proper solution to anything...

The . is usable in Gambit (and , too):

> (call-with-input-string "((1 2) (. 3))" read)
((1 2) (. 3))
> (call-with-input-string "((1 2) (, 3))" read)
((1 2) (,3))

lassik commented 4 years ago

Does gsi ./foo look for foo in the current directory though?

feeley commented 4 years ago

Yes... so in fact . is not my preferred choice... as you say _ seems like the main logical option. To me the _ prefix in C is used for "implementation dependent" functions, and that is my intuition here.

feeley commented 4 years ago

The thing is that the library name _foo could be seen as referring to (gambit foo) or simply (_foo) as is the case currently. Having _foo refer to (gambit foo) would make it possible to version the library.

feeley commented 4 years ago

So maybe the thing to do is (at the Scheme community level) reserve all library names starting with _ to be system dependent.

lassik commented 4 years ago

Would _/foo be an acceptable compromise for command line use? That would map directly to (_ foo), which could be an alias for (gambit foo). That in turn could be an alias for (gambit foo @2.0) or (github.com/gambit/standard foo @2.0).

If people want to use modules from a directory called _ within the current directory, they could type gsi ./_/foo

feeley commented 4 years ago

Well, if (_whatever) is system dependent, then Gambit can map that to (gambit whatever) with the special case (_) that maps to (gambit). So in fact they would be equivalent... i.e. (_whatever) = (_ whatever) = (gambit whatever) = (github.com/gambit whatever).
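The equivalences above could be expressed as a normalization step, sketched here in portable R7RS Scheme. The procedure name normalize is hypothetical, the first component of the name is assumed to be a symbol, and the target is hard-coded to gambit; another implementation would substitute its own name.

```scheme
(import (scheme base) (scheme write))

;; Map system-dependent names to the (gambit ...) namespace:
;;   (_)            => (gambit)
;;   (_ foo ...)    => (gambit foo ...)
;;   (_foo ...)     => (gambit foo ...)
;; Any other name is returned unchanged.
(define (normalize name)
  (let* ((head (car name))
         (text (symbol->string head)))
    (cond ((string=? text "_")
           (cons 'gambit (cdr name)))
          ((and (> (string-length text) 1)
                (char=? (string-ref text 0) #\_))
           (cons 'gambit
                 (cons (string->symbol
                        (substring text 1 (string-length text)))
                       (cdr name))))
          (else name))))

(write (normalize '(_test all)))   ; => (gambit test all)
(newline)
(write (normalize '(_ test all)))  ; => (gambit test all)
(newline)
(write (normalize '(_)))           ; => (gambit)
(newline)
```

Under this reading, the command-line forms _test/all and _/test/all would both name the library (gambit test all).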

lassik commented 4 years ago

OK, that will work if we reserve all identifiers beginning with _ as internal - not just the one-character identifier _.

If the library system has a general way to make identifier prefixes, that could be useful in big systems. For example, a srfi- prefix could map into the srfi namespace for legacy compatibility situations. Also, R6RS does not allow integers as part of library names so (srfi 123) is not valid and they have to use (srfi :123). The rewriter could be used to get rid of the : prefix under the srfi namespace.
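A minimal sketch of such a rewriter for the two srfi cases, in portable R7RS Scheme. The procedure names are hypothetical and the rules are hard-coded here; a general mechanism would presumably take them from configuration.

```scheme
(import (scheme base) (scheme write))

;; True if TEXT starts with PREFIX and has at least one character after it.
(define (starts-with? prefix text)
  (let ((n (string-length prefix)))
    (and (> (string-length text) n)
         (string=? prefix (substring text 0 n)))))

;; Strip PREFIX from the symbol PART and read the rest as a number.
;; Returns #f if PART is not a symbol, lacks the prefix, or the rest
;; is not numeric.
(define (numbered-suffix prefix part)
  (and (symbol? part)
       (let ((text (symbol->string part)))
         (and (starts-with? prefix text)
              (string->number
               (substring text (string-length prefix)
                          (string-length text)))))))

(define (rewrite name)
  (cond
   ;; (srfi-123 ...) => (srfi 123 ...)
   ((numbered-suffix "srfi-" (car name))
    => (lambda (n) (cons 'srfi (cons n (cdr name)))))
   ;; (srfi :123 ...) => (srfi 123 ...)
   ((and (eq? (car name) 'srfi)
         (pair? (cdr name))
         (numbered-suffix ":" (cadr name)))
    => (lambda (n) (cons 'srfi (cons n (cddr name)))))
   (else name)))

;; The symbol :123 is built with string->symbol because some readers
;; treat a leading colon as keyword syntax.
(write (rewrite (list 'srfi (string->symbol ":123"))))  ; => (srfi 123)
(newline)
(write (rewrite '(srfi-1)))                             ; => (srfi 1)
(newline)
```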

Perhaps it would be useful to work out renaming semantics that cover the needs of all implementations, and write a SRFI about it. The RnRS import rename system has already proved very useful; renaming library names would probably be equally useful.

lassik commented 4 years ago

I have an idea for a completely portable SCHEME_PATH environment variable based on such a rewriting system. Any directory listed on SCHEME_PATH could contain a metadata file describing renaming rules, implementation requirements (e.g. R6RS vs R7RS), etc. for the Scheme libraries in that directory. The metadata file could be written in S-expressions, which would make it easily extensible to cover future needs.
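To make the idea concrete, such a metadata file might look something like the fragment below. Every key and the overall layout are invented for illustration; no such format exists or has been agreed on.

```scheme
;; Hypothetical metadata file for one directory listed on SCHEME_PATH.
;; All keys here are made up for this sketch.
((standard r7rs)                   ; implementation requirement for these libraries
 (rename-prefix (srfi-) (srfi))    ; e.g. (srfi-1) is rewritten to (srfi 1)
 (alias (_) (gambit)))             ; (_ foo) is treated as (gambit foo)
```

Because the file is plain S-expressions, a tool could load it with read and ignore any keys it does not understand.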