Syntax for Fully Qualified Names

DavePearce commented 7 years ago

Currently, the mechanism for representing a fully qualified name is based on that of Java. Namely, it uses the "dot" syntax whereby we have names such as whiley.lang.Int and whiley.lang.Int.u8. One difficulty with this syntax is that we cannot distinguish paths from names:

Path. A path identifies a specific Whiley file within a specific package. For example, whiley.lang.Int identifies file Int.whiley in package whiley.lang.
Name. A name identifies a named declaration within a specific Whiley file. For example, whiley.lang.Int.u8 identifies the type declaration u8 within the Whiley file whiley.lang.Int.
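To make the ambiguity concrete, here is a minimal Python sketch. It is not part of the Whiley compiler, and the function name candidate_readings is made up for illustration. Given only a dotted string, it lists both readings that the current syntax allows: the whole string as a path to a file, or everything before the last dot as a path with the final component as a declaration name.

```python
def candidate_readings(dotted):
    """Return the possible readings of a dotted string under the current syntax.

    Each reading is a tuple (kind, path, name), where name is None for a path.
    """
    parts = dotted.split(".")
    # Reading 1: the whole string is a path to a Whiley file.
    readings = [("path", dotted, None)]
    # Reading 2: the last component is a declaration inside the file named by the rest.
    if len(parts) > 1:
        readings.append(("name", ".".join(parts[:-1]), parts[-1]))
    return readings


for reading in candidate_readings("whiley.lang.Int.u8"):
    print(reading)
```

Nothing in the string itself says which reading is intended, which is the problem described above.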

There are a number of ways in which we could syntactically distinguish paths from names. Some examples:

this.is.path:Name. Thus, for our two examples above we would have whiley.lang.Int and whiley.lang.Int:u8. A short form of the latter would be Int:u8.
this::is::path.Name. Thus, for our two examples above we would have whiley::lang::Int and whiley::lang::Int.u8. A short form of the latter would be Int.u8.

The latter is perhaps somewhat closer to Rust's name syntax.
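As a rough illustration of how either candidate removes the ambiguity, the following Python sketch splits each form into its path components and an optional declaration name. The function names are hypothetical and the sketch assumes well-formed input; it is not how the Whiley compiler parses names.

```python
def parse_colon_style(text):
    """Parse 'this.is.path:Name' into (path components, name or None)."""
    path, sep, name = text.partition(":")
    return path.split("."), (name if sep else None)


def parse_double_colon_style(text):
    """Parse 'this::is::path.Name' into (path components, name or None)."""
    path, sep, name = text.partition(".")
    return path.split("::"), (name if sep else None)


print(parse_colon_style("whiley.lang.Int:u8"))
print(parse_double_colon_style("whiley::lang::Int.u8"))
```

In both styles a string without the name separator can only be a path, so whiley.lang.Int and whiley::lang::Int each have exactly one reading.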

There are some implications here. In particular, it means there is a strong distinction between a file and a potentially nested namespace.

bambinella commented 7 years ago

I find the C++ "::" notation to be noisy. I think Ada is more clean-looking (used for something else in Ada, but...):

whiley::lang::Int.something

vs.

whiley'lang'Int.something

(In Ada I think it is something like name1.name2'attribute which isn't really the same)

I am thinking that the mnemonic would be that ' would signify genitive, e.g. «Dave's». Anyway, this is what I've landed on previously for my own "on paper syntax" design. But it is one of those cases where you end up with all alternatives being so-so-not-so-great.

My thinking though is that the namespace part should be more visually contracted to indicate stronger connection as a unit than the distinct objects which should be more paced out.

"::" does the wrong thing there. It introduces too much separation between the path elements, so the whole namespace path cannot be digested as a single unit by our visual system.

When programs grow I tend to use fully qualified names all the way. These days I never import the std namespace in C++ for instance. So my code becomes very cluttered with the "::" syntax, e.g.:

mylib::math::somefunction( std::begin(c), std::end(c) )

vs

mylib'math'somefunction( std'begin(c), std'end(c) )

I don't know... :-|

But what if Whiley introduces standardized methods/operators with prescribed properties from namespaces like «statistical»:

db::getobject(database,key).statistical::mean()

vs

db'getobject(database,key).statistical'mean()

I think the one with ' looks better in this case.

DavePearce commented 7 years ago

I think I'm now leaning towards the C++ / Rust style. One reason for this is simply that Whiley wants to compete in this space. This would also mean a flatter hierarchy for the standard library. For example, we'd have something like this:

std::ascii --- For all ASCII related stuff (currently whiley.lang.ASCII)
std::io --- For all basic definitions of readers / writers (currently whiley.io.Reader, whiley.io.Writer, etc).
std::array --- For all array manipulation functions (currently whiley.lang.Array)
std::int --- For all integer coercion operations, and other functions (e.g. parsing integer from string) (currently whiley.lang.Int)
std::math --- For all math related functions, e.g. abs(), min(), max(), gcd(), etc (currently whiley.lang.Math).

The big question is how to handle nested modules. Of course, just this::is::nested is an easy approach. But, if we want to distinguish name spaces within modules it could be interesting. Perhaps, for example, std/java::Object. Hmmmmm, no that doesn't work. Perhaps std/java:Object is ok. Meaning that std::ascii becomes std:ascii. Hmmm, or std:java::Object.

DavePearce commented 7 years ago

Having now refactored the standard library roughly as above following RFC#0007, I'm starting to think the syntax std/ascii:string is perhaps preferable. Here, std/ascii is the module path, and string the named item (in this case a type declaration). Some notes:

In general, you would e.g. import std/ascii and then refer to e.g. ascii:string which works nicely.
C++ and Rust, for example, don't have a strong notion of what it means to be inside a module. For C++, this is partly because of how #includes work. I'm not sure why they do this for Rust. If we followed Rust, for example, then we would import std::ascii and then refer to e.g. ascii::string. In Rust, however, when you import a module you import all the named items contained therein.

Needs some more thoughts.

DavePearce commented 7 years ago

Right, final thoughts now. Having spent a fair bit of time refactoring the standard library, I think this is the final word:

import std/ascii
import std/io

method main(ascii::string[] args):
   io::println("Hello World")

This seems to be a useful compromise. It provides both a strong notion of "inside" versus "outside" whilst also retaining the feel of existing systems languages like C++ and Rust. I believe it does look marginally better than this:

import std/ascii
import std/io

method main(ascii:string[] args):
   io:println("Hello World")

And the final alternative is the homogeneous approach:

import std::ascii
import std::io

method main(ascii::string[] args):
   io::println("Hello World")
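For illustration, the first of the three variants above (slash-separated module paths on import, then :: between the module and the item) could be resolved roughly as in this Python sketch. The helper names import_module and resolve are hypothetical, and the sketch assumes that an imported module is referred to by the last component of its path, as in the examples; it makes no claim about how the Whiley compiler actually does this.

```python
def import_module(imports, module_path):
    """Record 'import std/ascii' so that 'ascii' refers to module 'std/ascii'."""
    # Assumption: the short name is the last component of the module path.
    short_name = module_path.split("/")[-1]
    imports[short_name] = module_path
    return imports


def resolve(imports, qualified):
    """Resolve e.g. 'ascii::string' to (module path, item name)."""
    module, sep, item = qualified.partition("::")
    if not sep:
        raise ValueError(f"not a qualified name: {qualified!r}")
    if module not in imports:
        raise KeyError(f"module {module!r} has not been imported")
    return imports[module], item


imports = {}
import_module(imports, "std/ascii")
import_module(imports, "std/io")

print(resolve(imports, "ascii::string"))
print(resolve(imports, "io::println"))
```

Because / only ever appears in module paths and :: only ever separates a module from an item, the two kinds of reference cannot be confused.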

The next interesting question is which naming conventions to recommend. Looking at the Rust naming conventions is useful here. Currently, I definitely agree with:

Lower case for module names. Thus, 'std/io', 'std/ascii', etc. But, not e.g. 'std/ASCII'
CamelCase for compound types (i.e. records).

The real question is whether or not to go with snake_case for functions / methods. Currently, I've been following Java and using camelCase with lower case first. Thus, we have filesystem::open, ascii::toString(), etc. But, perhaps it should be ascii::to_string() ?
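To give a feel for what such a rename would involve, here is a small Python sketch that converts lower camelCase identifiers to snake_case. The function to_snake_case is made up for this example and is deliberately simple: it does not try to handle runs of capitals such as acronyms.

```python
import re


def to_snake_case(name):
    """Convert a lower camelCase identifier to snake_case.

    An upper-case letter that follows a lower-case letter or digit is
    replaced by an underscore plus its lower-case form.
    """
    return re.sub(
        r"(?<=[a-z0-9])([A-Z])",
        lambda match: "_" + match.group(1).lower(),
        name,
    )


for name in ["toString", "open", "println"]:
    print(name, "->", to_snake_case(name))
```

Single-word names such as open and println are unchanged, so only compound names like toString would be affected.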

DavePearce commented 7 years ago

Ok, have written this up as an RFC now here:

https://github.com/Whiley/RFCs/pull/13

Therefore, am closing this issue.

Whiley / WhileyCompiler

Syntax for Fully Qualified Names #742