Whiley / WhileyCompiler

The Whiley Compiler (WyC)
http://whiley.org
Apache License 2.0

Syntax for Fully Qualified Names #742

Closed DavePearce closed 7 years ago

DavePearce commented 7 years ago

Currently, the mechanism for representing a fully qualified name is based on that of Java. Namely, it uses the "dot" syntax whereby we have names such as whiley.lang.Int and whiley.lang.Int.u8. One difficulty with this syntax is that we cannot distinguish paths from names:

There are a number of ways in which we could syntactically distinguish paths from names. Some examples:

The latter is perhaps somewhat closer to Rust's name syntax.

There are some implications here. In particular, it means there is a strong distinction between a file and a potentially nested namespace.
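To make the ambiguity of the dot syntax concrete, here is a minimal Python sketch. It is not part of the Whiley compiler, and the helper name `candidate_splits` is hypothetical. It lists every way a dotted name could be divided into a path and a name when nothing in the syntax marks the boundary.

```python
def candidate_splits(dotted: str) -> list[tuple[str, str]]:
    """Return every (path, name) reading of a dotted name."""
    parts = dotted.split(".")
    # Any non-empty proper prefix could be the path; the remainder
    # would then be read as a (possibly nested) name.
    return [
        (".".join(parts[:i]), ".".join(parts[i:]))
        for i in range(1, len(parts))
    ]


splits = candidate_splits("whiley.lang.Int.u8")
for path, name in splits:
    print(f"path={path!r} name={name!r}")
```

With four components there are three equally plausible readings, so a reader (or parser) needs extra context to know where the path ends. A dedicated separator between path and name removes that guesswork.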

bambinella commented 7 years ago

I find the C++ "::" notation noisy. I think an Ada-style apostrophe looks cleaner (it is used for something else in Ada, but...):

whiley::lang::Int.something

vs.

whiley'lang'Int.something

(In Ada I think it is something like name1.name2'attribute which isn't really the same)

I am thinking that the mnemonic would be that ' would signify genitive, e.g. «Dave's». Anyway, this is what I've landed on previously for my own "on paper syntax" design. But it is one of those cases where you end up with all alternatives being so-so-not-so-great.

My thinking, though, is that the namespace part should be visually contracted to indicate that it is more strongly connected as a unit, whereas the distinct objects should be more spaced out.

"::" does the wrong thing there. It introduces too much separation between the path elements, so the whole namespace path cannot be digested as a single unit by our visual system.

When programs grow I tend to use fully qualified names all the way. These days I never import the std namespace in C++ for instance. So my code becomes very cluttered with the "::" syntax, e.g.:

mylib::math::somefunction( std::begin(c), std::end(c) )

vs

mylib'math'somefunction( std'begin(c), std'end(c) )

I don't know... :-|

But what if Whiley introduces standardized methods/operators with prescribed properties from namespaces like «statistical»:

db::getobject(database,key).statistical::mean()

vs

db'getobject(database,key).statistical'mean()

I think the one with ' looks better in this case.

DavePearce commented 7 years ago

Currently, I think I'm now leaning towards the C++ / Rust style. One reason for this is simply that Whiley wants to compete in this space. This would also mean a flatter hierarchy for the standard library. For example, we'd have something like this:

The big question is how to handle nested modules. Of course, just this::is::nested is an easy approach. But, if we want to distinguish namespaces within modules, it could be interesting. Perhaps, for example, std/java::Object. Hmmmmm, no that doesn't work. Perhaps std/java:Object is ok. Meaning that std::ascii becomes std:ascii. Hmmm, or std:java::Object.

DavePearce commented 7 years ago

Having now refactored the standard library roughly as above following RFC#0007, I'm starting to think the syntax std/ascii:string is perhaps preferable. Here, std/ascii is the module path, and string the named item (in this case a type declaration). Some notes:

Needs some more thought.
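As an illustration of the module-path / named-item split described above, here is a minimal Python sketch. The names `QualifiedName` and `parse_qualified` are hypothetical, and the rules (a "/"-separated module path, then a separator, then a single item) are assumptions drawn from the std/ascii:string example, not from the Whiley compiler.

```python
from typing import NamedTuple


class QualifiedName(NamedTuple):
    module: tuple[str, ...]  # e.g. ("std", "ascii")
    item: str                # e.g. "string"


def parse_qualified(text: str, item_sep: str = ":") -> QualifiedName:
    """Split '<module path><item_sep><item>' into its two parts."""
    module_text, sep, item = text.partition(item_sep)
    if not sep or not module_text or not item:
        raise ValueError(f"expected <module>{item_sep}<item>, got {text!r}")
    components = tuple(module_text.split("/"))
    # Reject empty path components such as "std//ascii".
    if any(not c for c in components):
        raise ValueError(f"empty module path component in {text!r}")
    return QualifiedName(components, item)


parsed = parse_qualified("std/ascii:string")
print(parsed.module, parsed.item)
```

Because the path separator and the item separator differ, a single left-to-right scan is enough to tell where the module ends. Passing `item_sep="::"` handles the std/ascii::string spelling in the same way.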

DavePearce commented 7 years ago

Right, final thoughts now. Having spent a fair bit of time refactoring the standard library, I think this is the final word:

import std/ascii
import std/io

method main(ascii::string[] args):
   io::println("Hello World")

This seems to be a useful compromise. It provides both a strong notion of "inside" versus "outside" whilst also retaining the feel of existing systems languages like C++ and Rust. I believe it does look marginally better than this:

import std/ascii
import std/io

method main(ascii:string[] args):
   io:println("Hello World")

And the final alternative is the homogeneous approach:

import std::ascii
import std::io

method main(ascii::string[] args):
   io::println("Hello World")
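The first variant above pairs "/" in imports with "::" in uses. A minimal Python sketch of how such a short reference might be expanded against the imports follows. The function `resolve` is hypothetical, and it assumes that an import binds the last path component as the local module name; that rule is inferred from the examples in this thread, not taken from the compiler.

```python
def resolve(reference: str, imports: list[str]) -> str:
    """Expand e.g. 'ascii::string' to 'std/ascii::string' using the imports."""
    module, sep, item = reference.partition("::")
    if not sep or not module or not item:
        raise ValueError(f"expected <module>::<item>, got {reference!r}")
    # Assumption: 'import std/ascii' makes the module visible as 'ascii'.
    matches = [path for path in imports if path.rsplit("/", 1)[-1] == module]
    if len(matches) != 1:
        raise LookupError(f"{module!r} matches {len(matches)} imports")
    return f"{matches[0]}::{item}"


imports = ["std/ascii", "std/io"]
resolved_type = resolve("ascii::string", imports)
resolved_call = resolve("io::println", imports)
print(resolved_type, resolved_call)
```

The "inside" versus "outside" distinction shows up directly in the result: everything before "::" is a location in the module hierarchy, and everything after it is a declaration within that module.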

The next interesting question is which naming conventions to recommend. The Rust naming conventions are a useful reference here. Currently, I definitely agree with:

  1. Lower case for module names. Thus, 'std/io', 'std/ascii', etc., but not, e.g., 'std/ASCII'.
  2. CamelCase for compound types (i.e. records).

The real question is whether or not to go with snake_case for functions / methods. Currently, I've been following Java and using camelCase with a lower-case first letter. Thus, we have filesystem::open, ascii::toString(), etc. But perhaps it should be ascii::to_string()?
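For a sense of what a move from camelCase to snake_case would involve mechanically, here is a minimal Python sketch. The helper `camel_to_snake` is hypothetical and only handles simple names like those mentioned above; acronym runs such as "toASCII" would need a more careful rule.

```python
import re


def camel_to_snake(name: str) -> str:
    """Convert a lower-camelCase identifier to snake_case."""
    # Insert "_" wherever an upper-case letter follows a lower-case
    # letter or digit, then lower-case the whole identifier.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


renamed = {name: camel_to_snake(name) for name in ["toString", "open", "println"]}
print(renamed)
```

Single-word names such as open and println are unaffected, so the choice only matters for compound names like toString.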

DavePearce commented 7 years ago

Ok, have written this up as an RFC now here:

https://github.com/Whiley/RFCs/pull/13

Therefore, am closing this issue.