Need Community Input (module / submodule import path syntax)

joe-conigliaro commented 5 years ago

Hi All, At the moment submodules use the dot to separate module paths:

import encoding.base64

Do you think the dot is okay or should we use something else like:

import encoding/base64
import encoding:base64 (example, could be anything)

Before it starts getting used we should decide on preferred style. @medvednikov this is probably your decision, sorry I forgot to ask before making the PR.

Currently imports work like this (nothing to do with submodules): if module A imports B and B imports C:

A can access imports from B and C (and any imports from the files they import etc).
B can access imports from A and C (and any imports from the files they import etc).
C can access imports from A and B (and any imports from the files they import etc).

I have implemented scope to the imports now, so a module will only be available if the file explicitly imports it. I will make a PR for this soon, need to clean up a few things.
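The difference between the two behaviours can be sketched with a toy model. This is only an illustration in Python, not the V compiler's code: the `FILE_IMPORTS` table and both function names are hypothetical, and the "leaky" variant assumes a single import table shared by all files, as described above.

```python
# Toy model of import visibility; NOT the V compiler's real data structures.
# Hypothetical input: each file lists the modules it imports explicitly.
FILE_IMPORTS = {
    "a.v": ["b"],
    "b.v": ["c"],
    "c.v": [],
}


def visible_global(file_imports):
    """Leaky behaviour: one import table shared by every parsed file."""
    table = set()
    for imports in file_imports.values():
        table.update(imports)
    # Every file sees the same table, whatever it imported itself.
    return {name: set(table) for name in file_imports}


def visible_scoped(file_imports):
    """Scoped behaviour: a file sees only the modules it imports itself."""
    return {name: set(imports) for name, imports in file_imports.items()}


leaky = visible_global(FILE_IMPORTS)
scoped = visible_scoped(FILE_IMPORTS)

print(sorted(leaky["c.v"]))   # ['b', 'c']  (c.v imported nothing, yet sees both)
print(sorted(scoped["c.v"]))  # []
print(sorted(scoped["a.v"]))  # ['b']
```

With scoping, a name is usable in a file only if that file contains the matching import line.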

I have also implemented aliases (will make a PR when finalised). We need to choose a syntax:

import module as a
import module ~ a (example, could be anything)
import module a

nedpals commented 5 years ago

I like the path structure for importing submodules. It's easy to map out.

iredmail commented 5 years ago

I prefer the dot one (encoding.base64).

Dot leaves more space than / and :, easier to recognize parent/sub module names.
Dot is used by other places like structs (Color.red), so that we don’t need to remember another different syntax.

Let’s make V syntax “boring”.

joe-conigliaro commented 5 years ago

@nedpals & @iredmail thanks for the input. I guess we will see what @medvednikov prefers.

aguspiza commented 5 years ago

You may want to import parent modules with .. from a subproject, so I think / should map better. This decision depends on how the compiler will allow module imports; right now they are only allowed from the vlib folder.

joe-conigliaro commented 5 years ago

@aguspiza a parent module import could be a later feature. It's a nice idea, but it adds complexity and the benefit isn't much. Take for example: you are inside the file net/http/request.v and you want to import net/url

import ../url

over

import net/url

Right now module depth is limited to 4 eg. module/sub1/sub2/sub3

I guess it might be nice to have the option
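To make the comparison concrete, here is a minimal sketch of how a relative import such as `../url` could be resolved to the same module as `net/url`. It is written in Python as a model only; `resolve_import` and `MAX_DEPTH` are hypothetical names, and the only facts taken from the thread are the `/` separator and the depth limit of 4.

```python
import posixpath

MAX_DEPTH = 4  # depth limit mentioned above: module/sub1/sub2/sub3


def resolve_import(current_module, import_path):
    """Resolve an import path to a module path relative to the module root.

    current_module: module of the importing file, e.g. "net/http"
    import_path:    "../url" (relative) or "net/url" (from the root)
    """
    if import_path.startswith("."):
        # Relative import: interpret it from the importing module's directory.
        joined = posixpath.join(current_module, import_path)
    else:
        joined = import_path
    resolved = posixpath.normpath(joined)
    if resolved == "." or resolved.startswith(".."):
        raise ValueError(f"import escapes the module root: {import_path}")
    if len(resolved.split("/")) > MAX_DEPTH:
        raise ValueError(f"module path deeper than {MAX_DEPTH}: {resolved}")
    return resolved


# Both spellings name the same module when used inside net/http/request.v.
print(resolve_import("net/http", "../url"))   # net/url
print(resolve_import("net/http", "net/url"))  # net/url
```

The sketch shows that the relative form needs extra rules (what the path is relative to, what happens above the root) that the absolute form avoids.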

fuzzy commented 5 years ago

I generally prefer the dot notation, but I think the path separator arguments presented by @aguspiza are pretty logical and I see them as making it visually less onerous to do relative imports, rather than having a special or secondary case for doing the same thing. If one syntax can cover relative and "non" relative imports (they're all relative to vlib at the least), I'd say that's the way personally. Just my 2 cents.

joe-conigliaro commented 5 years ago

I haven't had much time, but I've been playing around a bit more. I was able to implement parent imports as suggested, and aliasing, e.g. import some/package as pkga. I need more time to research, though, as I only just realized the following. Let's say you compile a.v and it imports os & json. Because os was already imported, it is also accessible inside the json module. There is no scope to imports: once a module is imported, any file parsed after that has that import name available.

MaD70 commented 5 years ago

@joe-conigliaro the parent modules import feature doesn't seem a good idea to me. The layered architecture presented in Parnas' classic Designing software for ease of extension and contraction makes for easier to understand software.

Without parents import, the layered architecture is easy to implement within V, where a module is confined to a directory: leaf directories are the lower layer (they are standalone, i.e. they use language features only) and each layer on top of that can import from lower layers but not from a) the same layer or b) upper layers. With parents import this organization is no more respected by construction and I think the compiler needs checks to avoid cyclic dependencies.
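The kind of check mentioned above can be sketched as a depth-first search over the import graph. This is a generic cycle-detection illustration in Python, not code from the V compiler; `find_cycle` and the example graphs are hypothetical.

```python
def find_cycle(graph):
    """Return one import cycle as a list of module names, or None.

    graph maps each module name to the list of modules it imports.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {module: WHITE for module in graph}
    stack = []  # modules on the current DFS path, in order

    def visit(module):
        colour[module] = GREY
        stack.append(module)
        for dep in graph.get(module, []):
            state = colour.get(dep, WHITE)
            if state == GREY:
                # dep is already on the current path: that closes a cycle.
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        colour[module] = BLACK
        return None

    for module in list(graph):
        if colour[module] == WHITE:
            cycle = visit(module)
            if cycle:
                return cycle
    return None


# Layered: upper layers import lower ones only, so no cycle is possible.
layered = {"net/http": ["net/url"], "net/url": []}
# With parent imports, a submodule can import its parent and close a loop.
with_parent = {"net": ["net/http"], "net/http": ["net"]}

print(find_cycle(layered))      # None
print(find_cycle(with_parent))  # ['net', 'net/http', 'net']
```

In the strictly layered layout the check never fires, which is the "by construction" property described above.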

What do people and @medvednikov think? It's an important language design decision.

Aliasing is a good way to avoid carrying around long some/package identifiers. What I would like to see is an export section paired with the imports at the top of a module, instead of pub scattered throughout the module code.

P.S.: I see that there is a documentation generation flag (not yet implemented) in the compiler source. If this is the intended way that the public interface of a module is automatically extracted (like in Oberon, IIRC) then the export section is not necessary.

joe-conigliaro commented 5 years ago

Thanks @MaD70, the parent imports thing is no biggie, I was just experimenting. I have also tested an implementation where the imports are scoped to the file that imports them, as opposed to the current way where once something is imported it is available to any file parsed afterwards. While doing these experiments I have discovered many bugs which cause segfaults. It's actually very hard to make progress because there are so many things I can't do, so I think most importantly we need to fix a lot of these bugs as well.

MaD70 commented 5 years ago

With

as opposed to the current way where once something is imported it is available to any file parsed afterwards

do you mean that if A imports B and B imports C, then everything exported from C is visible to A? Or are you speaking about submodules?

joe-conigliaro commented 5 years ago

Nothing to do with submodules. Actually, yes. Any imports will be available in all files currently. A can access imports from B and C (and any imports from the files they import etc). B can access imports from A and C (and any imports from the files they import etc). C can access imports from A and B (and any imports from the files they import etc).

MaD70 commented 5 years ago

Ouch! I think this is a bug, not a feature: a module system that leaks identifiers doesn't make much sense to me.

aguspiza commented 5 years ago

Import with ".." is not about importing from parent folders but from sibling folders, especially if you want to avoid "project" files like CMakeLists.txt, qmake .pro, etc. Anyway, this is related to project structure and the ability to build a .v file from any place in a project folder structure. In a usual structure like:

└─ myproject
   ├── src
   │   ├── main.v
   │   └── vmodules
   │   └── ymodules
   │   └── coremodules
   │   └── mymodule.v
   │   └── mymodule_test.v
   └── tests
       └── helloworld.v

How should helloworld.v reference the modules in src? How should vmodules and ymodules reference the modules in coremodules? Do we want to force a specific project structure?

joe-conigliaro commented 5 years ago

@MaD70 That's why I was implementing scope to the imports. I don't think it's a bug, more just something which wasn't implemented yet but looking at the code

joe-conigliaro commented 5 years ago

@aguspiza well, actually it would let you import from siblings and parents. The only thing is: in your example, is myproject a module itself? Because if it is, then the imports can also be accessed like: import myproject/ymodules/ymodule. It's still not well defined how non-core modules will work though; this needs to be worked out also, as well as a manager.

joe-conigliaro commented 5 years ago

Ok, so I have scoped imports and aliases working. What syntax should we use to alias imports?

import module as a
import module ~ a (example, could be anything)
import module a

fuzzy commented 5 years ago

Being a long-time Python guy, I'm drawn to 'as'. But I think I vote for the final and more terse form.

joe-conigliaro commented 5 years ago

@medvednikov what do you think?

nedpals commented 5 years ago

@joe-conigliaro Coming from JS, and with a little experience with Python, I think the as keyword makes much more sense when aliasing module imports.

medvednikov commented 5 years ago

Sorry for being late :)

This is an important discussion.

I like import foo.bar syntax, that's why I merged it and cleaned it up a bit just now.

I still need to figure out the structure of modules, and where they will be stored, as well as the VROOT issue. I'll do it this week.

I really like Go's approach, so I think it will be similar. Except instead of

import "github.com/user/module"

we will have

import user.module

Relative imports should not be used. There must be only one way of doing imports, just like with anything in V.
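As a rough illustration of the "only one way" rule, a dotted import name can be mapped to exactly one directory under a module root. The sketch below is a Python model under stated assumptions: the `modules` root directory and the `module_dir` function are hypothetical, since where modules will be stored had not been decided at this point in the thread.

```python
from pathlib import PurePosixPath


def module_dir(import_name, root="modules"):
    """Map a dotted import name to a directory below a single module root.

    The root name "modules" is an assumption made for this sketch only.
    """
    parts = import_name.split(".")
    # Each component must be a plain identifier, so "..", "/" and empty
    # components (the relative forms) are rejected outright.
    if not all(part.isidentifier() for part in parts):
        raise ValueError(f"not a valid module path: {import_name!r}")
    return PurePosixPath(root, *parts)


print(module_dir("user.module"))      # modules/user/module
print(module_dir("encoding.base64"))  # modules/encoding/base64
```

Because every component must be a plain identifier, a relative form such as `../url` cannot be expressed at all, so each module has a single spelling.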

medvednikov commented 5 years ago

As for aliasing, I don't see a reason to do it differently from Go:

import foo my.long.module

This is also similar to

type myint int

joe-conigliaro commented 5 years ago

@medvednikov Thanks it is an important discussion :) Thanks for cleaning up the code.

MaD70 commented 5 years ago

@aguspiza I think I understand your concerns better now and it's an important point you are raising. Let me articulate it differently, to test if this is true (correct me if I'm wrong).

Your main concern here is: how to share modules across projects?

Of course, we don't want to copy a module into each project that needs it. I agree that it is better to avoid makefiles, project files and so on. In your specific example, with the current state of the compiler, I would opt for a single main_tests.v in src and a generic test infrastructure in v/vlib or v/thirdparty, but this doesn't answer your question.

At this point the only way to share modules across projects is to move them to v/vlib (generic modules, potentially useful to a lot of different program types) or to v/thirdparty (if I understand correctly the intended use of this directory, less generally useful modules).

We could have:

1) a third special directory for modules shared across projects (by a single programmer or organization), or
2) a compiler flag to instruct the compiler to search for additional modules, used by the current program under compilation, starting from a specific root directory; or
3) [update] limited to module main only, do something similar to the uses clause in Delphi:
```
import f first
import s second
import t third 'C:\projects\shared_modules\'
```
Option 3) is like a file system symbolic link, but implemented into the language (so it's portable). Its intent is to let the program access shared modules, via the submodule feature, starting from the specified directory.

Options 2) and 3) not only make sharing modules between projects easy but also, from the same project, testing modules exporting exactly the same interface but implementing the same functionalities differently or different releases of the same module.
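Options 2) and 3) both amount to searching an ordered list of root directories for a module. A minimal sketch of that lookup, in Python and purely as a model: `find_module` is a hypothetical name, and the idea that a module lives in a directory named after its dotted path is an assumption of this sketch, not a decided V convention.

```python
import os


def find_module(name, search_roots):
    """Return the first directory that contains module `name`, or None.

    name:         dotted module path, e.g. "encoding.base64"
    search_roots: directories tried in order, e.g. the standard library
                  first, then roots given by a flag or an import clause
    """
    relative = os.path.join(*name.split("."))
    for root in search_roots:
        candidate = os.path.join(root, relative)
        if os.path.isdir(candidate):
            return candidate
    return None


# Hypothetical usage: a shared directory is consulted after the standard one.
# find_module("third", ["v/vlib", "/projects/shared_modules"])
```

Swapping the order of the roots, or pointing a root at a different release of the same module, is what enables the testing scenarios described above without touching the importing code.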

These are just the first ideas that came to my mind. I think this problem should merit more attention, we should consider pros and cons of each proposed solution (perhaps a quick exploration of the comp sci literature related to modularization would be beneficial; I'm starting from this Google search software modularization survey but I cannot promise that I will devote much time to it).

joe-conigliaro commented 5 years ago

Here is a PR with updates: https://github.com/vlang/v/pull/1084

MaD70 commented 5 years ago

Module Systems survey: preliminary shallow report

Introduction

My hope was to find a survey of modularity mechanisms, or at least about module systems, in programming languages; a presentation of various concepts of "module" and what problems they intend to solve.

A web search wasn't very productive. I also searched the ACM Computing Surveys (CSUR) site but it seems such a survey doesn't exist.

Then I stumbled upon this thread on Lambda the Ultimate (LtU), a forum for programming language (PL) researchers: I Seek a Reasonable Survey on the Concept of "Module System". In a message someone noted that it's too big a topic.

Apparently the only thing left to do is to examine the most representative module systems one by one. I don't have time to study each module system thoroughly, hence the "shallow" in the title. I hope this quick report will be useful nonetheless.

Some module systems

One thing notable from the LtU thread is pages 11-12 of these slides from an invited talk, Advanced Module Systems: A Guide for the Perplexed (.ps) (.html):

A puzzle

Recent academic languages (SML, OCaml, MzScheme, etc.) offer complex module features [functors, sharing specifications, H-O / applicative / generative...], plus claims that their features are needed to build large software systems.

Most production languages (C, C++, Java, etc.) provide very simple module systems... and are believed to “work pretty well” for building large software systems.

So: Who is “right”? Or: better question...

What pragmatic issues motivate the features of advanced module systems?

When do we really need which features?

I think the powerful module system of the ML family of languages doesn't blend well with V language philosophy of simplicity: it has a sub-language devoted to modules, with module values, module types and functions on modules (computed at compile-time, see for example A Crash Course on ML Modules). So a module could be parameterized.

Another sophisticated module system is that of Scheme, in particular that introduced(?) with The Revised⁶ Report (R6RS), in Chapter 7, Libraries. This seems more pragmatic even if a bit overkill too (but remember that Scheme has a sophisticated macro system and this interacts with modules; V doesn't have macros).

The module system of Modula 3 is quite simple if one excludes generics, see Compilation Units: Modules and Interfaces and 2.5 Modules and interfaces from Modula-3: Language definition (.pdf). It has IMPORT and EXPORT sections and the possibility to import from a module a subset of exported elements, with FROM. Interfaces are akin, but more formal, to C header files (.h). Complication is introduced with Generics, where generic modules/interfaces have parameters.

About parameterization, again from Advanced Module Systems: A Guide for the Perplexed (.ps) (.html), p. 63:

Difficulties

Two forms of sharing by parameterization:

Parameterization over external modules (as above) Observation [MacQueen]: As dependency hierarchies become deeper, interfaces parameterized on modules scale badly.

Parameterization just over the abstract types from external modules (e.g., Haskell) Works.

From what I understand of them, V's generics seem to correspond to the second case.

TODO

Haskell
Ada(?)
Piccola
Architectures, components, connectors and so on(?)
…
A more fundamental question: Why do we need modules at all? by Joe Armstrong (Erlang). See related discussion on LtU.

Final note

We have to think about the interaction between V's structures and modules, which is analogous to the interaction between classes and modules in OO languages.

Suggestions and corrections are welcome.

medvednikov commented 5 years ago

import my.module as alias has been merged.

@MaD70 V modules are very simple, similar to Go's modules. They are just containers for types and consts. There's no exporting, all objects have to be accessed with a full path (mymodule.function()).

This simplifies things greatly.
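For illustration, the merged `import my.module as alias` form can be recognised with a very small parser. This is a Python sketch, not the V parser: `parse_import` and the regular expression are hypothetical, and the rule that an unaliased submodule is referred to by its last path component is an assumption of this sketch.

```python
import re

# "import a.b.c" optionally followed by "as alias"; identifiers only.
IMPORT_RE = re.compile(
    r"^import\s+([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)(?:\s+as\s+([A-Za-z_]\w*))?$"
)


def parse_import(line):
    """Return (module_path, local_name) for one import line."""
    match = IMPORT_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a valid import: {line!r}")
    module, alias = match.groups()
    # Assumption: without an alias, the last path component is the name
    # used as the prefix when accessing the module's objects.
    return module, alias or module.split(".")[-1]


print(parse_import("import my.module as alias"))  # ('my.module', 'alias')
print(parse_import("import encoding.base64"))     # ('encoding.base64', 'base64')
```

The other candidate spellings from this thread, such as `import module a`, are rejected by this grammar, which keeps a single way to write an alias.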

MaD70 commented 5 years ago

@medvednikov perhaps there is a misunderstanding here. The export clause in other programming language (PL) module systems has the same role as pub for V modules, from what I understand reading the V documentation on modules:

// To export a function we have to use `pub`
pub fn say_hi() {
  println('hello from mymodule!')
}

In other PLs, instead of having them scattered throughout the source code, they are on top with imports. But I read you opted to not allow the shortened import A, B, C, … syntax, so adding also a long list of export clauses doesn't make sense.

P.S.: what's your stance on the sharing of modules across projects?

medvednikov commented 5 years ago

@MaD70 ok I see.

Yes, exporting is done with pub.

what's your stance on the sharing of modules across projects?

There will be a centralized modules repository like Ruby Gems. I'll launch it this month.
