microsoft / qsharp

Azure Quantum Development Kit, including the Q# programming language, resource estimator, and Quantum Katas
https://microsoft.github.io/qsharp/
MIT License

Hierarchical Namespaces #493

Open bamarsha opened 11 months ago

bamarsha commented 11 months ago

Namespaces don't make sense in Q#. Their only benefit is in organization and avoiding name conflicts, but they don't help you define abstractions or use encapsulation/information hiding. A module system does a better job at organization than namespaces and is also a good unit of abstraction. Namespaces should be deprecated and replaced with modules.

sezna commented 10 months ago

Proposal: A Q# program is either an executable or a library, determined by the presence of either a Main.qs or a Lib.qs. In an executable, exactly one of the following holds (that is, XOR): Main.qs has top-level statements, or there is at least one @EntryPoint() in the compilation unit/package.

Modules will be declared by the keyword module. The qsc binary will be responsible for understanding module declarations, finding the corresponding files, and including them in the module tree.

The external package keyword pair will be used to define external packages. For now, we will support simple HTTP dependencies, expecting Q# files on the other end. These packages will be included as siblings to the current compilation unit or package. The package keyword will be used, similar to Rust's crate, to denote a module that is part of the current package. See the syntax example below:

external package "https://github.com/microsoft/qsharp/blob/main/library/std/intrinsic.qs" as Foo;

// importing from an external package
import Foo.bar;

// importing from a module that is declared in `Main.qs` or `Lib.qs`
import package.baz;

We will use . as the separator for modules. All module items will be private by default, and we will introduce an export keyword.

Put another way, we will basically adopt the Rust module system design, but we will use export instead of pub, package instead of crate, import instead of use, . instead of ::, and module instead of mod.

sezna commented 10 months ago

A couple more decisions have been made:

import package sounds better than external package, so let's use that.

All import package statements should be centrally located, likely in Main.qs. We could potentially just have a syntax element in which all import package statements must be contained. This syntax element, which we were calling the "manifest block" could eventually become a manifest file. If we support manifest declarations in the program as well as in a manifest file, we would support both the single file experience and the multi file experience.

edit: the above comment is outdated, as we now have a manifest where we can define package imports.

sezna commented 3 months ago

Ok, there have been more developments, and here is the latest design:

Hierarchical Namespaces and Packages in Q#

Summary

Q# is moving to hierarchical namespaces in preparation for supporting package publishing.

Terminology

The current system of Q# namespaces, where all namespaces are top-level identifiers associated with a collection of items, is called flat namespaces. The system we are moving to, which supports nested and implicit namespaces, will be called hierarchical namespaces.

Motivation

The movement from flat namespaces to hierarchical namespaces is motivated by our packaging story. Packaging is critical to the future of Q#, as it enables us to drive community growth and better support our third-party partners. Publishing a library is one of the most basic and rewarding aspects of participating in a programming community. By encouraging users to become library authors, we are enabling them to take some ownership over the future of Q#. Enabling our hardware and software partners to author their own libraries, to highlight features or provide compatibility with their quantum products, will also be a key part of Q# strategy moving forward; for example, libraries that provide abstractions over provider-specific capabilities.

To support packages, we need a controlled and ergonomic way to define a package’s API. Given a Q# project, what items within it are part of its public interface? The current flat namespaces in Q# have a few critical issues. Because items are public by default and namespaces have no hierarchy or tree structure, naming conflicts will be likely. Additionally, it is not clear what items will become a part of your package’s API, as there are no explicit means of exporting (everything is public by default). For these reasons, as well as other technical and ergonomic issues, it is necessary to introduce hierarchical namespaces to Q# before we introduce package publishing.

Differences Between Hierarchical and Flat Namespaces

  1. Implicit Namespace Scoping: Hierarchical namespaces do not require a namespace {} block around code. Instead, Q# code can be written directly in a file. The path to the file, with respect to the Q# project, will determine the resulting namespace name. For example, a file at <project>/src/Foo/Bar/Baz.qs will define its items as members of the Foo.Bar.Baz hierarchical namespace.
  2. Explicit Namespace Declarations: For backwards compatibility, existing namespace declarations will still work. However, their semantics will change. In the current, flat namespace Q# design, the namespace declaration namespace Foo.Bar {} declares a top-level flat namespace called “Foo.Bar”. In hierarchical namespaces, the above statement will declare a top-level namespace called Foo, and another namespace within Foo called Bar.
  3. import vs open: Currently, namespaces are opened with the open keyword. This will continue to work. Hierarchical namespaces will support import statements, to import specific items from namespaces. An open statement is semantically identical to a glob import. For example, open Foo.Bar is equivalent to import Foo.Bar.*. Open statements will be gradually deprecated as we move to exclusively hierarchical namespaces over the next couple of major versions.
  4. Internal-by-default semantics: All namespaces will have internal-by-default semantics. This means that all items within hierarchical namespaces are internal to the package itself. For an item to be visible outside of the package, it must be explicitly exported.
  5. export statements: An export statement will explicitly declare an item as being exported publicly, for consumers of a package to import. An export statement exports an item from its current location within the namespace hierarchy. For example, an export statement export { Baz } in the file src/Foo/Bar.qs will show up in the public interface as <package>.Foo.Bar.Baz.
  6. Main.qs convention: Projects and packages currently do not need a Main.qs file. This will continue to be true, but we will introduce a convention where Main.qs is treated as the root of the namespace hierarchy, if it exists. This allows for finer control of the public interface. For example, consider the following folder structure:
    MyQuantumLibrary/
      qsharp.json
      src/
        Foo.qs
        Bar.qs

    Anything exported from Foo.qs will show up in the hierarchical namespace MyQuantumLibrary.Foo.*. Likewise, anything from Bar.qs will show up as MyQuantumLibrary.Bar.*. But what if you want to expose an item directly as MyQuantumLibrary.<item>? In the current design of the language, there is no root of the project, so there is no way to do this. To fix this, we will introduce the notion of a Main.qs file.

    MyQuantumLibrary/
      qsharp.json
      src/
        Foo.qs
        Bar.qs
        Main.qs

    Now, if we export something from Main.qs, it will show up in the public interface as MyQuantumLibrary.<item>.

  7. Namespace Declarations in Hierarchical Namespaces: We will support nested namespaces using the existing syntax. Consider the following namespace declaration:
    namespace Microsoft.Quantum.Test {}

    This namespace, in Q# V1.0, declares a single top-level namespace called Microsoft.Quantum.Test – there is no nesting, the periods are just part of the identifier. In the new semantics, this would be syntactic sugar for the following:

    namespace Microsoft {
        namespace Quantum {
            namespace Test {}
        }
    }

    This allows existing namespace declarations to continue working, while also introducing nestable namespaces, accomplishing our objective of minimizing hard breaking changes. Note that for backwards compatibility to work, all explicitly declared namespaces will have to be declared at the root level (in other words, namespace declarations use absolute namespace names, not relative). This means that a namespace Microsoft.Quantum.Test declared in Project/src/Foo/Bar.qs will still be positioned in the namespace hierarchy as ProjectRoot/Microsoft.Quantum.Test.
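The path-to-namespace rule, the Main.qs root convention, and the desugaring of dotted namespace declarations described in the list above can be modeled roughly as follows. This is a minimal Python sketch rather than Q# or compiler code; the function names (`namespace_for_file`, `public_name`, `desugar`) are hypothetical, and it assumes that sources live under `src/` and that only a top-level `Main.qs` maps to the root.

```python
from pathlib import PurePosixPath


def namespace_for_file(source_path, src_dir="src"):
    """Map a project-relative file path to its namespace segments."""
    rel = PurePosixPath(source_path).relative_to(src_dir)
    segments = list(rel.with_suffix("").parts)
    # Assumption: a top-level Main.qs is the root of the namespace hierarchy.
    if segments == ["Main"]:
        return []
    return segments


def public_name(package, source_path, item):
    """Name under which an exported item appears to consumers of the package."""
    return ".".join([package, *namespace_for_file(source_path), item])


def desugar(dotted):
    """Expand `namespace A.B.C {}` into nested single-segment namespaces."""
    tree = {}
    node = tree
    for segment in dotted.split("."):
        node = node.setdefault(segment, {})
    return tree
```

Under these assumptions, `namespace_for_file("src/Foo/Bar/Baz.qs")` yields `["Foo", "Bar", "Baz"]`, an item `Baz` exported from `src/Main.qs` in `MyQuantumLibrary` is named `MyQuantumLibrary.Baz`, and `desugar("Microsoft.Quantum.Test")` produces three nested namespaces.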

Backwards Compatibility

While the semantics will change, we aim for hierarchical namespaces to be backwards-compatible with flat namespaces. To this end, we will continue to support open and namespace declarations in the ways mentioned above, as well as the existing format of the standard library.

The Standard Library and Re-exports

Hierarchical namespaces will support re-exporting items. By importing, and then exporting, a symbol, we can perform re-exports to organize library interfaces. This will function in the usual way that re-exports do in other languages, like Rust and TypeScript. This allows us to provide two interfaces for the standard library: the old, open-statement based style; and the new, import-based standard library:

// in stdlib/src/Main.qs
// we can put things we want in Std.* here
open Microsoft.Quantum.Diagnostics;
export { DumpMachine, /* whatever we want to export */ }

// in stdlib/src/Canon.qs
// we can put things we want in Std.Canon here
export { /* */ }

To facilitate easily re-exporting a namespace, we may want to introduce wildcard exports. This is not certain yet, though, as that could make it difficult to track where symbols are coming from. The above structure would allow users to import the standard library either via open Microsoft.Quantum.Canon or import Std.Canon.*. As we deprecate flat namespaces, we will deprecate usage of the flat-namespace standard library API.
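As a rough illustration of how a re-export lets two paths name the same item, here is a small Python model (not Q# and not the compiler's data structures). The `exports` table and the `export`, `reexport`, and `resolve` helpers are hypothetical; the only names taken from the design are `Std`, `Microsoft.Quantum.Diagnostics`, and `DumpMachine`.

```python
# Each namespace maps an exported name to the (namespace, item) pair
# where the item is originally defined.
exports = {}


def export(namespace, name, origin=None):
    """Record that `namespace` exports `name`; by default it is defined there."""
    exports.setdefault(namespace, {})[name] = origin or (namespace, name)


def reexport(source_ns, name, target_ns):
    """Import `name` from `source_ns` and export it again from `target_ns`."""
    export(target_ns, name, origin=exports[source_ns][name])


def resolve(path):
    """Resolve a fully qualified path to the item's original definition."""
    namespace, _, name = path.rpartition(".")
    return exports[namespace][name]


# Old-style location of the item, then the re-export from the package root.
export("Microsoft.Quantum.Diagnostics", "DumpMachine")
reexport("Microsoft.Quantum.Diagnostics", "DumpMachine", "Std")
```

In this model both `Std.DumpMachine` and `Microsoft.Quantum.Diagnostics.DumpMachine` resolve to the same definition, which is the property that would let the two standard library interfaces coexist during deprecation.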

bamarsha commented 3 months ago

I don't think flat vs. hierarchical is the main problem with Q#'s namespace system. For me, the major issue is that namespaces are open instead of closed, meaning that a namespace can be defined multiple times, and the items in the namespace are the union of the items in all partial namespaces with the same name. This is related to internal being the lowest visibility in Q#, which is too coarse. A private visibility is needed but can't be meaningfully defined without a closed unit of abstraction.

I didn't see this mentioned explicitly in the new design, but it seems like because there is a one-to-one mapping from files to namespaces, namespaces are implicitly closed because there can only be one file at a given path. But it looks like this is circumvented by explicit namespace definitions which are still supported. I would argue that private by default is much better than internal by default so while this is an improvement, I don't think it goes far enough.

While making each file implicitly declare its own namespace is a good idea, it looks like this design also has each file be at most one namespace, because explicit namespace definitions are treated as a backwards compatibility feature. This seems like an unnecessary limitation. It's another factor that encourages too coarse-grained units of abstraction, because it's easier to put more definitions in the same namespace than to create a new file.

I'm also wondering what is truly hierarchical about this design? For example, does Foo/Bar.qs in namespace Foo.Bar automatically import items from Foo.qs in namespace Foo? Can Foo.Bar import from Foo.Baz using a relative path like super.Baz? Is there a visibility modifier for items that can be seen by parent or child namespaces but not by sibling namespaces? I didn't see any of these mentioned.

sezna commented 3 months ago

> I'm also wondering what is truly hierarchical about this design? For example, does Foo/Bar.qs in namespace Foo.Bar automatically import items from Foo.qs in namespace Foo? Can Foo.Bar import from Foo.Baz using a relative path like super.Baz? Is there a visibility modifier for items that can be seen by parent or child namespaces but not by sibling namespaces? I didn't see any of these mentioned.

It should support semantically nested namespaces, so, e.g., if you have namespace Foo.Bar with item Baz, import Foo.Bar gives you access to Bar.Baz, and import Foo.Bar.Baz gives you access to Baz directly. The semantics are very similar to Rust's module import semantics.
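The import behavior described here can be sketched with a toy resolver. This is a Python model under stated assumptions, not the Q# implementation: namespaces are nested dicts, items are strings, and `lookup` and `import_` are hypothetical helpers. The glob case follows the earlier statement in the design that open Foo.Bar is the same as import Foo.Bar.*.

```python
# Nested dicts are namespaces; string leaves are items.
tree = {"Foo": {"Bar": {"Baz": "item Foo.Bar.Baz"}}}


def lookup(path, root):
    """Walk a dotted path through nested namespaces."""
    node = root
    for segment in path.split("."):
        node = node[segment]
    return node


def import_(scope, path, root=tree):
    """Bind the last path segment in `scope`; `.*` binds every member."""
    if path.endswith(".*"):
        scope.update(lookup(path[:-2], root))
    else:
        scope[path.rsplit(".", 1)[-1]] = lookup(path, root)
    return scope


by_namespace = import_({}, "Foo.Bar")   # Bar.Baz is now reachable
by_item = import_({}, "Foo.Bar.Baz")    # Baz is reachable directly
by_glob = import_({}, "Foo.Bar.*")      # same effect as `open Foo.Bar`
```

After `import Foo.Bar`, only the qualified name `Bar.Baz` resolves; `Baz` on its own resolves only after `import Foo.Bar.Baz` or the glob import.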

Also, right now, a namespace can technically contain a namespace in HIR. The rest of the compiler isn't capable of handling it yet, though. In HIR, Namespace items (in HIR's ItemKind) contain a vector of LocalItemIds. LocalItemIds are mapped to Items in qsc_hir/hir.rs -- the struct Package contains it in its field items. Items contain an ItemKind, which can be another namespace. So, we can already support nested namespaces in some ways. My full exploratory notes (pretty unpolished, sorry) are here.

> I don't think flat vs. hierarchical is the main problem with Q#'s namespace system. For me, the major issue is that namespaces are open instead of closed, meaning that a namespace can be defined multiple times, and the items in the namespace are the union of the items in all partial namespaces with the same name. This is related to internal being the lowest visibility in Q#, which is too coarse. A private visibility is needed but can't be meaningfully defined without a closed unit of abstraction.

When you say private, do you mean the same as Rust's private where it is entirely internal to a module? We were thinking that that is useful as a keyword in the future, but the default would be internal. This is because our audience likely doesn't want to deal with constructing a tree of public items that eventually reach the project root to define their public API, like how Rust does it.

> I didn't see this mentioned explicitly in the new design, but it seems like because there is a one-to-one mapping from files to namespaces, namespaces are implicitly closed because there can only be one file at a given path. But it looks like this is circumvented by explicit namespace definitions which are still supported.

Yes, unfortunately there will be a middle state where namespaces can still be re-opened. This is a compromise made for backwards compatibility. We do hope to deprecate the old syntax as LLMs, documentation, and existing projects migrate to the new system. In the future, this will allow us to deprecate namespace __ {} items entirely, effectively making namespaces closed.
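To make the open versus closed distinction from this exchange concrete, here is a small Python sketch (an illustration only, with hypothetical function names). It contrasts merging repeated declarations of the same namespace, which is the current "open" behavior, with rejecting a second declaration, which is what "closed" would mean.

```python
def merge_open(declarations):
    """Open namespaces: items are the union of all blocks with the same name."""
    merged = {}
    for name, items in declarations:
        merged.setdefault(name, set()).update(items)
    return merged


def define_closed(declarations):
    """Closed namespaces: defining the same name twice is an error."""
    defined = {}
    for name, items in declarations:
        if name in defined:
            raise ValueError(f"namespace {name} is already defined")
        defined[name] = set(items)
    return defined


# Two partial declarations of the same namespace, e.g. from two files.
declarations = [("Foo", {"A"}), ("Foo", {"B"})]
```

With these inputs, `merge_open(declarations)` gives `Foo` the items `A` and `B`, while `define_closed(declarations)` raises on the second block. A private visibility only has a well-defined boundary in the second model.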

sezna commented 2 months ago

Tracking: