fmease / lushui

The reference compiler of the Lushui programming language
Apache License 2.0

Grand package system overhaul #123

Closed fmease closed 2 years ago

fmease commented 2 years ago

Meta: Task: Write description. Note: We want to partially copy Cabal's package system (however, with explicit components (formerly known as crates or capsules) instead of a list of exposed modules). That is, we move away from Rust's package system: no "dev-dependencies", for example, but component-specific dependencies. Also: not just one library per package but several (with a "default" one) (see Cabal's sublibrary system, also this blog (a bit barren on the motivation though)).

The part of the package manifest describing components should be changed to look something like this:

components: [
    {
        type: "library",
        name: "hello",
        dependencies: {
            core: {},
        },
    },
    {
        type: "executable",
        name: "example",
        # dependencies: { "hello": {} } # this is implicit right now, but once we allow multiple libraries per package,
        # it becomes problematic since it is not flexible enough
        dependencies: {
            core: {},
        }
    }
],

Or with a "common" definition (terminology will probably change, this is straight up taken from Cabal):

components: [
    {
        use: "default",
        type: "library",
        name: "hello",
    },
    {
        use: "default",
        type: "executable",
        name: "example",
    }
],
commons: [
    {
        name: "default",
        dependencies: {
            core: {},
        }
    }
]
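
A minimal Rust sketch of how the `use` field could be resolved against `commons`. All names here (`Dependency`, `Common`, `Component`, `resolve_dependencies`) are hypothetical, and the rule that a component's own entries override inherited ones is an assumption, not something decided in this issue.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory form of a dependency entry; real entries would carry
/// more fields (provider, path, ...).
#[derive(Clone, Debug, Default, PartialEq)]
struct Dependency {
    version: Option<String>,
}

type Dependencies = BTreeMap<String, Dependency>;

/// A shared stanza, i.e. one element of the `commons` list.
struct Common {
    name: String,
    dependencies: Dependencies,
}

/// A component as written in the manifest, before `use` is resolved.
struct Component {
    name: String,
    /// The `use` field (`use` itself is a Rust keyword).
    uses: Option<String>,
    dependencies: Dependencies,
}

/// Computes the effective dependencies of `component`.
/// Assumption: entries written on the component win over inherited ones.
fn resolve_dependencies(
    component: &Component,
    commons: &[Common],
) -> Result<Dependencies, String> {
    let mut resolved = Dependencies::new();
    if let Some(wanted) = &component.uses {
        let common = commons
            .iter()
            .find(|common| &common.name == wanted)
            .ok_or_else(|| {
                format!("unknown common `{wanted}` used by component `{}`", component.name)
            })?;
        resolved.extend(common.dependencies.clone());
    }
    // Inserting afterwards overwrites inherited entries with the same key.
    resolved.extend(component.dependencies.clone());
    Ok(resolved)
}
```

With the manifest above, both `hello` and `example` would end up with the single dependency `core`, and a `use` that names a missing common is reported as an error instead of being silently ignored.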

For the record, the equivalent to the above in the current version looks like this (although some stuff below could've been omitted for brevity):

library: { name: "hello" },
executables: [{ name: "example" }],
dependencies: {
    core: {},
},
fmease commented 2 years ago

I hope that by allowing multiple libraries (which can be public or private to a package), we won't end up in the awkward position of needing an equivalent of Cargo's workspaces and source/ directories full of packages instead of components. See rust-analyzer, for example, which has a crates/ folder that technically contains packages, not crates. This is not just a matter of definition: it forces package authors to create a bunch of "dead"/vacuous package manifests, each of which specifies a meaningless version (0.1.0 or 0.0.0, not sure which one).

Note however that Cabal (our new inspiration after Cargo) still uses so-called project files (cabal.project), which are similar to Cargo workspaces, I believe, and which are used to manage compiler flags (optimizations, debug output, common conditional compilation, tools (docs, code coverage, …)) for several packages at once.

Maybe we can cram everything into the package manifest by allowing packages to contain packages, or simply use a different mechanism for compiler flags. Hmm, I don't know much about this topic, I've never specified compiler flags (target.profile??) in a Cargo.toml etc. So let's see. This is not a pressing matter right now ^^.

fmease commented 2 years ago

Can one depend on (exposed) sublibraries of a package? Consider a package dependency alpha with exposed libraries alpha and beta. Then alpha is the default/primary library of the package since the component and the package share the same name.

Let's say one of our components wants to depend on the sublibrary beta contained within package alpha, or, more concisely, we want to depend on alpha.beta (<package>.<library>).

dependencies: {
    beta: { name: "alpha.beta" },
    # for the sake of completeness and for further intuition/clarification:
    # alpha: {}, # means the same as
    # alpha: { name: "alpha" }, # means the same as
    # alpha: { name: "alpha.alpha" },
},

I am not sure about the notation for sublibraries: should we use ., :, /, …?

Note for future self: If someone writes "alpha.beta": {}, we should offer a suggestion (obviously).

Implementation note: We should create a new type PackageName (which is basically ComponentName) for package names; it looks cleaner. Those "qualified" component names should be implemented as a ComponentPath, which is basically { package: Option<PackageName>, component: ComponentName } (whose constructor immediately normalizes the case package-name = component-name to package-name = None).
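
A rough Rust sketch of that implementation note. The `parse` helper and the choice of `.` as the separator are assumptions for illustration (the notation is still undecided above), and identifier validation is left out.

```rust
/// Newtype for component names (validation of the identifier is omitted).
#[derive(Clone, Debug, PartialEq, Eq)]
struct ComponentName(String);

/// Newtype for package names; structurally the same as `ComponentName`.
#[derive(Clone, Debug, PartialEq, Eq)]
struct PackageName(String);

/// A possibly package-qualified component name such as `alpha.beta`.
#[derive(Clone, Debug, PartialEq, Eq)]
struct ComponentPath {
    package: Option<PackageName>,
    component: ComponentName,
}

impl ComponentPath {
    /// Drops the package part when it equals the component name, so that
    /// `alpha.alpha` and `alpha` end up as the same value.
    fn new(package: Option<PackageName>, component: ComponentName) -> Self {
        let package = package.filter(|package| package.0 != component.0);
        Self { package, component }
    }

    /// Hypothetical parser for `<component>` or `<package>.<component>`.
    /// Returns `None` for empty segments or more than one separator.
    fn parse(source: &str) -> Option<Self> {
        fn is_valid(segment: &str) -> bool {
            !segment.is_empty() && !segment.contains('.')
        }

        let (package, component) = match source.split_once('.') {
            Some((package, component)) => (Some(package), component),
            None => (None, source),
        };
        if !is_valid(component) || !package.map_or(true, is_valid) {
            return None;
        }
        Some(Self::new(
            package.map(|name| PackageName(name.to_string())),
            ComponentName(component.to_string()),
        ))
    }
}
```

Normalizing in the constructor means equality on `ComponentPath` can be derived, and every later stage sees only one representation of the default library.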

fmease commented 2 years ago

How should we tackle "providers" (the resolution scheme of dependencies)? Cargo has the file system, git and "registries" (categorization correct?). Should we make this explicit? Currently it's "implicit" like Cargo's system: specifying the key path makes the "provider" the filesystem, specifying nothing makes it use the default registry crates.io, and using the key git sets the provider to "git". It's less structured imo but also more flexible, e.g. allowing both filesystem and crates.io (afaik).
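
To make the implicit scheme concrete, here is a small Rust sketch that derives the provider from the keys present in a dependency entry. `Provider` and `infer_provider` are hypothetical names, and rejecting an entry that has both `path` and `git` is an assumption made for the sketch.

```rust
/// The provider kinds implied by the keys of a dependency entry.
#[derive(Debug, PartialEq, Eq)]
enum Provider {
    Filesystem,
    Git,
    Registry,
}

/// Infers the provider from the keys of one dependency entry.
/// Assumption: `path` together with `git` is treated as ambiguous.
fn infer_provider(keys: &[&str]) -> Result<Provider, String> {
    let has = |key: &str| keys.iter().any(|present| *present == key);
    match (has("path"), has("git")) {
        (true, true) => Err("both `path` and `git` are specified".to_string()),
        (true, false) => Ok(Provider::Filesystem),
        (false, true) => Ok(Provider::Git),
        // Neither key is present: fall back to the default registry.
        (false, false) => Ok(Provider::Registry),
    }
}
```

The last arm is exactly the case that worries me later in this thread: an entry with no distinguishing key silently means "go to the registry".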

I am currently thinking of 4 providers right now: distribution, filesystem, registry and git.

Example of explicit provider specification:

dependencies: {
    core: { provider: "distribution" },
    something: { provider: "filesystem", path: "…" },
    json: { provider: "registry", registry: "…", version: "^2.3.1" },
    toml: { provider: "registry", version: "^1.0.55" },
    yaml: { provider: "git", url: "git://…", revision: "…", branch: "..." },
},
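
One way the explicit variant could be modeled in Rust is a data-carrying enum with one variant per `provider` value from the example above. This is only a sketch: `Entry` is a hypothetical stand-in for a parsed manifest entry, and which fields are required per provider (e.g. `version` for registries) is an assumption.

```rust
use std::collections::BTreeMap;

/// One variant per `provider` value used in the example.
#[derive(Debug, PartialEq, Eq)]
enum Provider {
    Distribution,
    Filesystem { path: String },
    Registry { registry: Option<String>, version: String },
    Git { url: String, revision: Option<String>, branch: Option<String> },
}

/// Hypothetical stand-in for a parsed manifest entry: field name -> value.
type Entry<'a> = BTreeMap<&'a str, &'a str>;

fn optional(entry: &Entry, key: &str) -> Option<String> {
    entry.get(key).map(|value| value.to_string())
}

fn required(entry: &Entry, key: &str) -> Result<String, String> {
    optional(entry, key).ok_or_else(|| format!("missing field `{key}`"))
}

/// Turns the explicit `provider` tag plus its fields into a `Provider`.
fn parse_provider(entry: &Entry) -> Result<Provider, String> {
    match required(entry, "provider")?.as_str() {
        "distribution" => Ok(Provider::Distribution),
        "filesystem" => Ok(Provider::Filesystem { path: required(entry, "path")? }),
        "registry" => Ok(Provider::Registry {
            // No `registry` field means the default registry.
            registry: optional(entry, "registry"),
            version: required(entry, "version")?,
        }),
        "git" => Ok(Provider::Git {
            url: required(entry, "url")?,
            revision: optional(entry, "revision"),
            branch: optional(entry, "branch"),
        }),
        other => Err(format!("unknown provider `{other}`")),
    }
}
```

The upside of the explicit tag is that a missing or misspelled field becomes a hard error tied to one provider, instead of changing which provider is picked.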

Implicit system:

fmease commented 2 years ago

In the new system, we should no longer make components implicit, i.e. we should no longer try to discover components at specific locations if the manifest lists nothing about them (first executable at source/main.lushui, other executables at source/executables/*.lushui (we have not designed or implemented that yet anyway), library at source/library.lushui).

If they are absent from the manifest, they won't belong to the package. It feels better/less error-prone, esp. if we go for the design where we list components in a list at field components. It feels weird to have a list which does not mention every component.

It becomes easier to add options for components, too, if there is already an entry. Most people should/are going to use lushui new/init anyways, so we can just generate that stuff without anybody needing to suffer.

fmease commented 2 years ago

Now that we want to support multiple libraries per package, we should no longer implicitly add them to the list of dependencies of the executables etc. of the package. This allows us to define "package-local" "development dependencies" and "package-local" dependencies for "package-local" libraries. But what does this look like syntactically? Our current design of "providers" kind of forbids us from copying Cabal's approach, where dependencies without a version specification always(?) refer to "package-local" libraries ("sublibraries"/"internal libraries" in Cabal's terminology), since "path dependencies"/"filesystem dependencies" don't need a version either. And I would dislike a situation where mistyping the name of a "package-local" library may lead to the package manager downloading a package from a remote registry.

First design (too clunky/verbose): via the “provider system” (described in the comment chain above):

name: "alpha",
components: [
    {
        type: "library",
        dependencies: {
            "beta": { provider: "package" }, # should we be able to say `name: "alpha.beta"` here, referencing `alpha`?
        },
    },
    {
        type: "library",
        name: "beta", # alpha.beta
        dependencies: {
            core: { provider: "distribution" },
        },
    }
],
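
A tiny Rust sketch of the lookup that `provider: "package"` would imply: the name is only searched among the libraries of the same package, and a miss is an error rather than a fallback to a registry. `resolve_package_local` and its signature are hypothetical.

```rust
/// Resolves a dependency declared with `provider: "package"`.
/// Only the libraries of the current package are considered; there is
/// deliberately no fallback to a remote registry.
fn resolve_package_local<'a>(
    name: &str,
    local_libraries: &'a [String],
) -> Result<&'a str, String> {
    local_libraries
        .iter()
        .map(String::as_str)
        .find(|library| *library == name)
        .ok_or_else(|| format!("the package does not contain a library named `{name}`"))
}
```

So a typo like `betta` fails right away instead of triggering a download.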
fmease commented 2 years ago

Work started in 2360544279fb1676f88dedc3093b0c90f714fb4a.