fortran-lang / fpm

Fortran Package Manager (fpm)
https://fpm.fortran-lang.org
MIT License

Multiple libraries in a package #524

Open bcornille opened 3 years ago

bcornille commented 3 years ago

Would it be possible to extend the package_config_t to create multiple libraries? This is useful when you have multiple executables that share some module dependencies but also have distinct ones. These internal libraries could also depend on one another. Presumably this would affect the TOML specification and library_config_t.

awvwgk commented 3 years ago

We are currently not planning to support multiple library targets in a project. That doesn't mean you can't create the setup you describe with fpm. There are two options here:

  1. If you have modules that are shared between executables, examples, or tests, then those can be placed inside the respective directory (app, example, or test) and are automatically shared between the different targets. Those files won't be available to downstream projects that depend on your project.
  2. You can set up multiple fpm projects to implement the respective features in separate libraries and reuse them to build your application. If you can make your project more modular, this is the preferred way, as all libraries can be reused individually.
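
For illustration, option 2 could look roughly like the manifest below for the application package. The package names and the sibling-directory layout are made up for this sketch; only the `path` dependency form is existing fpm manifest syntax.

```toml
# fpm.toml of the application package (hypothetical names and layout)
name = "my-app"

[dependencies]
# Each library is its own fpm project, referenced by relative path.
common-utils = { path = "../common-utils" }
solver-core = { path = "../solver-core" }
```

A library that builds on another one would declare it the same way in its own manifest, e.g. solver-core listing common-utils under its [dependencies].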

bcornille commented 3 years ago

What is the rationale for only supporting a single library target in a project?

rouson commented 2 years ago

I too am curious about the rationale for only supporting a single library target. @awvwgk is item 2 in your comment the answer? Setting up separate projects for this purpose seems like a heavy lift -- especially if "separate project" means separate repositories, which then means separate documentation, separate unit tests, separate CI tests, etc. Furthermore, does the statement "all libraries can be reused individually" imply that there are no dependencies between libraries? If there are dependencies between the various libraries that a project builds, then it makes sense to keep them in one project. Even when there are no dependencies, it can make sense to keep them together if each serves a common purpose, e.g., each supports one interface in a different way.

awvwgk commented 2 years ago

It is a deliberate design decision. Think of it as a way to limit the complexity of a project and also the knowledge required when using a project as a dependency.

Is there a specific case you want to realize with fpm requiring the creation of a mono-repository with multiple projects?

LKedward commented 2 years ago

> ... which then means separate documentation, separate unit tests, separate CI tests, etc.

I think this is a good thing. If anything I want more modularity with regards to testing and documentation to make clear the compartmentalisation of functionality and interfaces between different libraries. I find this makes writing tests easier for developers and reading documentation easier for users compared to when dealing with large mono-repos.

> Furthermore, does the statement "all libraries can be reused individually" imply that there are no dependencies between libraries?

Not at all, rather I think it comes back to modularity again. If A depends on B unidirectionally, then B can still be reused without need for A.

As Sebastian has stated above, I'm not aware that fpm actually restricts any of the use cases described above. I too would be interested to hear more about a specific use case where fpm is currently not well-suited.

bcornille commented 2 years ago

I work on a project that has several orthogonal libraries that exist in the same repository. There is also a small hierarchy of interdependency. This way we have testing for each library independently. We build applications that depend on all libraries and some utilities that depend on subsets. Splitting the libraries into separate repositories would increase the maintenance burden, since each is on the smaller side and there are a few. The other option of combining everything into one library violates the principle of separation of concerns.

Since fpm is opinionated in this way, it precludes consideration for adoption. Even if our setup is viewed as bad practice, it is not going to change. In my opinion, if it is not a severe technical limitation, this is an unnecessary stance for fpm to take and is asymmetric with its support for multiple apps in one project.

rouson commented 2 years ago

I'm working with collaborators on a possible successor to OpenCoarrays, which is written primarily in C. The successor will be written as much as possible in Fortran and is already using fpm to build. Having multiple libraries that support one interface is one of the central aims of OpenCoarrays and its successor. The unified interface is what makes the different libraries swappable so parallel Fortran 2018 source code can be compiled into an object file once and then linked or relinked to any one of several backends, e.g., one that supports communication via MPI, another that supports communication via OpenSHMEM, etc. Splitting OpenCoarrays into separate projects would violate the basic purpose of the OpenCoarrays project in addition to adding all the previously mentioned burdens of maintaining separate repositories with multiple copies of the very same CI tests, documentation, Wiki, etc.

awvwgk commented 2 years ago

Thanks for the insights, those sound like valid use cases. Let's discuss how we can make this possible (cc @fortran-lang/fpm).


Let's explore how we could make this possible in the package manifest.

Proposal

1. Special syntax for a single executable:

The executable array of tables with a single executable is currently given as:

name = "lib"

[[executable]]
name = "lib"
source-dir = "app"

A possible equivalent syntax for a project with a single executable with the same name as the project could be:

name = "lib"

[executable]
source-dir = "app"

2. New syntax for multiple libraries

The current syntax for the library table:

name = "lib"

[library]
source-dir = "src"

This could also be expressed as an array with a single table using the project name:

name = "lib"

[[library]]
name = "lib"
source-dir = "src"

With this change we can keep the current package manifest syntax and easily make multiple libraries available.

rouson commented 2 years ago

@awvwgk thanks for considering this. I'm a minimalist so I generally put everything in default locations and most of my fpm manifests are 5-10 lines. Therefore I'm not very familiar with any of the syntax above and have no opinion other than that backward compatibility is one of the main reasons Fortran has survived ~64 years. Learning from that lesson, I think doing something that preserves the current package manifest syntax is a great thing.

awvwgk commented 2 years ago

The other issue is handling projects with multiple libraries.

For now we could require explicit declaration of all libraries in a project. The directory trees of two libraries must not overlap, i.e. a library source-dir cannot be a subdirectory of another library source-dir.

For projects with a dependency providing multiple libraries we can start by assuming all will be made available by default. Then we need additional syntax to narrow the available libraries (suggestions?).
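
To make the non-overlap rule concrete, a manifest with two libraries might look like the sketch below. It assumes the [[library]] array-of-tables form proposed earlier in this thread, which is only a proposal here and not existing fpm syntax; the package and directory names are invented.

```toml
# HYPOTHETICAL manifest: [[library]] as an array of tables is the syntax
# proposed in this thread, not something fpm is known to accept.
name = "demo"

[[library]]
name = "demo-core"
source-dir = "src/core"

[[library]]
name = "demo-io"
source-dir = "src/io"

# Under the suggested rule, a third library with
# source-dir = "src/core/extra" would be rejected,
# because it is a subdirectory of the "src/core" tree.
```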

certik commented 2 years ago

How does Cargo do it? Can somebody investigate and post a summary? If they don't need it, then maybe we don't need it either. Although I am not against it.

I noticed that for example the Rust compiler has a bunch of independent Cargo libraries that compose it:

https://github.com/rust-lang/rust/tree/master/compiler

Each directory there is a Cargo package. If you click on one, you see how they specify the dependencies between these packages: https://github.com/rust-lang/rust/blob/a8f6e614f86be429b5862f30e023063f619aeed2/compiler/rustc_borrowck/Cargo.toml

Is that the best possible design? I don't have an opinion yet.

awvwgk commented 2 years ago

The strategy with the Rust compiler and Cargo seems to be to have multiple projects in subdirectories and declare dependencies using relative paths. We can already do this with fpm, if there is at least one meta-package in the project root, which depends on all subpackages in the project. However, we don't have a way to access an fpm project which is not specified in the project root (yet). Depending on the meta-package would make all subpackages available.

This seems like a possible approach for this setup and is already possible now. Is this a viable setup and should we focus on improving fpm in this direction?
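
As a rough sketch of that layout, the meta-package in the repository root could declare each subpackage as a relative-path dependency. The names and subdirectories below are hypothetical, and whether the root package needs any sources of its own is left open here.

```toml
# fpm.toml in the repository root: a meta-package (hypothetical names)
name = "demo-meta"

[dependencies]
# Subpackages live in subdirectories, each with its own fpm.toml.
demo-core = { path = "core" }
demo-io = { path = "io" }
```

A downstream project would then depend on demo-meta only and get all subpackages through it.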

certik commented 2 years ago

> This seems like a possible approach for this setup and is already possible now. Is this a viable setup

From Cargo's experience I would say it's definitely viable.

> and should we focus on improving fpm in this direction?

Given how close we are, it seems it would be worth it to finish it to make it work.

However, is there some other design that is better? The only other design I can think of is to produce multiple libraries per package, as discussed in this issue.

How much would allowing multiple libraries complicate the fpm design? If it is a big complication, perhaps we don't need it and we can just use Rust's approach.

awvwgk commented 2 years ago

> From Cargo's experience I would say it's definitely viable.

We have discarded constraints present in Cargo in the past, e.g. having the link field restricted to a single library. Also, in this case the workaround would have been to declare multiple local subpackages which each link against one library and depend on each other in the order of the link line. Not sure if this approach is used in any Rust project, but I found the idea impractical.

Local subpackages seem to be Cargo's way to work around deliberate restrictions of the package manifest. The question is whether this way is convenient enough in practice for our users.

rouson commented 2 years ago

Earlier in this thread, I mentioned wanting to produce multiple libraries from one project for my work on a successor to OpenCoarrays. The aforementioned package is now open-source and had its first release earlier this month: Caffeine. It currently uses GASNet-EX as the back-end communication software, with plans to add alternative MPI and OpenSHMEM back ends. From last year's discussion, it seemed that fpm already had some of the basic infrastructure to satisfy the request described in this issue. Does anyone know whether the ability to produce multiple libraries will appear in an upcoming release?

awvwgk commented 2 years ago

Contributor time is always the constraint for implementing new features. I started working on introducing an [executable] table beside the [[executable]] array of tables a while ago and noticed some issues which make this change not entirely straightforward. I didn't really explore it any further for the [[library]] array of tables.