fortran-lang / fpm

Fortran Package Manager (fpm)
https://fpm.fortran-lang.org
MIT License

Hosting of packages #4

Open certik opened 4 years ago

certik commented 4 years ago

Eventually we need to have a central place for packages similar to crates.io.

But for now we will use a git repository (GitHub, GitLab, and other hosts will work) or simply a URL to a tarball. That way we don't need to host anything ourselves at first, and we can build up the initial community and ecosystem of packages without worrying about the security and other issues that come with maintaining our own repository.

milancurcic commented 4 years ago

Although it's worthwhile to think ahead, I think we are far from this. It's a big technical challenge that requires dedicated hardware and people if it's to work smoothly.

In the interim, as you describe, we can maintain a registry that provides all the info about available packages that fpm needs, while the source code of each package is hosted wherever its maintainers host it (GitHub, GitLab, a custom URL, whatever). The downside to this approach is that if the maintainer takes down the package, or changes the URL, or GitHub is down, the package is unavailable through fpm. I think these are edge cases that we can live with and work around for a while, especially considering that Fortran's ecosystem is still fledgling.

Let's discuss what the registry would look like. How about we maintain a registry of fpm.toml files, one for each supported package, in fpm's repo? Something like this:

fpm/
  Registry/
    blas/
      fpm.toml
    lapack/
      fpm.toml
    stdlib/
      fpm.toml
    ...

fpm.toml for a package includes all the info that fpm would need to build the package, including but not limited to:

For a maintainer to add their package to fpm, they would open a PR in fpm to add their fpm.toml to the fpm Registry.

Am I going in the right direction?

certik commented 4 years ago

Actually, the registry would be just a list of urls to download the package, so:

https://github.com/certik/lapack.fpm
https://github.com/fortran-lang/stdlib
...

Each of these URLs will be either a tarball or a git repository. When you download it, it contains the fpm.toml file with all the metadata. We'll have code that simply downloads each package and extracts the metadata to create a nice (static) webpage and to allow searching from the command line (fpm search). We can automatically prepare a JSON file with package name, description, URL, etc., host it in some GitHub repo, and fpm would simply download it. (The registry might be a combination of version + URL, because a single package can have multiple versions, so one would use, e.g., git tags for different versions.)
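
To make the idea concrete, here is a minimal sketch in Python of how such an index could be assembled. It assumes the metadata has already been downloaded and parsed from each package's fpm.toml. The function name `build_index`, the JSON field names, the version strings, and the descriptions are all hypothetical placeholders, not an fpm format.

```python
import json


def build_index(registry, metadata):
    """Combine registry entries (url + version) with per-package metadata.

    registry: list of (url, version) pairs, e.g. one pair per git tag.
    metadata: dict mapping url -> dict parsed from that package's fpm.toml
              (the keys "name" and "description" are assumptions).
    """
    index = {}
    for url, version in registry:
        meta = metadata[url]
        entry = index.setdefault(
            meta["name"],
            {"description": meta.get("description", ""), "url": url, "versions": []},
        )
        entry["versions"].append(version)
    return index


# Hypothetical registry: the URLs come from the list above, the versions are made up.
registry = [
    ("https://github.com/certik/lapack.fpm", "0.1.0"),
    ("https://github.com/certik/lapack.fpm", "0.2.0"),
    ("https://github.com/fortran-lang/stdlib", "0.1.0"),
]
# Stand-in for what would be extracted from each package's fpm.toml.
metadata = {
    "https://github.com/certik/lapack.fpm": {
        "name": "lapack",
        "description": "Placeholder description",
    },
    "https://github.com/fortran-lang/stdlib": {
        "name": "stdlib",
        "description": "Placeholder description",
    },
}

index_json = json.dumps(build_index(registry, metadata), indent=2, sort_keys=True)
print(index_json)
```

The resulting JSON is the single file that fpm would download, so nothing has to be maintained by hand beyond the list of URLs and versions.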

Regarding the design of fpm, I would do exactly what Rust does. So a standard layout (which however can be disabled if you don't like it from Cargo.toml), pure Rust is automatic, non-Rust parts are compiled by hand by writing a build script (and listing it in Cargo.toml, then Cargo executes it before building Rust parts). From the build script you can call cmake or whatever build system one wants.

milancurcic commented 4 years ago

Can you explain why you need a separate (middle-man) repo for metadata, per package?

If fpm gets metadata from one repo, which would then instruct it to download the package tarball from a custom url and build it with some commands, then it would have to do that transaction every time you inquire about a package. To not query a remote repo on every command, you'd want to cache results, which basically means you'd be building a local registry of packages. But if you're building a local registry of packages, you might as well maintain the registry in one repo.

More problematically, without a local (or remote but aggregated) registry, how do you search for available packages? With Cargo I can do:

$ cargo search blas
blas = "0.20.0"                   # The package provides wrappers for BLAS (Fortran).
coaster-blas = "0.2.0"            # Coaster library for full BLAS support
rust-blas = "0.1.1"               # BLAS bindings and wrappers, fork of rblas
collenchyma-blas = "0.2.0"        # Collenchyma library for full BLAS support
blas-src = "0.4.0"                # The package provides a BLAS source of choice.
rblas = "0.0.13"                  # BLAS bindings and wrappers
blas-sys = "0.7.1"                # The package provides bindings to BLAS (Fortran).
cuda_blas = "0.1.0"               # cuBLAS API bindings.
popcorn-blas = "0.1.0"            # Popcorn BLAS: Broadcasting BLAS operations for Popcorn
netlib-blas-provider = "0.0.8"    # BLAS/LAPACK provider using the Netlib implementation
... and 54 crates more (use --limit N to see more)

Would fpm search only list urls? Or would it try to fetch metadata from any number of repos that match the pattern? This won't scale.
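
For comparison, searching an aggregated index is cheap. The following Python sketch assumes a single index that maps package names to a description and a list of versions; the layout, the `search` function, and the sample entries are hypothetical, and the last listed version is assumed to be the newest.

```python
def search(index, pattern):
    """Return Cargo-style result lines for packages matching `pattern`.

    A package matches if the pattern occurs (case-insensitively) in its
    name or its description.
    """
    pattern = pattern.lower()
    lines = []
    for name in sorted(index):
        entry = index[name]
        if pattern in name.lower() or pattern in entry["description"].lower():
            latest = entry["versions"][-1]  # assumes versions are listed oldest first
            left = f'{name} = "{latest}"'
            lines.append(f"{left:<30}# {entry['description']}")
    return lines


# Hypothetical aggregated index; descriptions and versions are placeholders.
index = {
    "blas": {"description": "Placeholder description for a BLAS package", "versions": ["0.1.0"]},
    "lapack": {"description": "Placeholder description for a LAPACK package", "versions": ["0.1.0", "0.2.0"]},
    "stdlib": {"description": "Placeholder description for stdlib", "versions": ["0.1.0"]},
}

for line in search(index, "blas"):
    print(line)
```

One lookup over one downloaded file answers the query, whereas fetching metadata from every matching repo would need one network request per package.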

Looking at my local .cargo/ directory, it doesn't seem like Cargo keeps an index of all packages locally (for many, many packages this doesn't scale either) but fetches from a remote registry (I assume crates.io).

milancurcic commented 4 years ago

> Regarding the design of fpm, I would do exactly what Rust does. So a standard layout (which however can be disabled if you don't like it from Cargo.toml), pure Rust is automatic, non-Rust parts are compiled by hand by writing a build script (and listing it in Cargo.toml, then Cargo executes it before building Rust parts). From the build script you can call cmake or whatever build system one wants.

Are you saying that for pure Fortran code (like stdlib at the moment), you wouldn't use a build system but do the build explicitly by directly invoking the compiler? In the long run I think this is a good choice but I'm worried that it'd be a big ordeal to implement because now you have to worry about building dependency trees and all the necessary stuff that CMake was doing for us.

Or, are you thinking about generating a CMakeLists.txt based on the scan of the source files and directories? Seems complex too.

certik commented 4 years ago

> Or, are you thinking about generating a CMakeLists.txt based on the scan of the source files and directories? Seems complex too.

Yes, that's how it is already implemented in this very small prototype of fpm. If you look here:

https://github.com/fortran-lang/fpm/tree/master/tests/1

All you have to do is execute fpm build in that directory: it generates the proper CMakeLists.txt and builds the project. Then fpm run runs the executable.

That's exactly how Cargo does it, and I think that's what we want also.

And yes, I agree with you that using CMake as the vehicle to actually build it is the way to go initially. All fpm has to do is to construct it properly.
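
As a rough illustration of "constructing it properly", here is a Python sketch that turns a list of source files into a CMakeLists.txt for a single executable. It is not the prototype's actual code: the function name `generate_cmakelists`, the set of suffixes, and the project and file names are assumptions made for this example.

```python
from pathlib import PurePosixPath

# File suffixes treated as Fortran sources (an assumption for this sketch).
FORTRAN_SUFFIXES = {".f90", ".F90", ".f"}


def generate_cmakelists(project, sources):
    """Return CMakeLists.txt text that builds one executable from `sources`.

    Non-Fortran files are ignored; the remaining files are listed in
    sorted order.
    """
    fortran = sorted(s for s in sources if PurePosixPath(s).suffix in FORTRAN_SUFFIXES)
    if not fortran:
        raise ValueError("no Fortran sources found")
    lines = [
        "cmake_minimum_required(VERSION 3.5)",
        f"project({project} Fortran)",
        f"add_executable({project}",
    ]
    lines += [f"  {src}" for src in fortran]
    lines.append(")")
    return "\n".join(lines) + "\n"


# Hypothetical project with one module and one main program.
print(generate_cmakelists("demo", ["main.f90", "utils.f90", "README.md"]))
```

Because CMake works out the compilation order of Fortran modules itself, the generator only has to list the files; it does not need to build a dependency tree.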

It already works, and my next step is to start doing the dependencies. Once we have a prototype of that, let's brainstorm how to host it properly. (Yes, I want fpm search to list names and descriptions just like Cargo does, so it needs to download some JSON description of all packages. What I am arguing is that we should maintain such a JSON description automatically, not by hand, by extracting it from the actual packages. We can discuss it later.)