j3-fortran / fortran_proposals

Proposals for the Fortran Standard Committee

Fortran needs packaging ecosystem #55

Open certik opened 4 years ago

certik commented 4 years ago

Most other languages have that, whether Python, Julia, Go, Rust, JavaScript....

Goals:

This needs to be carefully designed, we need to learn from the above mentioned languages.

Some related projects to consider:

Note: initially opened at https://gitlab.com/lfortran/lfortran/issues/109.

zmiimz commented 4 years ago

imho, one of the best to consider is the dub manager from dlang project (it is also AIO: package manager and build system) https://github.com/dlang/dub

septcolor commented 4 years ago

FWIW, this site gathers statistics for package registries of various languages. We can see more details by clicking the name of the registries. http://www.modulecounts.com/

sblionel commented 4 years ago

Looks like all the languages mentioned are interpreted. Keep in mind that there's more to the Fortran world than Windows, Linux and Mac. The Fortran world seems to have gotten along well with libraries without something in the standard for a package manager. These sorts of things tend to go obsolete quickly, anyway. I don't see it as something appropriate to add to the standard, which doesn't even discuss what source file names look like.

gronki commented 4 years ago

While it might not be a good thing to put in the standard per se, it's definitely something that needs initiative from the standardization side. For example, because the format of module files has never been standardized (probably for the convenience of compiler vendors and to the annoyance of users), distributing Fortran packages is hell even in the current scope (for example, within one Linux distribution). So while the system itself is not dependent on the standard, to my knowledge the standard in its current form does not make it possible to build one.

sblionel commented 4 years ago

Given that compiled Fortran objects are not interoperable with those from different compilers, much less modules, I don't see a way forward for this proposal. The standard offers features (especially submodules) that help library developers. Build from source works.

Keep in mind that the Fortran standard doesn't say anything about the world outside the "processor" (compiler). Source lines are delivered by fairies in the night, input and output files are up to the whim of the environment, etc. One can use any packaging system that suits your fancy. What does one do for C or C++?

gronki commented 4 years ago

Yes you are right about that. So the distribution system then should be based on source packages and compiled as the package is downloaded.

Everyone: With that knowledge, what meta information would have to be included in the package?

certik commented 4 years ago

@sblionel, of the languages mentioned, Go and Rust are compiled, and Julia is a hybrid. Regarding your second question, there are two package managers specifically for C++: Conan and Vcpkg. Language-neutral package managers that many recommend for C++ (and Fortran!) are Spack and Conda (already linked in the issue description above).

@gronki thanks for the feedback.

Fortran needs a standard way to create and distribute libraries. There is a lot to improve.

What is not clear at this stage is what, if anything, needs to be improved in the Fortran standard itself. But there might be some things to improve there, and for that reason I would like to keep discussing it here.

I discussed this issue with many people, and there are generally two camps: those who advocate for a language-specific package manager (as in Julia, Python, ...) and those who advocate for packaging all languages (such as C++, Fortran, Python, ...) in language-neutral package managers such as Conda or Spack.

I would much prefer if we can figure out ways to just use Conda, Spack or another solution, so that we do not need to maintain things ourselves. However, there might be some Fortran specific things that we might need to figure out.

Regarding building from source (Spack) or distributing binaries (Conda), I think we need both. We need to build from source, as that is what is needed on HPC to produce an optimized static build for a specific architecture. But being able to distribute binaries (Conda) is also very helpful for users who just want to get something working quickly and do not want to wait hours to build all the dependencies.

septcolor commented 4 years ago

FWIW, D, Nim, and Chapel (as well as Rust and Go) are also compiled languages, each of which has its own package repository. Examples of major registries include Dub for D, Crates for Rust, Gopm for Go, Hackage for Haskell, and so on.

Personally, I think this kind of package registry does not (necessarily) have to be "in the standard", but it would be very nice if such a repository allowed users to find "good" packages and install them efficiently with the least trouble. Ideally, a package registry (or manager) provides a search mechanism for candidate packages, shows the maintenance level explicitly (e.g. validation/test results for major compilers/versions), shows dependencies (e.g. required 3rd-party libraries/versions), states the license explicitly (to facilitate open-source use), and provides feedback mechanisms such as popularity measures and issue reports.

septcolor commented 4 years ago

As for Rust, it is not only a new language but also the "most loved" one in the 2019 StackOverflow survey (https://insights.stackoverflow.com/survey/2019#most-loved-dreaded-and-wanted), and Microsoft appears to be considering it as a possible replacement for C/C++ (https://visualstudiomagazine.com/articles/2019/07/18/microsoft-eyes-rust.aspx https://msrc-blog.microsoft.com/2019/07/18/we-need-a-safer-systems-programming-language/). So I guess it may also provide a useful reference for various aspects, including package management (in comparison to more traditional languages like C++ and Java).

everythingfunctional commented 4 years ago

In order for this to work well, you would need to tie all the packages together with a standardized build tool. I've started putting the beginnings of this together in my own packages, but I haven't formalized it or properly automated the package management side of it yet. Basically, I put together a build tool that can scan the source tree and determine the dependency tree. Then I just manually add the src folder to the list in the build system and use git submodules to manage the dependencies. Take a look here and let me know what you think. It only works if everything is in modules and doesn't deal with submodules yet. I also have extended it to work with linking in C/C++ code in one project.
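The scanning step described above can be sketched roughly as follows. This is only an illustration in Python, not the actual tool: the function names (`read_tree`, `file_dependencies`, `build_order`) are made up for this example, and it assumes free-form `.f90` sources with one statement per line, everything in modules, and no submodules.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from pathlib import Path

# "module foo", but not "module procedure/function/subroutine ...".
MODULE_RE = re.compile(
    r"^[ \t]*module[ \t]+(?!(?:procedure|function|subroutine)\b)(\w+)",
    re.IGNORECASE | re.MULTILINE,
)
# "use foo", "use :: foo", "use, intrinsic :: foo".
USE_RE = re.compile(
    r"^[ \t]*use\b[ \t]*(?:,[ \t]*(?:non_)?intrinsic[ \t]*)?(?:::)?[ \t]*(\w+)",
    re.IGNORECASE | re.MULTILINE,
)


def read_tree(root):
    """Return {path: source text} for every .f90 file under root."""
    return {str(p): p.read_text() for p in Path(root).rglob("*.f90")}


def file_dependencies(sources):
    """Map each file to the set of files that define the modules it uses."""
    defined_in = {}
    for path, text in sources.items():
        for name in MODULE_RE.findall(text):
            defined_in[name.lower()] = path  # Fortran names are case-insensitive
    deps = {}
    for path, text in sources.items():
        used = {name.lower() for name in USE_RE.findall(text)}
        # Modules not defined in the tree (e.g. intrinsic ones) are ignored.
        deps[path] = {defined_in[m] for m in used if m in defined_in} - {path}
    return deps


def build_order(sources):
    """List files so that each one comes after the files it depends on."""
    return list(TopologicalSorter(file_dependencies(sources)).static_order())
```

Calling `build_order(read_tree("src"))` would give a compile order for a source tree; a circular `use` chain makes `TopologicalSorter` raise `graphlib.CycleError`.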

traversaro commented 4 years ago

> Regarding your second question, there are two package managers specifically for C++: Conan and Vcpkg.

It may be relevant that there has been some effort in the past to add Fortran support to vcpkg, even though it has not been merged upstream so far:

certik commented 4 years ago

@traversaro thanks for the update!

We are developing a Fortran Package Manager (fpm) here: https://github.com/fortran-lang/fpm/. Anyone is welcome to join us. It's very much a work in progress, and we will announce it once it is ready for users. If anyone wants to help us get there faster, please do join.

everythingfunctional commented 4 years ago

To follow up on @certik's comment, the latest developments have made fpm usable, provided you have no dependencies. Dependency support is the next step. I have some vague idea about how to implement a minimal version, but I need to find a few hours to dedicate to it.

wolfv commented 4 years ago

FPM is implemented in Haskell?

We are actively working on mamba (https://github.com/quantstack/mamba) again, which is becoming a complete rewrite of conda in C++ -- this will shed the dependency of conda for a Python interpreter and make it much more lightweight. In the end, you'll be able to drop a statically compiled binary on a system and use it as package manager -- and it works on Linux, Windows and OS X. Mamba is also based upon well established dependency management libraries (libsolv, and libcurl / libarchive). So not too much NIH.

So far we're following conda's ideas very closely to make it 99% interoperable with existing conda packages and environments.

We are also toying around with the idea of adding source distribution capabilities, which would be part of mamba & the yet to be made mamba-build.

If the only thing you're missing from Conda is the ability to distribute source easily, maybe we can formulate a plan together to add this to mamba? I think yet-another language specific package manager is not the way to go (but I am not a fortran expert so there might be good reasons, which I didn't see in this thread at least).

certik commented 4 years ago

@wolfv thanks for getting in touch. Yes, you and I talked about this, and I also talked at length with @SylvainCorlay and discussed this exact question on your Gitter. We also discussed it with the Julia developers a few times.

FPM is still just a prototype, I started it in Rust, but I really wanted @everythingfunctional to join our effort and he already had a similar version implemented in Haskell, so I convinced @milancurcic to switch to Haskell for the prototype. For the production version, I still think it should be Rust or C++, to make it easier for people to contribute. But let's discuss that later, for the prototype it doesn't matter from the user perspective, as long as it produces a statically linked binary, which Haskell does.

About 80% of the arguments are the same for Rust as for Fortran. So let's discuss Rust, because it already has a mature ecosystem. Why couldn't Rust just use Conda? There are multiple reasons:

In addition to these, Fortran has a few specific things:

FPM will also have Fortran specific knowledge, such as figuring out the dependencies between modules, and enforcing proper module naming convention based on where things are in the filesystem, and enforcing a Fortran specific layout. I don't know how that could be done with Mamba, as this is really Fortran specific.

Also, we want FPM to eventually become the default front end to Fortran: compiler-independent invocation (i.e. you can use a compiler of your choice, and FPM will figure out the different ways Fortran compilers are called), easy creation of new projects, all kinds of checks, automatic formatting, etc. (just like Cargo does for Rust; you don't call rustc by hand, you just call cargo).
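To make "compiler-independent invocation" concrete, here is a rough Python sketch of the idea, not how fpm actually does it. The table and the `compile_command` helper are hypothetical; the only compiler facts it relies on are that gfortran takes `-J` and that ifort and nvfortran take `-module` to choose the directory where `.mod` files are written.

```python
from pathlib import PurePosixPath

# Option each compiler uses to select the output directory for .mod files.
MODULE_DIR_FLAG = {
    "gfortran": "-J",
    "ifort": "-module",
    "nvfortran": "-module",
}


def compile_command(compiler, source, build_dir):
    """Build the argument list that compiles one source file to an object.

    Hypothetical helper: the caller names a compiler and the front end
    supplies the compiler-specific spelling of the options.
    """
    try:
        module_flag = MODULE_DIR_FLAG[compiler]
    except KeyError:
        raise ValueError(f"unsupported compiler: {compiler}") from None
    obj = f"{build_dir}/{PurePosixPath(source).stem}.o"
    return [compiler, "-c", source, module_flag, build_dir, "-o", obj]
```

The returned list can be passed directly to `subprocess.run`, so the same project description works regardless of which compiler the user picked.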

In general, we are aiming for a smooth and nice user experience, just like Cargo delivers it for Rust.

Let's discuss more if you are interested.

wolfv commented 4 years ago

Thanks for the lengthy reply! I know you did your homework thoroughly :)

Regarding source distribution: I don't see anything that would prevent this in Mamba -- conda packages are (almost) just tarballs of whatever was installed into the prefix that wasn't there before. So if your build script just copies the source over to some magic directory, then I think that's totally fine.
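As a rough illustration of the "tarball of whatever is new in the prefix" idea (a simplified sketch, not conda-build's actual implementation; real conda packages also carry metadata, and the function names here are made up):

```python
import tarfile
from pathlib import Path


def snapshot(prefix):
    """Return the set of file paths (relative to prefix) currently installed."""
    prefix = Path(prefix)
    return {p.relative_to(prefix) for p in prefix.rglob("*") if p.is_file()}


def pack_new_files(prefix, before, archive):
    """Archive every file that appeared in prefix since the `before` snapshot."""
    prefix = Path(prefix)
    new_files = sorted(snapshot(prefix) - before)
    with tarfile.open(archive, "w:bz2") as tar:
        for rel in new_files:
            # Store paths relative to the prefix so the package is relocatable.
            tar.add(prefix / rel, arcname=rel.as_posix())
    return [rel.as_posix() for rel in new_files]
```

A build would call `snapshot(prefix)` first, run a build script that copies Fortran sources (or installs binaries) into the prefix, and then call `pack_new_files` to produce the package.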

I understand that it's nice to have the build system and the package manager integrated tightly. In my opinion those are two slightly different roles.

We definitely want to do a conda-compatible mamba-build as well which should be much faster. With conda-build or mamba-build nothing prevents you today from adding a package fortran-build-scripts that contains some shell scripts, depends on cmake etc. so that building Fortran packages becomes a one-liner in the meta.yaml. Here is a sample meta.yaml (for others, that's how one expresses dependencies and build steps in conda):

package:
  name: my_super_fortran_pkgs
  version: 0.12.2

source:
  url: https://.../download.tar.gz

build:
  script: fcomp -DSOME_ARG -MHELLO_WORLD

requirements:
  build:
    - my_fortran_buildscripts
    - {{ FORTRAN_IMPL }}
  host:
    - some_dependency 0.14.*

In this case, fcomp would be a shell script (or some other executable) that's part of the my_fortran_buildscripts package. I have a conda-forge enhancement proposal that I want to push next week that would add this kind of build script to conda-forge, at least for CMake and autogen.

One other thing I want to mention: I think the API surface of mamba is somewhat cleaner. For example, here is an example on how one can use the mamba API from Python to get a solution for a set of package specs:

https://gist.github.com/wolfv/cd12bd4a448c77ff02368e97ffdf495a

So if you wanted to, you could also build on top of Mamba (and conda packages) and implement the build system as a layer on top of mamba (the same APIs shown in Python are obviously available from C++ as well). These APIs will cover everything from prefix activation to repodata downloading, and then package dependency solving and installation.

I would be incredibly excited if you decided to do this with us, and obviously I would be happy to discuss this further.

certik commented 4 years ago

@wolfv thanks for the reply. Yes, we would love to collaborate!

Here is what we really care about: the end user experience. Here is our initial tutorial that explains how to use fpm:

https://github.com/fortran-lang/fpm/blob/ed5dd080d45ea4a409e63a5f9b2ff26f1d82d2db/PACKAGING.md

Everything in there already works with the current fpm, but obviously fpm is still a prototype. As I mentioned, it is heavily inspired by Cargo, so if you want to play with a good, well-designed production tool, play with Cargo a little bit.

We are completely open about the underlying technology, but we really care about the end user experience, which we want to be exactly (or very close to) what is in the above PACKAGING.md document. The key part is that users just write a simple fpm.toml file:

name = "hello"
version = "0.1.0"
license = "MIT"
author = "Jane Programmer"
maintainer = "jane@example.com"
copyright = "2020 Jane Programmer"

and fpm figures out how to build the project from the file layout (the same idea as Cargo). It knows how to build the application / executable (if present), library (if present) and tests / benchmarks (in the future).
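A minimal Python sketch of what "figuring out the build from the file layout" could look like. The directory names used here (`src/` for the library, `app/main.f90` for the executable, `test/main.f90` for tests) are an assumed Cargo-like convention for illustration, and `detect_targets` is a hypothetical name, not fpm's API.

```python
from pathlib import Path


def detect_targets(project_root):
    """Infer build targets from directory conventions alone.

    Assumed layout: src/ holds library sources, app/main.f90 is the
    executable entry point, test/main.f90 is the test driver.
    """
    root = Path(project_root)
    targets = {}
    src = root / "src"
    if src.is_dir():
        sources = sorted(p.relative_to(root).as_posix() for p in src.rglob("*.f90"))
        if sources:
            targets["library"] = sources
    for kind, directory in (("executable", "app"), ("test", "test")):
        main = root / directory / "main.f90"
        if main.is_file():
            targets[kind] = [main.relative_to(root).as_posix()]
    return targets
```

With this approach the manifest only needs to carry metadata such as name and version; which targets exist is implied by which directories are present.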

So for example, we do want fpm to be able to generate a Conda package, in fact we already have an issue for it: https://github.com/fortran-lang/fpm/issues/70

In there the easiest would be to simply call fpm from meta.yaml. That's similar to what you mentioned in your last comment.

Once we have dependency management (we'll start working on that very soon) there might be a way to link with mamba to help out there too.

wolfv commented 4 years ago

What I am proposing is to use mamba as the tool to do everything related to "build-environment" and dependency management, as well as installing third party dependencies (or sources) into the environment.

I believe you could already achieve that with what we have in mamba today:

You can define some dependencies and install them into a build environment, then activate the environment (prefix) and build your package in that context. We might have to think about how we can do source packages well in mamba / as conda packages but I am convinced that there are great solutions out there that don't require a lot of work to get done.

Do you guys have some sort of regular meeting / video chat? I would be happy to drop by to see how we could work together if you're interested.

wolfv commented 4 years ago

This is the basic mamba CLI right now, which can create new prefixes based on conda packages: https://gist.github.com/wolfv/4827a7c18ffae89242cbc46ddf012b4e

certik commented 4 years ago

@milancurcic literally just yesterday suggested to have a video chat. @milancurcic would you have time to set it up with @wolfv, @everythingfunctional and others? Let's brainstorm this.

Honestly, using Conda for non-Fortran dependencies, especially on macOS and Windows, would really make the user experience awesome. Things like HDF5 are notoriously slow to install, and just being able to install a binary would go a long way. For Fortran stuff I think we still want to build them ourselves, but let's brainstorm. I think there is a huge opportunity for collaboration.

milancurcic commented 4 years ago

> Do you guys have some sort of regular meeting / video chat? I would be happy to drop by to see how we could work together if you're interested.

@wolfv Great, thank you, I appreciate your time! I sent an email.

odiferousmint commented 3 years ago

Just because no one has mentioned it yet: OCaml has a pretty great package manager too, called opam: https://opam.ocaml.org/ (https://github.com/ocaml/opam). In all fairness, I never built it from source, I just grab the binary, which is all you need to run opam. For the curious, on my current system ldd /usr/bin/opam prints:

    linux-vdso.so.1 (0x00007ffddfc59000)
    libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f8c17e81000)
    libglpk.so.40 => /lib/x86_64-linux-gnu/libglpk.so.40 (0x00007f8c17ba2000)
    libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007f8c17b8f000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f8c17b73000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f8c17a24000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f8c17a1e000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f8c17a01000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8c1780f000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f8c1895f000)
    libcolamd.so.2 => /lib/x86_64-linux-gnu/libcolamd.so.2 (0x00007f8c17806000)
    libamd.so.2 => /lib/x86_64-linux-gnu/libamd.so.2 (0x00007f8c177fb000)
    libltdl.so.7 => /lib/x86_64-linux-gnu/libltdl.so.7 (0x00007f8c177f0000)
    libgmp.so.10 => /lib/x86_64-linux-gnu/libgmp.so.10 (0x00007f8c1776c000)
    libsuitesparseconfig.so.5 => /lib/x86_64-linux-gnu/libsuitesparseconfig.so.5 (0x00007f8c17765000)

There is also a popular build system developed by Jane Street: https://opam.ocaml.org/packages/dune/.