Uthar / nix-cl

Utilities for packaging ASDF systems using Nix
BSD 2-Clause "Simplified" License

Questions #1

Open teu5us opened 2 years ago

teu5us commented 2 years ago

Glad my issue about reusing quicklisp-to-nix was noticed.

I started writing an overlay for Common Lisp in December, but there were questions I could not answer myself. Now I think that I overcomplicated things. Anyway, here are the things I could not decide on:

  1. Usually we have several .asd files with different systems. Should those be separate nix packages or not?
  2. Should we run tests, or can they be ignored?
  3. Systems can have different dependencies on different implementations (noticed in tests, where sbcl has sb-rt and ccl requires rt)

Wonder what your thoughts on these are.

Also, I thought that CL in nix should not depend on quicklisp, having its own package set. But earlier today I had a thought that that could have been too much.

For constructing CL_SOURCE_REGISTRY I had a function similar to the one I found in your code. I have just written a setup-hook to replace that here.

With this mess I wanted to generate package definitions without quicklisp.

Overall my nix knowledge is not really great, and I would love to contribute instead of writing from scratch.

teu5us commented 2 years ago

I remember now that I had problems with the old implementation and quicklisp in general: either I could not download anything from quicklisp for unknown reasons, or hashes did not match when using nix.

Uthar commented 2 years ago

I started writing an overlay for Common Lisp in December, but there were questions I could not answer myself. Now I think that I overcomplicated things. Anyway, here are the things I could not decide on:

Welcome to the club - there's also https://git.sr.ht/~remexre/clnix by @remexre

For some weird reason a couple of people started working on a similar thing at roughly the same time - maybe it's the chakra thing.

Myself, I started writing my own overlay somewhere in September, because I wanted to have lisp wrappers that used libraries which weren't yet in nixpkgs. I found adding them harder than rewriting from scratch. If it's not a secret, what was your motivation?

Yeah, ASDF is pretty complex and has some mutability in the design which shows when using it the Nix way. For example, slashy systems only make sense (to me) if the directory containing the .asd is mutable and things can be compiled as needed. But maybe you disagree?

  1. Usually we have several .asd files with different systems. Should those be separate nix packages or not?

Wall of text ahead.

Let's say a project contains foo.asd and bar.asd. One packages foo as a nix package and places it into the immutable /nix/store. Now loading foo works fine. Later on, one might want to package bar as well. One puts both store paths in the source registry. Now ASDF will, depending on which store path appears first in CL_SOURCE_REGISTRY, either try to load foo from ${bar}/foo.asd or from ${foo}/foo.asd, and load bar from either ${bar}/bar.asd or ${foo}/bar.asd. In the ${bar} case, foo fails and bar succeeds, and in the ${foo} case, foo succeeds and bar fails. This is because ${foo} only contains the fasls of foo, and vice versa. Loading the other system then triggers a write attempt to /nix/store to compile the missing fasls.

TLDR: I think they should be separate packages, but with the added invariant that only one, unique .asd file exists per package - which is one solution to the above problem, but maybe there's a better one?
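The failure mode described above can be modelled in a few lines. This is only a toy sketch in Python, not ASDF's real lookup code: the store paths, the `STORE` table and the `locate`/`load` helpers are all hypothetical, and the only behaviour assumed is "the first registry entry that contains the .asd wins".

```python
# Toy model of first-match .asd lookup across CL_SOURCE_REGISTRY entries.
# Store paths and contents are hypothetical; this is not ASDF's real algorithm.

# Each store path ships every .asd of the project but only its own fasls.
STORE = {
    "/nix/store/aaa-foo": {"asds": {"foo.asd", "bar.asd"}, "fasls": {"foo"}},
    "/nix/store/bbb-bar": {"asds": {"foo.asd", "bar.asd"}, "fasls": {"bar"}},
}


def locate(system, registry):
    """Return the first registry entry that contains <system>.asd, or None."""
    for path in registry:
        if f"{system}.asd" in STORE[path]["asds"]:
            return path
    return None


def load(system, registry):
    """Report whether loading would succeed or would need to write to the store."""
    path = locate(system, registry)
    if path is None:
        return "not found"
    if system in STORE[path]["fasls"]:
        return "ok"
    return "needs compile (write to read-only store)"


foo_first = ["/nix/store/aaa-foo", "/nix/store/bbb-bar"]
bar_first = list(reversed(foo_first))

result_foo_first = {s: load(s, foo_first) for s in ("foo", "bar")}
result_bar_first = {s: load(s, bar_first) for s in ("foo", "bar")}
```

With the foo path first, foo loads and bar needs a compile; with the bar path first, it is the other way around. Keeping exactly one uniquely named .asd per store path removes the ambiguity in this model.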

2. Should we run tests, or can they be ignored?

ASDF test-op doesn't signal an error on test failure by itself; that has to be done by the package authors. So I'm not aware of any way to sensibly run mass-tests so that CI fails if they don't pass. Maybe one way would be to patch the popular test libraries, and inject them for CI builds?

3. Systems can have different dependencies on different implementations (noticed in tests, where sbcl has sb-rt and ccl requires rt)

If the dependencies are provided by the CL implementation then I think it's fine - the package manager doesn't have to care. But if a library imports different external dependencies per implementation then that's pretty bad IMHO and the difference should be pulled out to a compatibility library. I'm not sure though if this is what you meant.

Also, I thought that CL in nix should not depend on quicklisp, having its own package set. But earlier today I had a thought that that could have been too much.

I think long-term this is what could happen - for now QL is just an easy way to import 1000s of existing libraries into the Nix system. But probably this will not happen, because on Windows QL is still really the only option and will receive maintenance. I haven't really dug into QL enough to talk about any improvements or issues with it.

For constructing CL_SOURCE_REGISTRY I had a function similar to the one I found in your code. I have just written a setup-hook to replace that here.

This seems similar to the current lispPackages in nixpkgs in that it uses the special /lib/common-lisp directory. I haven't used the setup-hook system enough to have a great understanding though. From what I understand this code will populate the source registry when a script is run. What I did here is also populate the variable, but put it in the derivation itself, and by walking the in-memory dependency tree instead of the files on the file system. I kind of like this approach, primarily because I personally prefer to have more Nix code as opposed to shell code. I don't know, it just seems less 'dirty'.
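To make "walking the in-memory dependency tree" concrete, here is a minimal sketch in Python rather than Nix. It is not the nix-cl implementation: the `PACKAGES` table, the store paths and the `closure`/`source_registry` helpers are hypothetical, and the dependency lists are simplified for illustration. The only assumption about CL_SOURCE_REGISTRY is that it is a colon-separated list of directories.

```python
# Sketch: build a CL_SOURCE_REGISTRY value by walking an in-memory dependency graph.
# Package names, dependency lists and store paths are hypothetical.

PACKAGES = {
    "hunchentoot": {"path": "/nix/store/h-hunchentoot", "deps": ["alexandria", "cl-ppcre"]},
    "cl-ppcre": {"path": "/nix/store/p-cl-ppcre", "deps": []},
    "alexandria": {"path": "/nix/store/a-alexandria", "deps": []},
}


def closure(roots):
    """Return the roots plus all transitive dependencies, depth-first, without repeats."""
    seen, order = set(), []

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        order.append(name)
        for dep in PACKAGES[name]["deps"]:
            visit(dep)

    for root in roots:
        visit(root)
    return order


def source_registry(roots):
    """Join the store paths of the dependency closure with ':'."""
    return ":".join(PACKAGES[name]["path"] for name in closure(roots))


registry = source_registry(["hunchentoot"])
```

The value is computed purely from data already known at evaluation time, which is the difference from a setup hook that discovers paths by inspecting the file system during the build.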

With this mess I wanted to generate package definitions without quicklisp.

From what I see you are also loading the systems in order to generate the nix expressions. This has the advantage that you actually know if something builds, but the downside that it takes really long, the generator script has to be written in CL and can't have any dependencies. This is also what's done currently in nixpkgs, and was one of the biggest pain points for me when trying to add new packages - figuring out the cache directory is a mess, plus, I had to have quicklisp actually installed on my computer.

What I went with is kind of different: I discovered two files that are part of QL: releases.txt and systems.txt. All the generator does is consume that data to generate the nix expressions. It's naive in that it relies on the QL data being accurate, but in practice it works really well so far. The generator is pretty slow right now and takes a couple of seconds, but it could be optimized. I have the idea to also generate an SQLite database of the data to run interesting queries on the QL package distribution.
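As a rough illustration of a generator that only consumes index data, the sketch below parses a systems.txt-style listing. It is written in Python for brevity and is not the nix-cl generator. It assumes a whitespace-separated layout of `project system-file system-name dependencies...` with `#` comment lines; the sample rows, the `parse_systems` helper and the choice to drop `asdf` from the dependency list are all hypothetical.

```python
# Sketch: read a systems.txt-style index without loading any Lisp system.
# The line layout is an assumption; the sample rows are made up.

SAMPLE = """\
# project system-file system-name [dependency1..dependencyN]
foo foo foo asdf alexandria
foo foo foo-test asdf foo fiveam
bar bar bar asdf foo
"""


def parse_systems(text):
    """Map each system name to its project, .asd file and dependency list."""
    systems = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comment
        project, system_file, name, *deps = line.split()
        systems[name] = {
            "project": project,
            "asd": f"{system_file}.asd",
            # Assumption: asdf itself ships with the implementation, so it is dropped.
            "deps": [d for d in deps if d != "asdf"],
        }
    return systems


systems = parse_systems(SAMPLE)
```

From a table like `systems`, emitting one package expression per entry is a matter of string formatting, and no build or network access is needed as long as the index is accurate.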

The generator is written in CL, but could be any other language, probably even Nix itself. And Quicklisp doesn't even need to be installed, which frees me from having impurity in my setup - I run everything using the lispWithPackages. The nix-cl one is "self-hosted" this way, and here's a bigger example using more native libraries.

The generator tool only works inside of a nix-shell, providing build dependencies for the systems being checked. Do we actually have to build everything before resolving lisp dependencies? We will find out later.

The releases.txt file linked above contains dependency information for each system - with the assumption that this information is correct, no building has to be done.

But QL files unfortunately do not provide such dependency information when something depends on a "slashy" system. Mass-building slashy systems using the nix wrapper seems like a hard problem because of that.

teu5us commented 2 years ago

I think I wrote about my motivation in the cl2nix repo. Roughly speaking, I was disappointed by all the options to use CL under NixOS.

I did not want to depend on quicklisp directly, so I decided to use the quicklisp-projects repo instead as a list of project names with links to sources. Some links there are broken though, so we still have to pull from quicklisp. Then use tools that exist in nix to download sources, get hashes and other stuff for describing new packages.

I don't think that I completely understand the issues with the "slashy" systems. Could you elaborate on that?

Not depending on quicklisp made me look for ways to discover dependencies, and as ASDF is the thing that loads everything, it should know how to do what I want, that is what I thought. But of course we need to load the system first. However, load-asd seems to only import metadata without loading and compiling the actual system. Then I get the system object with find-system and work with that.

Could you give an example of a problematic "slashy" system?

I replaced the tree walker (which I also failed to write correctly, as I found out later) because I found propagating lisp packages more logical, since they are propagated after all, but that seems subjective to me. Using the lib/common-lisp directory does not change anything here. The setup hook is a native build input that is called before building a package (and lispWithPackages).

Maybe I will not separate systems into different packages, just search for the needed package by its provided systems. UPD: this is exactly what I have just done.

I think I will try to generate expressions with what I have now and later experiment with what I get. I am still not sure about what dependencies need propagation and how to manage external stuff (LD_LIBRARY_PATH, CPATH, PKG_CONFIG_PATH and friends) correctly.

teu5us commented 2 years ago

Having a look at quicklisp-controller, it seems that :require and :version keys are ignored, but I may be wrong. Also, from what I can tell, quicklisp considers only sbcl contribs important and removes them from the dependency list. This matches what I have seen in tests with sb-rt and rt.

Those make me think that indeed such differences can be ignored.

However, quicklisp-controller evaluates :feature expressions in system definitions, so I wonder what it does with those.

Uthar commented 2 years ago

Currently reading your source code to get a better idea about the process. Hmm it's kind of cool to treat the asd files as a source of truth about system dependencies etc.

I don't think that I completely understand the issues with the "slashy" systems. Could you elaborate on that?

The problem that I'm trying to work around is that slashy systems end up "providing" the same asd file as their parent system. So when using a store-path-per-package approach, there has to be one package combining a bunch of systems. They can't be added to a lisp wrapper ad-hoc unless nothing else in the dependency tree depends on the parent system, because if something does, now there's two same-named asds in the source registry again. Currently that's driving me a little crazy, but I've got some ideas on how to at least partially handle it.

Normally with quicklisp you could just load up the slashy system as you need it, but with the lisp wrapper it has to happen ahead of time. I.e. ~/quicklisp can be modified in place but with Nix a new package has to be created for any changes.

Then I get the system object with find-system and work with that.

I've taken a quick look in the slime inspector - it looks really good, similar to the stuff from systems.txt. Have to dig deeper at some point.

Could you give an example of a problematic "slashy" system?

Here's one easy example: Let's say you want to build a lisp wrapper with alexandria+ and alexandria+/tests systems. But ASDF falls over because there's 2 alexandria+.asd's.

This could be easily fixed by merging the systems: building one package which contains fasls of both systems.

If something depends on alexandria+ though, it has to be replaced with the merged version to prevent the same double asd problem. I'm not sure yet how to do that while keeping the integrity of everything.

One particular problem was that nyxt depends on iolib/os, which wasn't declared in the systems.txt from QL. But now with asdf:find-system, I can see that dependency there. Which makes me very happy! There's a chance to combine the workarounds with merging of systems into their parent package, plus the use of .asd as the source of truth, to get good handling of autoimported slashy systems.

If only asdf:find-system could be used without building or requiring the dependent systems to exist on the file system. Maybe it's possible, I haven't tried.

I still have a lot of thinking to do... this all may not even make sense.

I am still not sure about what dependencies need propagation and how to manage external stuff (LD_LIBRARY_PATH, CPATH, PKG_CONFIG_PATH and friends) correctly.

From my side, for now I just pass LD_LIBRARY_PATH and CLASSPATH down, and haven't had problems - but I'm interested to read how you handle it.

Reply on old quicklisp: I have only used lisp for maybe a year so haven't experienced the old problems.

teu5us commented 2 years ago

If only asdf:find-system could be used without building or requiring the dependent systems to exist on the file system. Maybe it's possible, I haven't tried.

We don't have to build anything. Once the .asd is loaded, ASDF knows how to find-system any system defined in the loaded .asd, and that's enough for ASDF to extract all the needed info. Downloading the source is needed, though, to calculate its sha256 hash. The source is then put in the nix store.

You can try it in a clean environment like so:

  1. Download cl2nix
  2. Run a nix shell with sbcl:
(require :asdf)
(asdf:load-asd "</path/to/cl2nix.asd>")
(require :cl2nix)
(in-package :cl2nix)
(defvar sl (read-source-list-file "</path/to/projects.sexp>")) ;; it is in the source-list directory in the repository
;; find a project by name
(gassoc :name "<project-name>" sl)
;; describe it
(describe-source *) ;; returns a `nix-source-description' object which you can inspect in sly/slime

You will see there is no building involved.

Let's say you want to build a lisp wrapper with alexandria+ and alexandria+/tests systems. But ASDF falls over because there's 2 alexandria+.asd's.

I think I get it now. This is where I got stuck, actually. Maybe we could split the entire source into different packages if we track what files each system depends on (and this is at least partly doable with ASDF as far as I have researched this).

However, I decided to not go the package-per-asd way. Instead, all the .asds are contained in one package and system names are exposed through providedSystems. So to depend on some "slashy" system we can either put its name in the lispInputs attribute, which is used to find the right package through providedSystems and put it in the propagatedBuildInputs, or if we know the right package we can just straight add it to the propagatedBuildInputs ourselves. This seems to me more natural.
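The lookup through providedSystems can be sketched as follows. This is a Python model of the idea, not the cl2nix Nix code: the `PACKAGES` list, the `resolve_lisp_inputs` function and the error message are hypothetical, and the attribute names are borrowed from the description above only as dictionary keys.

```python
# Sketch: resolve requested system names to the packages that provide them.
# The package table is hypothetical; one package may provide several systems.

PACKAGES = [
    {"name": "alexandria", "providedSystems": ["alexandria", "alexandria/tests"]},
    {"name": "hunchentoot", "providedSystems": ["hunchentoot", "hunchentoot-dev"]},
]


def resolve_lisp_inputs(systems, packages=PACKAGES):
    """Return the package names providing the given systems, without repeats."""
    index = {}
    for pkg in packages:
        for system in pkg["providedSystems"]:
            index.setdefault(system, pkg["name"])

    missing = [s for s in systems if s not in index]
    if missing:
        raise LookupError("Packages were not found for systems: " + ", ".join(missing))

    resolved = []
    for system in systems:
        if index[system] not in resolved:
            resolved.append(index[system])
    return resolved


inputs = resolve_lisp_inputs(["alexandria/tests", "hunchentoot-dev", "alexandria"])
```

Because a slashy system and its parent resolve to the same package, asking for both yields a single entry, so the same .asd never ends up in the registry twice under two store paths.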

From my side, for now I just pass LD_LIBRARY_PATH and CLASSPATH down, and haven't had problems - but I'm interested to read how you handle it.

I'm sure that this works for building packages, but is this enough for using them?

I have only used lisp for maybe a year so haven't experienced the old problems.

I'm not even a programmer (professionally, at least) and my experience with lisp is roughly the same by age I think. I just find CL the most convenient to use when I need something done, and it upsets me that I cannot comfortably use it under NixOS.

teu5us commented 2 years ago

There is another approach that might work for slashy systems, which I'll have a look at later.

As you can see here, load-asd consumes system name as an optional argument. My assumption is that if we (asdf:load-asd "/A/alexandria.asd" :name "alexandria") and then likewise load alexandria/tests from another directory, ASDF should indeed look for those systems in different locations.

This could be used to make ASDF aware of required systems instead of using CL_SOURCE_REGISTRY, which can cause duplication as you pointed out.

Nix should also be aware what system name and .asd to pass to the package builder, so the target system and its corresponding .asd need to be exposed in the package definition.

Builder and lispWithPackages could use an old-style asd-farm to bulk load definitions (ASDF follows links to their corresponding file locations) or load definitions through a dolist expression generated by nix.

Systems like hunchentoot need to be checked here because they define several systems in one .asd.

UPD: Checked hunchentoot. After (asdf:load-asd "path/to/hunchentoot.asd" :name "hunchentoot"), find-system can still find "hunchentoot-dev".

UPD2: These return different paths:

(asdf:load-asd "/home/suess/gits/cl-dbi/cl-dbi.asd")
(asdf:load-asd "/home/suess/gits/cl-dbi2/dbi.asd")
(asdf:system-source-directory (asdf:find-system "cl-dbi"))
(asdf:system-source-directory (asdf:find-system "dbi"))

Uthar commented 2 years ago

I'm sure that this works for building packages, but is this enough for using them?

For now I've used a bunch of libraries that use native .so's, such as cl-opengl, cl-liballegro, cl-sqlite, classimp, cffi-libffi. I've been writing a game with that and had no problems. Tried all the CL implementations, and they all seem to find the shared libraries just using LD_LIBRARY_PATH.

Similar for Java libraries for ABCL, but maybe that's a niche.

osicat was a fun one, because it compiles its own shared library at build time, which it then uses at run time. But it works now, too.

Uthar commented 2 years ago

Systems like hunchentoot need to be checked here because they define several systems in one .asd.

Another such case is cl-async-base and cl-async-util defined in cl-async.asd, there's a bunch more, though - mostly "${foo}-test" kind of systems.

Uthar commented 2 years ago

Nix should also be aware what system name and .asd to pass to the package builder, so the target system and its corresponding .asd need to be exposed in the package definition.

I agree, the .asd parameter is also useful to detect duplicates in a lisp wrapper's source registry - and error out before building such a broken derivation
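A duplicate check of this kind could look like the sketch below. It is a Python illustration under stated assumptions, not code from nix-cl or cl2nix: each package is assumed to expose the path of its .asd file, and the `duplicate_asds` helper and the sample store paths are hypothetical.

```python
# Sketch: find .asd files that would appear more than once in a wrapper's registry.
# Package records and store paths are hypothetical.
import os
from collections import defaultdict


def duplicate_asds(packages):
    """Return {asd file name: [package names]} for every name provided more than once."""
    by_file = defaultdict(list)
    for pkg in packages:
        by_file[os.path.basename(pkg["asd"])].append(pkg["name"])
    return {asd: names for asd, names in by_file.items() if len(names) > 1}


wrapper_inputs = [
    {"name": "alexandria+", "asd": "/nix/store/aaa-alexandria+/alexandria+.asd"},
    {"name": "alexandria+/tests", "asd": "/nix/store/bbb-alexandria+-tests/alexandria+.asd"},
    {"name": "cl-ppcre", "asd": "/nix/store/ccc-cl-ppcre/cl-ppcre.asd"},
]

clashes = duplicate_asds(wrapper_inputs)
```

A wrapper builder could refuse to continue when `clashes` is non-empty, which turns a confusing load-time failure into an early, explicit error.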

Uthar commented 2 years ago

This seems to me more natural.

I agree, this seems more natural and more like the way other Nix code does it. I'll have to experiment with that - the Nix code for parsing lisp dependency trees appears somewhat crazy compared to this.

Uthar commented 2 years ago

As you can see here, load-asd consumes system name as an optional argument. My assumption is that if we (asdf:load-asd "/A/alexandria.asd" :name "alexandria") and then likewise load alexandria/tests from another directory, ASDF should indeed look for those systems in different locations.

This could be used to make ASDF aware of required systems instead of using CL_SOURCE_REGISTRY, which can cause duplication as you pointed out.

Nix should also be aware what system name and .asd to pass to the package builder, so the target system and its corresponding .asd need to be exposed in the package definition.

Builder and lispWithPackages could use an old-style asd-farm to bulk load definitions (ASDF follows links to their corresponding file locations) or load definitions through a dolist expression generated by nix.

I think I understand, but please correct me if I'm wrong. With this, some lisp code could be run (before starting the repl) to set up ASDF in the way you described, so that people could use load-system transparently. Hmm this sounds good.

Uthar commented 2 years ago

Builder and lispWithPackages

By lispWithPackages did you mean build-time? Or run-time?

teu5us commented 2 years ago

I think I understand, but please correct me if I'm wrong. With this, some lisp code could be run (before starting the repl) to set up ASDF in the way you described, so that people could use load-system transparently. Hmm this sounds good.

In buildPhase we load all the dependencies' asds, then the target asd and call load-system for our target

For lispWithPackages I use makeWrapper, where I set variables and execute lisp code prior to interactive use

teu5us commented 2 years ago

By lispWithPackages did you mean build-time? Or run-time?

I mean runtime

teu5us commented 2 years ago

I do like the search-by-system-name approach, though. We can just ignore the inner package dependencies, because all asds are in the same path and will be in the source registry anyway when they are needed. And no need to separate system files or produce a ton of duplicate files in the store

Uthar commented 2 years ago

I cloned the repo in two places: https://github.com/Symbolics/alexandria-plus

Doing:

* (asdf:load-asd (truename "~/1/alexandria-plus/alexandria+.asd") :name "alexandria+")
* (asdf:load-asd (truename "~/2/alexandria-plus/alexandria+.asd") :name "alexandria+/tests")
* (asdf:system-source-directory :alexandria+)
#P"/home/kpg/1/alexandria-plus/"
* (asdf:system-source-directory :alexandria+/tests)
#P"/home/kpg/1/alexandria-plus/"

Or, if I load the second one first ( in a fresh repl ):

* (asdf:load-asd (truename "~/2/alexandria-plus/alexandria+.asd") :name "alexandria+/tests")
* (asdf:load-asd (truename "~/1/alexandria-plus/alexandria+.asd") :name "alexandria+")
* (asdf:system-source-directory :alexandria+/tests)
#P"/home/kpg/2/alexandria-plus/"
* (asdf:system-source-directory :alexandria+)
#P"/home/kpg/2/alexandria-plus/"

So seems to work for different asd's in one directory, but not for different systems in one asd.

teu5us commented 2 years ago

So seems to work for different asd's in one directory, but not for different systems in one asd.

As I said about hunchentoot, it loads everything despite the :name argument. And it seems to keep the first definition it found.

Uthar commented 2 years ago

You're right, sorry. Somehow I didn't connect that in my brain

teu5us commented 2 years ago

Nix code for parsing lisp dependency trees appears somewhat crazy compared to this

I tried collecting the asds and it seems to require similar crazy nix code.

Have a look at this, this, this and this.

Packages are resolved by system names and then collected through a setup hook.

teu5us commented 2 years ago

Have you thought of using the pkgs.setJavaClassPath hook?

Uthar commented 2 years ago

Sorry for disappearing, got some stuff IRL

Thanks for the instructions: I ran the code and it works really well - I just had to install nix-prefetch-git for some of the sources.

So for now, this is what I understand about the import mechanism of cl2nix:

  1. uses source code straight from the origin
  2. uses asdf:load-asd and friends to get information about packages, e.g. dependencies

Which is great, because it is independent of Quicklisp. Downloading lots of sources can be slow, but that would only need to be done once and saved in a database or a file - so not really a problem.

And about the setup hooks:

  1. Creates a scope where all the packages exist
  2. Each package passes names of required systems, which are later taken from this scope
  3. CL_SOURCE_REGISTRY is set in a transparent way using the hook script. This uses mkDerivation's propagatedBuildInputs set in step 2. to discover them

Which I haven't played around with yet, but I'm curious if it would somehow make it possible to handle circular dependencies gracefully. Other than that seems like a different implementation of the same idea

When it comes to what I'm up to:

Another problem is the missing dependency info from systems.txt, which right now makes it necessary to manually add the missing slashy system dependencies. This is where I would steal some of your code.

I'll also want to take a look at the other clnix code, when I have more time

So to briefly summarize what's going on:

  1. Missing dependency info in systems.txt
  2. Duplicate asd's with slashy systems
  3. Circular dependencies with slashy systems

Possible solutions:

  1. Switch from systems.txt to parsing asd's from source code
  2. Merge slashy subsystems into parent
  3. Maybe switch to propagatedBuildInputs?

teu5us commented 2 years ago

That works well, unless merging the systems' dependencies introduces a circular dependency. Example: merging s-sql with s-sql/tests, which depends on postmodern, yet postmodern depends on s-sql.

This could be solved by treating projects as packages, not systems. Thus we could just merge all asd dependencies and exclude internal ones. I will experiment with that when I have time
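The difference between merging slashy systems into their parent and merging whole projects can be checked on a small graph. The sketch below is a Python model only: the dependency lists are simplified for illustration (the fiveam edge in particular is an assumption), and the `merge` and `has_cycle` helpers and the `PROJECT` mapping are hypothetical.

```python
# Sketch: collapse system-level dependencies into package-level dependencies
# and check the result for cycles. Dependency data is simplified and hypothetical.

SYSTEMS = {
    "s-sql": [],
    "s-sql/tests": ["s-sql", "postmodern", "fiveam"],
    "postmodern": ["s-sql"],
    "fiveam": [],
}


def merge(systems, package_of):
    """Group systems into packages, keeping only dependencies on other packages."""
    packages = {}
    for system, deps in systems.items():
        pkg = package_of(system)
        external = {package_of(d) for d in deps} - {pkg}  # drop internal edges
        packages.setdefault(pkg, set()).update(external)
    return packages


def has_cycle(graph):
    """Depth-first search that reports whether any dependency cycle exists."""
    state = {}

    def visit(node):
        if state.get(node) == "active":
            return True
        if state.get(node) == "done":
            return False
        state[node] = "active"
        if any(visit(dep) for dep in graph.get(node, ())):
            return True
        state[node] = "done"
        return False

    return any(visit(node) for node in graph)


# Option A: merge each slashy system into its parent system.
by_parent = merge(SYSTEMS, lambda s: s.split("/")[0])

# Option B: treat the whole project as one package (mapping is an assumption).
PROJECT = {"s-sql": "postmodern", "s-sql/tests": "postmodern",
           "postmodern": "postmodern", "fiveam": "fiveam"}
by_project = merge(SYSTEMS, PROJECT.get)
```

In this model, option A produces the s-sql and postmodern cycle described above, while option B leaves a single postmodern package whose only remaining dependency is external.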

teu5us commented 2 years ago

that would only need to be done once and saved in a database or a file

That's what I want to do but can't find time to implement

Uthar commented 2 years ago

This could be solved by treating projects as packages, not systems. Thus we could just merge all asd dependencies and exclude internal ones.

This does sound really good. So one package can provide multiple asd's. Then, if one uniquely named asd exists in any source registry, it still works. I'll experiment myself.

can't find time to implement

haha, me too

Github could use a chat feature for this kind of communication. Or maybe programmers hang out somewhere else I'm not aware of.

teu5us commented 2 years ago

This does sound really good. So one package can provide multiple asd's. Then, if one uniquely named asd exists in any source registry, it still works. I'll experiment myself.

See here. I can change this to display a message about systems, for which packages weren't found.

Github could use a chat feature for this kind of communication. Or maybe programmers hang out somewhere else I'm not aware of.

You could dm me in Matrix or we could even start a channel to involve some more people

teu5us commented 2 years ago

Updated the resolver to report systems, for which packages were not found.

nix-repl> lp.packages.sbcl.resolveLispInputs [ "alexandria+" ]
error: Packages were not found for systems:
         "alexandria+"

nix-repl> lp.packages.sbcl.resolveLispInputs [ "alexandria+" "hunchentoot" ]
error: Packages were not found for systems:
         "alexandria+", "hunchentoot"

nix-repl> lp.packages.sbcl.resolveLispInputs [ "alexandria" ]
[ «derivation /nix/store/26dryffyy6sz071bjjc7rzqiqybn6fww-alexandria-1.0.1_sbcl-2.1.9.drv» ]

teu5us commented 2 years ago

I could not understand from your code if java libs are optional dependencies. They should be included only when running abcl, I think

Uthar commented 2 years ago

I could not understand from your code if java libs are optional dependencies. They should be included only when running abcl, I think

Yes, right now they're included for all implementations. The API will stay the same, because initially javaLibs is empty, but you're right that the internals will have to change

This made me remember one interesting topic: C++ libraries for Clasp, which would have to be provided as source code and compiled into the clasp binary: https://clasp-developers.github.io/clbind-doc.html