Closed bzizou closed 6 months ago
Hi Bruno @bzizou,
thanks for putting together a NIX package for grenedalf! I've vaguely heard about NIX, but not more than that. I think that could be nice to have :-)
Is there a particular reason you want to circumvent the automatic dependencies in grenedalf? I am developing grenedalf and genesis in tandem, and hence have a tight binding of exact commit hashes. If you now take that apart, I can easily see the potential for mismatches of incompatible versions of genesis and grenedalf. I would avoid that, and just have grenedalf be build fully via its CMake setup.
As a matter of fact, the compilation errors you are seeing might come from mismatching versions, although I am not sure how. As far as I could tell on a quick glance, you seem to specify fitting versions of both. But the errors are exactly in a set of classes that I recently renamed in genesis (I am in the middle of a larger refactoring at the moment) - however, that's so far only on the genesis dev
branch, and neither on the master
branch, nor used in grenedalf yet. In particular, SamVariantInputIterator
is now SamVariantInputStream
, VariantInputIterator
is VariantInputStream
, etc. Some of the errors about function names are weird though, as some of them did not change. But might still be related. However, the weird thing is that the header files of those renamed classes also are named differently now. So I would have expected the compilation to fail at that stage already, and reporting that the old header files cannot be found. In some very weird way, it could be that your setup is using the correct (old) version of genesis that fits with the not-yet-adapted-to-the-new-class-names version of grenedalf, but somehow still has a mismatch with something there? No idea how that would happen, but it otherwise seems odd that the error is exactly in the classes that I recently renamed.
If it's not that, I don't have a good idea right now, and you'd need some more digging.
Also, I noticed a typo here, it should be genesis
. I'd recommend to ctrl+c and ctrl+v URLs instead of typing manually :-)
Cheers and so long Lucas
Oh, another though: if you get it to work, I assume that due to the non-automatic dependencies between grenedalf and genesis, someone would to manually keep maintaining the NIX package? That is, a new version of grenedalf can't be automatically also turned into a NIX package then, as the genesis commit would need to be set manually? Again, for that reason, I'd strongly advise against breaking the dependency between the two.
Also, if you do get it to work, if you want, let me know, and I can add a badge to the package to the repository :-)
Cheers Lucas
Hi Bruno @bzizou,
thanks for putting together a NIX package for grenedalf! I've vaguely heard about NIX, but not more than that. I think that could be nice to have :-)
Is there a particular reason you want to circumvent the automatic dependencies in grenedalf?
Yes. This is a base concept of NIX: reproducibility. And for that, it places the build processes into a "pure" environment that prevents a package building from connecting to the network, or from using a library of the current running system. That means also that every "external" dependency should be a nix package. But Nix is also built to allow tight dependencies as yours, and it is totaly possible to fix a github commit hash. That's what I did, and I used the tags you have put into your Cmake files.
I am developing grenedalf and genesis in tandem, and hence have a tight binding of exact commit hashes. If you now take that apart, I can easily see the potential for mismatches of incompatible versions of genesis and grenedalf. I would avoid that, and just have grenedalf be build fully via its CMake setup.
I'll double check the tags I used...
As a matter of fact, the compilation errors you are seeing might come from mismatching versions, although I am not sure how. As far as I could tell on a quick glance, you seem to specify fitting versions of both. But the errors are exactly in a set of classes that I recently renamed in genesis (I am in the middle of a larger refactoring at the moment) - however, that's so far only on the genesis
dev
branch, and neither on themaster
branch, nor used in grenedalf yet. In particular,SamVariantInputIterator
is nowSamVariantInputStream
,VariantInputIterator
isVariantInputStream
, etc. Some of the errors about function names are weird though, as some of them did not change. But might still be related. However, the weird thing is that the header files of those renamed classes also are named differently now. So I would have expected the compilation to fail at that stage already, and reporting that the old header files cannot be found. In some very weird way, it could be that your setup is using the correct (old) version of genesis that fits with the not-yet-adapted-to-the-new-class-names version of grenedalf, but somehow still has a mismatch with something there? No idea how that would happen, but it otherwise seems odd that the error is exactly in the classes that I recently renamed.
OK, so I really have to check that the dependencies are using the good commit hashes...
If it's not that, I don't have a good idea right now, and you'd need some more digging.
Also, I noticed a typo here, it should be
genesis
. I'd recommend to ctrl+c and ctrl+v URLs instead of typing manually :-)
Oh, thank you! It's because I copied/paste the grenedalf package file I prepared as a template, and just wanted to manually change "grenedalf" to "genesis", but kept an "r" ;-) My fault, I'll fix that...
Cheers and so long Lucas
Oh, another though: if you get it to work, I assume that due to the non-automatic dependencies between grenedalf and genesis, someone would to manually keep maintaining the NIX package? That is, a new version of grenedalf can't be automatically also turned into a NIX package then, as the genesis commit would need to be set manually? Again, for that reason, I'd strongly advise against breaking the dependency between the two.
You're right, but we can create an override in the grenedalf package (as I did with CLI11 here ) for it to use an explicit commit hash of the genesis package. That way, Grenedalf is binded to a specific version of Genesis. And another user/software can also use the latest version of Genesis without conflicts! That's the magic of Nix!
Also, if you do get it to work, if you want, let me know, and I can add a badge to the package to the repository :-)
Sure!
Cheers Lucas
That's what I did, and I used the tags you have put into your Cmake files.
So, someone will have to manually edit the NIX package files every time a new version of grenedalf comes out, so that the matching genesis version is put in there as well? Or is it possible to automate that? If so, I think that might be better :-)
OK, so I really have to check that the dependencies are using the good commit hashes...
Looking at your grenedalf package file again, it seems that NIX downloads grenedalf from GitHub, right? So I think that might be the issue then: The commit hash of genesis is set in the CMake file, but also is set as a git submodule, see here. I think that somehow your setup might pull genesis based on the submodule, and not on the hash in the CMake file! I'm still unsure how that would exactly cause the errors you are seeing - but it could be that.
The reason that I've set it up that way is as follows: Using git submodules just is the right tool for the job of using another git repo as a dependency. And this works well if you clone the repository with the --recursive
flag - the submodules get cloned as well. However, when downloading source from GitHub via the green "Code" button, or via the releases page, or cloning without the flag, then the submodules do not get pulled, and stay empty. So I'm using this CMake script to check for the presence of the dependencies, and download it "manually" (in the CMake configure phase). This allows users to simply use the green "Code" button on GitHub. If you have an idea for how to achieve that in a better way, let me know :-)
So, the way that this affects you is that the commit hash in the CMake file and the commit hash that the submodule is set to should be the same - otherwise, one of them won't work. I'm doing that manually (semi-manually actually, using this script), by updating the git submodule first to the commit that I want, and then setting the CMake file to that as well. It is awfully hacky, but I've not bothered to find a more elegant solution to that yet.
Anyway, I'd check what the git submodule is set to. It's weird, because the submodule hash of genesis is part of a grenedalf commit as well, so it should match, but maybe there is something off there?
Cheers Lucas
NIX has a "fetchSubmodules = true;" option for the fetching of the sources. I tried it with no more success... I checked versions used.. I even did a diff -r
on the sources for comparing a local build (which works...) and the package build. Genesis and Grenedalf sources were exactely the same. That's really weird ;-)
Genesis depends on the htslib 1.16 that is built as a dependency. As htslib is already packaged into Nix, I'm using it (I managed to change the version to 1.16, which was not made in the current commits of my packages), but still no success. Compilation flags differ a bit between the packaged htslib and yours. Do you think htslib could be implied into this?
Oh I think I got it now. Yes, it's related to htslib, but in a different way than you suggested. You have patched the IncludeHtslib.cmake
file, right? So, there is a compiler definition GENESIS_HTSLIB
being set there, see here. This is used in the genesis sources for a preprocessor check to see if any of my code that depends on htslib shall be included or not. This is because my older tool gappa does not need any htslib-related functions, and I did not want to have to build htslib for within gappa if not needed, and so all htslib stuff in genesis (sam/bam and vcf, mainly) can just be completely deactivated. All your errors above are in those parts.
So, I'd check if the GENESIS_HTSLIB
preprocessor definition is set in your setup. I understand why you want to set things up the way you do, but again, patching and changing things that way can easily lead to weird errors like this one, and can easily break in the future (or at least require re-patching) should I need to change any of the CMake and other setup things in genesis or grenedalf... I'll leave that up to you, but if there is a way to just do a vanilla build with no patches, that would probably be easier to maintain :-)
As for the differing compilation flags: That could cause some trouble when linking - I've had that before - but you'll need to test that. Maybe it works, if no contradicting flags are being set. I had issues before when for instance -fpic
was not set, or some pthreads related flags were different. Remains to be seen though :-)
OK, I see. As you said before, git submodules might be the right way to do this, and so I can create a unique package instead of trying to package Genesis independently and binding the Grenedalf package on it. But, correct me if I'm wrong, you're using git submodules in Grenedalf for the CLI11 and Genesis dependencies, which seems to be ok with Nix and the fetchSubmodules = true
option. But for Genesis and the Htslib dep, you're not, and you do a "manual" download of the htslib, right? Is there any reason for this? You might also use the submodules method, no?
Well, just to say that relying on fetchSubmodules = true
without independent Genesis package, the reported error disapeared and compilation goes further before failing with another error. I'm currently investigating, but we are not away from a solution, I think ;-)
OK, I see. As you said before, git submodules might be the right way to do this, and so I can create a unique package instead of trying to package Genesis independently and binding the Grenedalf package on it.
Well, just to say that relying on
fetchSubmodules = true
without independent Genesis package, the reported error disapeared and compilation goes further before failing with another error.
Nice, okay, that's progress! What is the new error? Maybe I can help with that?
But for Genesis and the Htslib dep, you're not, and you do a "manual" download of the htslib, right? Is there any reason for this? You might also use the submodules method, no?
Hm, don't quite remember why I decided to do it that way. As far as I recall: Given that submodules are only downloaded when using git clone --recursive
, we'd need a way to download htslib anyway if not present, and so decided to just do it that way, instead of the more complex setup with submodule + extra download in case submodule is not present. The reasoning was that I'd probably rarely need to update htslib, so the benefits of a submodule that can just be pulled were not as important here.
Hope that this is not making it too much more complicated for you now. All this packaging is something that I had honestly not super much planned for, as I was working mostly on my own on this... So, firstly, very happy to get your support to offer my tools on NIX! And secondly, if you think that a different setup would make things simpler or more straight forward, we can figure something out. Not sure how quickly I'll be able to implement that, but we can at least start thinking about it!
Cheers Lucas
Here we are!!
[...]
[ 96%] Building CXX object CMakeFiles/grenedalf.dir/src/tools/references.cpp.o
[100%] Linking CXX executable /build/source/bin/grenedalf
[100%] Built target grenedalf
buildPhase completed in 1 minutes 10 seconds
Running phase: installPhase
Running phase: fixupPhase
shrinking RPATHs of ELF executables and libraries in /nix/store/8cqvy8x7rxr5b0ki6llxp1ifhcs3sfg1-grenedalf-0.3.0
shrinking /nix/store/8cqvy8x7rxr5b0ki6llxp1ifhcs3sfg1-grenedalf-0.3.0/bin/grenedalf
checking for references to /build/ in /nix/store/8cqvy8x7rxr5b0ki6llxp1ifhcs3sfg1-grenedalf-0.3.0...
patching script interpreter paths in /nix/store/8cqvy8x7rxr5b0ki6llxp1ifhcs3sfg1-grenedalf-0.3.0
stripping (with command strip and flags -S -p) in /nix/store/8cqvy8x7rxr5b0ki6llxp1ifhcs3sfg1-grenedalf-0.3.0/bin
/nix/store/8cqvy8x7rxr5b0ki6llxp1ifhcs3sfg1-grenedalf-0.3.0
[bzizou@bart:~/git/nixpkgs]$ ./result/bin/grenedalf --help
,--;
\ \ ----.
,-----. ,----.;---. ,--. ,----. \ \ ---. | | ;-.----.
/ ---./ ;-,----. / .---'| , `.| | / .---' ,--' \ \ \ | | | .---'
| ( .---.| ´` .'| `-- | |` ` || `-- / .--. )/ .-. \ | | | `--,
\ --' || |\ \ \ `---.| | \ | \ `---.\ `--' // `-' \| '---.| |`
`----´ '`--' '--' `----'`--' `--: `----' `----' `------'.`------'| |
| |
========================================================////- ,-'
v0.3.0 (c) 2020-2023 by Lucas Czech
Usage: grenedalf [OPTIONS] SUBCOMMAND
Options:
--help Print this help message and exit.
--version Print the grenedalf version and exit.
Subcommands:
diversity Compute pool-sequencing corrected diversity measures Theta Pi, Theta Watterson, and Tajima's D.
frequency Create a table with per-sample and/or total base counts and/or frequencies at positions in the genome.
fst Compute pool-sequencing corrected measures of FST.
simulate Create a file with simulated random frequency data.
sync Create a PoPoolation2 sync file that lists per-sample base counts at each position in the genome.
citation Print references to be cited when using grenedalf.
license Show the license of grenedalf.
version Extended version information about grenedalf.
grenedalf: population genetic statistics for the next generation of pool sequencing
So, the resulting source code of the package: https://github.com/bzizou/nixpkgs/blob/grenedalf/pkgs/by-name/gr/grenedalf/package.nix
I still have a small patch to disable the local build of htslib: https://github.com/bzizou/nixpkgs/blob/grenedalf/pkgs/by-name/gr/grenedalf/genesis.patch
A very simple way to avoid it should be a simple Cmake flag, for example DISABLE_LOCAL_HTSLIB_BUILD to disable the parts I commented out. You could let it OFF by default, and I could set it to ON to use the htslib provided by Nix.
Anyway, thank you very much for your support as I can provide now an efficient way to my users for using Grenedalf in a reproducible way, as a pinned Nix package is absolutely unmutable!
Cheers!
Bruno
Awesome, happy to hear!
As for using a pre-installed htslib: Yeah, that would be nice to have, but would involve more than just deactivating the ExternalProject part in CMake. In addition, we'd need a way for CMake to find the htslib sources and binaries, so that they can be properly included and linked against... A quick search gave me this CMake script, which might do the trick, but that would need some more testing.
Also, if I may ask: Would you be interested in packing gappa for NIX as well? I think it should be easier, as it does not need htslib, but otherwise be pretty much the same that you did already here. You could open a PR in gappa adding a badge and an entry in the Setup, so that your name shows up as a contributor ;-) - Just asking, as this seems it might be easy for you now, but of course I understand if that's not what you wanna spend your time on.
Awesome, happy to hear!
As for using a pre-installed htslib: Yeah, that would be nice to have, but would involve more than just deactivating the ExternalProject part in CMake. In addition, we'd need a way for CMake to find the htslib sources and binaries, so that they can be properly included and linked against... A quick search gave me this CMake script, which might do the trick, but that would need some more testing.
Actualy, your script is working fine, but yes, the path is not enough, I forgot to tell you about those 2 little replacements for htslib to be found in the right place. I'll do myself the changes into the Makefiles and suggest you the changes as a pull request.
Also, if I may ask: Would you be interested in packing gappa for NIX as well? I think it should be easier, as it does not need htslib, but otherwise be pretty much the same that you did already here. You could open a PR in gappa adding a badge and an entry in the Setup, so that your name shows up as a contributor ;-) - Just asking, as this seems it might be easy for you now, but of course I understand if that's not what you wanna spend your time on.
Sure! No problem, I'll be happy to do that!
Actualy, your script is working fine, but yes, the path is not enough, I forgot to tell you about those 2 little replacements for htslib to be found in the right place.
Ah I see, you are hard-replacing the path. Sure, hacky, but should work :-)
I'll do myself the changes into the Makefiles and suggest you the changes as a pull request.
Sounds good, thank you! Looking forward to improving the setup!
Sure! No problem, I'll be happy to do that!
That is great, thank you! Let me know if you have any trouble or I can help in any way!
Cheers and so long Lucas
About Gappa: seems indeed simple, but as nothing is always as simple as we think, there's a little problem with the name of the Nix package, as it is already assigned to another project: https://gappa.gforge.inria.fr
Can you suggest a name for the package, for example gappa-gen
?
Haha oh no, we are too late! I think bio-gappa
might work well, and be distinguishable and clear enough. Sounds good?
Perfect for me!
[bzizou@bart:~/git/nixpkgs]$ nix-build -A bio-gappa
/nix/store/ljk2g6ijqkl0w7m8wcmhby8nz15m5g0c-gappa-gen-0.8.4
[bzizou@bart:~/git/nixpkgs]$ ./result/bin/gappa --help
.... ....
'' '||. .||'
|| ||
'|.|'
...' .... ... ... ... ... .... .|'|.
| || '' .|| ||' || ||' || '' .|| .|' ||
|'' .|' || || | || | .|' || .|'|. ||
'.... '|..'|'. ||...' ||...' '|..'|. '||' ||:.
'....' || ||
'''' '''' v0.8.4 (c) 2017-2023
by Lucas Czech and Pierre Barbera
Usage: gappa [OPTIONS] SUBCOMMAND
Options:
--help FLAG Print this help message and exit.
--version FLAG Print the gappa version and exit.
Subcommands:
analyze Commands for analyzing and comparing placement data, that is, finding differences and patterns.
edit Commands for editing and manipulating files like jplace, fasta or newick.
examine Commands for examining, visualizing, and tabulating information in placement data.
prepare Commands for preparing and preprocessing of phylogenetic and placement data.
simulate Commands for random generation of phylogenetic and placement data.
tools Auxiliary commands of gappa.
gappa - a toolkit for analyzing and visualizing phylogenetic (placement) data
https://github.com/bzizou/nixpkgs/blob/gappa/pkgs/by-name/bi/bio-gappa/package.nix
Going to submit this as a PR upstream right now ;-)
Hello. Could you add a :+1: to https://github.com/NixOS/nixpkgs/pull/295944 for speed up the review process as they still not have reviewed my PR?
Haha, thanks, done! They currently have 5k open PRs :astonished: Let's hope someone reviews this soon!
Hey @bzizou,
is there anything else at the moment that we can or should do here, or shall we close this issue for now?
Cheers Lucas
Hello. Well, the package bio-gappa into Nixos has not yet been merged because a test seems to be stuck. I just added a comment to reactivate the reviewers. Anyway, regarding the current issue, yes, I think this can be closed and I'll open another if needed when I'll have time to go further into Nix integration (anyway we have an unofficial working package of grenedalf for now)
Sounds good, thanks for the quick reply! Yes, feel free to open another issue if there are any further problems down the line!
Hello,
I'm trying to create a NIX package of your tool (which is used in my computing center) but I'm facing an issue during compilation, and I can't understand why. I don't know if you have some NIX knowledge, but here are the sources of this wip packages (genesis and grenedalf):
https://github.com/bzizou/nixpkgs/commit/9a30cd2051967527ca7b77c341657f1621f0201a https://github.com/bzizou/nixpkgs/commit/1d366e582010857105f79943968b2fddd4509e0c
Note that NIX is providing a way to create dependencies with specific versions, like the commit hash of CLI11 for example. This replaces the "manual" build you made into
tools/cmake/IncludeHtslib.cmake
, that's why it has been disabled by a patch.The compiler used by default here is gcc 13.2, but I tried with Clang, gcc11, gcc8 and I always have the same error.
Here is the full trace of the error:
If you have any idea...