Open CFGrote opened 5 months ago
We thank you for this helpful advice. We will rework the installation method to match the best practices. We will create a conda channel that allows to install the dependencies that were out of channels conda-forge
and bioconda
and that were managed by the setup.sh
script until now. This way IntronSeeker will be easily installed by consuming the environment.yml
file.
We will check if using an up to date python version don't break IntronSeeker and update it if possible.
@CFGrote We'll get back to you as soon as it's done.
any progress on this one? Just to make sure I did not miss any notification or updates. thanks!
any progress on this one? Just to make sure I did not miss any notification or updates. thanks!
Thank you for your message, we're still working on it.
INSTALL.md says:
all these dependencies and their versions are detailed in the file requirements.txt
however, there is no file in the repo by this name.
I followed the new installation instructions but the sthe command
▶ mamba env create -f config/environment.yml
Retrieving notices: ...working... done
intronseeker/linux-64 979.0 B @ 2.0kB/s 0.5s
intronseeker/noarch 643.0 B @ 874.0 B/s 0.3s
bioconda/linux-64 5.6MB @ 4.5MB/s 1.5s
pkgs/r/linux-64 1.9MB @ 1.2MB/s 0.8s
bioconda/noarch 5.2MB @ 3.2MB/s 1.9s
pkgs/main/noarch 875.0kB @ 387.1kB/s 0.6s
pkgs/r/noarch 2.3MB @ 946.6kB/s 1.3s
conda-forge/noarch 18.9MB @ 5.2MB/s 4.5s
pkgs/main/linux-64 7.3MB @ 1.6MB/s 3.4s
conda-forge/linux-64 44.2MB @ 4.2MB/s 13.1s
Looking for: ['intronseeker::assemblathon_stats', "bcftools[version='>=1.9']", "biopython[version='>=1.74']", 'diamond==0.9.24', "emboss[version='>=6.6.0']", 'gffread==0.11.4', "gtfparse[version='>=1.2.0']", 'intronseeker::grinder==0.5.4', 'hisat2==2.1.0', "numpy[version='>=1.16.4']", "pandas[version='>=0.25.0']", "perl-test-more[version='>=1.001002']", "perl-test-warn[version='>=0.36']", "perl-bioperl-core[version='>=1.007002']", "perl-list-moreutils[version='>=0.428']", "perl-math-random[version='>=0.72']", "plotly[version='>=4.0.0']", "pysam[version='>=0.15.3']", "python[version='>=3.6.9']", 'samtools==1.9', 'star==2.7.1a', "scipy[version='>=1.4.1']", 'transdecoder==5.5.0']
never returns. I aborted after >30 minutes. Is this the normal behaviour?
Tried again in a two-step approach: 1) create the environment
conda create -n ISeeker_environment 'python>=3.6' -c bioconda -c conda-forge
conda activate ISeeker_environment
2) Update with
conda env update -n ISeeker_environment -f config/enviroment.yml
and that runs through in reasonable time.
Thank you for pointing out the long installation time.
We find out that older conda versions are slow when installing intronSeeker envionment. An updated version of conda does not have this issue.
For example, conda==24.7.1 with conda-libmamba-solver==24.7.0 resolves dependencies in less than 2 minutes, when conda==23.3.1 with conda-libmamba-solver==23.3.0 resolves the same depencies in more than 40 minutes.
We will recommend to update conda base environment to lastest version in the documentation. In addition, we will create container image that can be used with apptainer/singularity and docker.
IntronSeeker uses old versions of some softwares. We decided to stick to those versions for the sake of reproducibility of this publication.
In details, resolving dependencies is slow because it must be used with channel_priority=flexible
option. If we use the channel_priority=strict
option recommended by bioconda setup, the resolution of dependencies fails: at least one dependencies, satisfying the environment constrains, only exist in bioconda, but is ignored by solver because the same dependency exists in conda-forge, but without satisfying the constrains (See conda documentation for details)
Some dependencies used by intronSeeker were published before the option channel_priority=strict
was introduced into conda, when bioconda duplicated many libraries from conda-forge. Some were deindexed from conda-forge since but not from bioconda.
We use this command to measure how much time conda used to resolve intronSeeker dependencies
time conda env create --dry-run --solver libmamba -f config/environment.yml
JOSS Review Issue
During reading, installation and testing, I came across the following items that would be nice to see fixed but not at top priority.
Installation:
The authors did a great job in wrapping most of the common installation steps in the setup.sh script. However, I'd recommend following community best-practises in providing an environment.yml file in the project root and instructions how to consume it by
conda install
, respectively. Leave the details of how to create an environment, using mamba or conda, override channels, etc. to the user. E.g. runningfailled for me, likely because my conda setup is somewhat non-standard. This may be true for other users too.
Python version:
python3.6 has reached end-of-life, I'd suggest bumping the minimum version to 3.8 (see https://devguide.python.org/versions/)