Open ndreey opened 7 months ago
Let's begin by removing the .conda/envs directory, the variable and the environment.
conda env remove --prefix /crex/proj/snic2020-6-222/Projects/Tconura/working/Andre/.conda/envs
I then create a bin directory in my working directory in the project.
I follow the steps: https://hackmd.io/@pmitev/conda_on_Rackham
I choose to install miniforge3 here:
/crex/proj/snic2020-6-222/Projects/Tconura/working/Andre/miniforge3
After logging out, logging in, and setting conda to initialize = false, I get these results!
andbou@rackham2: Andre: mamba activate
(base) andbou@rackham2: Andre: mamba doctor
Currently, only install, create, list, search, run, info, clean, remove, update, repoquery, activate and deactivate are supported through mamba.
(base) andbou@rackham2: Andre: mamba -V
mamba 1.5.7
conda 24.1.2
(base) andbou@rackham2: Andre: which mamba
/crex/proj/snic2020-6-222/Projects/Tconura/working/Andre/miniforge3/bin/mamba
(base) andbou@rackham2: Andre: which conda
/crex/proj/snic2020-6-222/Projects/Tconura/working/Andre/miniforge3/bin/conda
EUREKA, we have mamba installed!
I once again start an interactive session, activate mamba and then run:
mamba env create --file CONURA_WGS/doc/anvio-8.yaml
Note: I am in my working directory and not the analysis directory CONURA_WGS.
The command took about 8-10 min.
(base) andbou@r412: Andre: mamba env list
# conda environments:
#
base * /crex/proj/snic2020-6-222/Projects/Tconura/working/Andre/miniforge3
anvio-8 /home/andbou/.conda/envs/anvio-8
Then we download anvio-8.tar.gz and activate our new environment.
cd bin/
curl -L https://github.com/merenlab/anvio/releases/download/v8/anvio-8.tar.gz \
--output anvio-8.tar.gz
mamba activate anvio-8
cd ..
pip install bin/anvio-8.tar.gz
Now let's set up the databases! I started an interactive session with 4 cores for this.
# Generates the specific db folders
mkdir -p databases/{scg,ncbi-cogs}
# Setting up NCBI COG
anvi-setup-ncbi-cogs --cog-data-dir databases/ncbi-cogs/ -T 4 --cog-version COG20 --reset
# Setting up SCG taxonomy database (removing old files)
anvi-setup-scg-taxonomy -T 4 --scgs-taxonomy-data-dir databases/scg/ --reset
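A quick way to confirm that both setup commands actually wrote something is to check that each target directory exists and is not empty. This is only a minimal sketch: the helper name `db_dir_populated` is made up for illustration, and a non-empty directory does not prove the database is complete.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when the directory exists and has at least one entry.
db_dir_populated() {
    local dir="$1"
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Check the two directories created with mkdir -p above.
for db in databases/scg databases/ncbi-cogs; do
    if db_dir_populated "$db"; then
        echo "OK: $db"
    else
        echo "EMPTY or missing: $db"
    fi
done
```

Run it from the same directory the `mkdir -p databases/{scg,ncbi-cogs}` command was run from.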
As KEGG is a much larger database, I used this script: get-kegg.sh
#!/bin/bash
#SBATCH --job-name anvio-contigdb
#SBATCH -A naiss2024-22-580
#SBATCH -p core -n 6
#SBATCH -t 06:30:00
#SBATCH --output=slurm-logs/anvio/SLURM-%j-setup-kegg.out
#SBATCH --error=slurm-logs/anvio/SLURM-%j-setup-kegg.err
#SBATCH --mail-user=andbou95@gmail.com
#SBATCH --mail-type=ALL
# Start time and date
echo "$(date) [Start]"
# Activate the environment
mamba activate anvio-8
anvi-setup-kegg-data \
--mode all \
--kegg-data-dir ../databases/kegg \
-T 6 \
--reset
# End time and date
echo "$(date) [End]"
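One thing to watch with the script above: SLURM does not create missing directories for the `--output` and `--error` paths, so `slurm-logs/anvio` has to exist before the job is submitted. A minimal sketch, assuming the script is saved as get-kegg.sh in the directory you submit from:

```shell
#!/bin/bash
# Create the log directory referenced by the #SBATCH --output/--error lines.
log_dir="slurm-logs/anvio"
mkdir -p "$log_dir"

# Submit only when sbatch and the script are both available (i.e. on the cluster).
if command -v sbatch >/dev/null 2>&1 && [ -f get-kegg.sh ]; then
    sbatch get-kegg.sh
else
    echo "sbatch or get-kegg.sh not found; skipping submission"
fi
```

Note that the relative path `../databases/kegg` in the script is resolved from the directory the job is submitted from, so the submit location matters for both the logs and the database.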
Anvio-8 (CHECK LAST COMMENT FOR BEST INSTALL)
Anvio is not available as a module on Rackham and has to be installed manually in one's conda environment. The steps to install Anvio were gathered from: Conda
As solving packages can be resource-intensive, I started an interactive session.
interactive -A naiss2023-22-412 -p core -n 2 -t 01:30:00
The first step is to set $CONDA_ENVS_PATH in your .bashrc file.
With that set, we can continue and load conda.
I created a .yaml file with the required packages: anvio-8.yaml
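For reference, setting `$CONDA_ENVS_PATH` in .bashrc comes down to a single export line. This is a sketch that assumes the project path used elsewhere in this thread; adjust it to your own project directory.

```shell
# Line to add to ~/.bashrc (path taken from this thread; change it for your project).
export CONDA_ENVS_PATH="/crex/proj/snic2020-6-222/Projects/Tconura/working/Andre/.conda/envs"

# Confirm the variable is set in the current shell.
echo "$CONDA_ENVS_PATH"
```

After editing .bashrc, either log in again or run `source ~/.bashrc` so the variable is picked up.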
Now, we can set up the environment. As I am not 100% sure how well mamba works on Rackham, I decided to use conda even though it is slower.
This command tells conda where to install the environment and which packages to include.
conda env create --prefix $CONDA_ENVS_PATH/ --file doc/anvio-8.yaml
After a while it finally finished, although because of --prefix the environment did not get a name.
Now to install anvio.
And... SUCCESS !
However...
As mentioned before, when setting up the environment with --prefix we don't get a name. This gives a long environment name when activated:
(/crex/proj/snic2020-6-222/Projects/Tconura/working/Andre/.conda/envs) andbou@r174: Andre:
Furthermore, I should have specified --prefix $CONDA_ENVS_PATH/anvio, as anvio is now installed directly in .conda/envs and not in its own directory. Either way, the prompt could be made shorter if I set $CONDA_ENVS_PATH to a directory higher up in the project, or if I were to install conda or mamba according to: https://hackmd.io/@pmitev/conda_on_Rackham.
Also, when checking which binners exist, anvio can't find any, even though CONCOCT, for example, was specified in the environment.
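The `--prefix` fix described above can be sketched as follows. The helper `env_prefix` is hypothetical, the fallback to `$HOME/.conda/envs` is my assumption for when `$CONDA_ENVS_PATH` is unset, and the script only prints the conda command instead of running it.

```shell
#!/bin/bash
# Hypothetical helper: join the envs path and an environment name into one prefix.
env_prefix() {
    local envs_path="${1%/}"   # drop a single trailing slash, if present
    local env_name="$2"
    printf '%s/%s\n' "$envs_path" "$env_name"
}

# Assumption: fall back to ~/.conda/envs when CONDA_ENVS_PATH is not set.
prefix="$(env_prefix "${CONDA_ENVS_PATH:-$HOME/.conda/envs}" anvio)"

# Dry run: show the command that would give anvio its own directory.
echo "conda env create --prefix $prefix --file doc/anvio-8.yaml"
```

For the long prompt, conda's `env_prompt` setting may also help: as far as I know, `conda config --set env_prompt '({name}) '` shows only the last directory of the prefix, though I have not tested this on Rackham.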
If I run module load bioinfo-tools and module load CONCOCT/1.1.0, we now get this error.
Unloading CONCOCT resolves this error.
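To see which copy of a binner the shell picks up, it can help to check where its executable resolves on PATH, once inside the environment and again after loading the module. A minimal sketch, assuming CONCOCT's executable is named `concoct`; the helper `report_tools` is made up for illustration and does not show what anvio itself checks.

```shell
#!/bin/bash
# Hypothetical helper: print where each named executable resolves on PATH.
report_tools() {
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: $(command -v "$tool")"
        else
            echo "$tool: not found"
        fi
    done
}

# Assumption: the CONCOCT executable is called "concoct".
report_tools concoct
```

If the path changes after loading the module, the module copy is shadowing the one from the environment, which may be related to the error above.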
Databases
There was no issue downloading the databases and setting them up. I will have to generate DB directories for each of them.