uab-cgds-worthey / quac

🦆 Quality Control of WGS and exome samples 🦆
https://quac.readthedocs.io
GNU General Public License v3.0

--conda-create-envs-only results in errors - CreateCondaEnvironmentException and container creation failed #68

Open ManavalanG opened 1 year ago

ManavalanG commented 1 year ago

Copy/pasting from source: https://github.com/openjournals/joss-reviews/issues/5313#issuecomment-1497107263

Thanks for pointing me to the script, I was able to download the required reference files!

However, I run into a crash when I try to run the pipeline itself, following along to the documentation.

python src/run_quac.py \
    --project_name test_project \
    --projects_path ".test/ngs-data/" \
    --pedigree ".test/configs/${PRIOR_QC_STATUS}/${PROJECT_CONFIG}.ped" \
    --outdir "$USER_SCRATCH/tmp/quac/results/test_${PROJECT_CONFIG}_exome-${PRIOR_QC_STATUS}/analysis" \
    --quac_watch_config "configs/quac_watch/exome_quac_watch_config.yaml" \
    --exome \
    $USE_SLURM \
    -e="--conda-create-envs-only"
########################################
Command to run the pipeline:
snakemake \
    --snakefile '/home/username/devel/quac/workflow/Snakefile' \
    --config project_name='test_project' projects_path='/home/username/devel/quac/.test/ngs-data' ped='/home/username/devel/quac/.test/configs/no_priorQC/project_2samples.ped' quac_watch_config='/home/username/devel/quac/configs/quac_watch/exome_quac_watch_config.yaml' workflow_config='/home/username/devel/quac/configs/workflow.yaml' unique_id='2ad58efa-8557-4807-b85d-ed1d4752ff2f' out_dir='/home/username/devel/quac/scratch/tmp/quac/results/test_project_2samples_exome-no_priorQC/analysis' log_dir='/home/username/devel/quac/scratch/tmp/quac/logs' exome='True' include_prior_qc_data='False' allow_sample_renaming='False' \
    --restart-times 1 \
    --use-conda \
    --use-singularity \
    --singularity-args '--cleanenv --bind /home/username/devel/quac/scratch/tmp/quac/tmp:/tmp --bind /home/username/devel/quac/.test/ngs-data/test_project/analysis,/home/username/devel/quac/scratch/tmp/quac/logs,/home/username/devel/quac/scratch/tmp/quac/results/test_project_2samples_exome-no_priorQC/analysis,data/external/dependency_datasets/somalier,data/external/dependency_datasets/verifyBamID,data/external/dependency_datasets/reference_genome,/home/username/devel/quac/.test/configs/no_priorQC,/home/username/devel/quac/configs/quac_watch/exome_quac_watch_config.yaml,data/external/dependency_datasets/verifyBamID/exome' \
    --profile '/home/username/devel/quac/src/slurm/slurm_profile' \
    --conda-create-envs-only
########################################

Slurm job name   : "quac-2023-04-05T10.10.34.732874"
Slurm job script : "/home/username/devel/quac/scratch/tmp/quac/logs/quac-2023-04-05T10.10.34.732874.sh"
// Processing project: test_project
// Project path: "/home/username/devel/quac/.test/ngs-data/test_project/analysis"
// Exome mode: True
// Include prior QC data: False
// WARNING: '.test' present in the path supplied via --projects_path. So testing mode is used.
Building DAG of jobs...
Pulling singularity image docker://continuumio/miniconda3:4.7.12.
Pulling singularity image docker://brentp/somalier:v0.2.13.
Creating conda environment configs/env/quac_watch.yaml...
Downloading and installing remote packages.
CreateCondaEnvironmentException:
Could not create conda environment from /home/username/devel/quac/configs/env/quac_watch.yaml:
FATAL:   container creation failed: unable to add /home/username/devel/quac/data/external/dependency_datasets/somalier to mount list: destination must be an absolute path

  File "/home/username/miniconda3/envs/quac/lib/python3.6/site-packages/snakemake/deployment/conda.py", line 389, in create
Traceback (most recent call last):
  File "src/run_quac.py", line 372, in <module>
    main(ARGS)
  File "src/run_quac.py", line 202, in main
    submit_slurm_job(pipeline_cmd, job_dict)
  File "/home/username/devel/quac/src/slurm/submit_slurm_job.py", line 60, in submit_slurm_job
    job_id = slurm_job.run(**params_dict)
  File "/home/username/miniconda3/envs/quac/lib/python3.6/site-packages/slurmpy/slurmpy.py", line 171, in run
    res = subprocess.check_output(args).strip()
  File "/home/username/miniconda3/envs/quac/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/home/username/miniconda3/envs/quac/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['bash', '/home/username/devel/quac/scratch/tmp/quac/logs/quac-2023-04-05T10.10.34.732874.sh']' returned non-zero exit status 1.

wilkb777 commented 1 year ago

@Redmar-van-den-Berg this is a tricky issue. Snakemake is using conda inside a Singularity container to build an environment, and as part of that process it tries to mount the somalier data directory.

Can you tell me if any part of the path /home/username/devel/quac is a symlink or some sort of non-absolute path on your setup? The failure happens when binding (mounting) a host path into the container: the path in the error looks absolute, yet Singularity rejects it as not absolute. That sounds a lot like what a user was seeing in this SO post.

Perhaps you could run

realpath ~

and tell us if the path it returns is the same as /home/username? On the HPC system I have access to, the home directory is not truly /home/username; it ends up looking like this:

username@login004:~$ pwd
/home/username

username@login004:~$ realpath ~
/data/user/home/username

Knowing that info would be helpful in determining if moving the repo to a non-home directory location would help.

In the past I've seen issues with Singularity when mounting paths in a user's home directory on top of the default mount, where Singularity automatically mounts the user's home directory. It shouldn't be an issue, but for reasons I never figured out, it was resolved by working out of a non-home directory location. I'm not sure that this is the case here, but it might be something to try.
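
The pwd versus realpath comparison above can also be scripted. The following is a minimal Python sketch, not part of QuaC; the function name `describe_path` is hypothetical. It reports whether a path resolves to a different location once symlinks are followed, which is the situation described for the HPC home directory.

```python
import os


def describe_path(path):
    """Return (expanded, resolved, differs) for a path.

    expanded: absolute path with "~" expanded, symlinks left untouched
    resolved: the same path with all symlinks followed
    differs:  True when a symlink changes where the path really points
    """
    expanded = os.path.abspath(os.path.expanduser(path))
    resolved = os.path.realpath(expanded)
    return expanded, resolved, expanded != resolved


if __name__ == "__main__":
    # Check the home directory and the current working directory.
    for p in ("~", "."):
        expanded, resolved, differs = describe_path(p)
        print(f"{p}: {expanded} -> {resolved} (symlinked: {differs})")
```

Running it from the repository root checks both the home directory and the checkout location in one go.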

Redmar-van-den-Berg commented 1 year ago

As far as I can tell, my home directory is a regular path:

$ realpath ~
/home/username

Rather than trying to debug this, would it be possible for QuaC to switch over to using singularity images instead of conda? Conda environments tend to become uninstallable over time, which means pipelines using them are not fully reproducible.

I tend to use the Biocontainers hosted on quay.io, which are built automatically for all conda packages (e.g. https://quay.io/repository/biocontainers/goleft, which is available as docker://quay.io/biocontainers/goleft:0.2.4--0). It is also possible to automatically build multi package containers if you need multiple tools in a single workflow step.

wilkb777 commented 1 year ago

That's a good idea. I know this is the preferred setup for Nextflow nf-core based pipelines, and @ManavalanG had dealt with a scenario in #52 that prompted the thought to move away completely from conda + singularity to just singularity. I'm reviewing #70 right now as part of this effort.

I'm still really perplexed by this issue where Singularity can't mount a directory into the container. @Redmar-van-den-Berg if you have time before #70 is reviewed and merged (and this is totally at your discretion b/c it's more debugging and more potentially unnecessary work on you) could you sanity check the mounting setup for Singularity?

If you have time could you quickly check that the lolcow testing container can bind those mount paths and print out the cow to the terminal?

The command would be

cd /home/username/devel/quac
singularity run --cleanenv --bind /home/username/devel/quac/scratch/tmp/quac/tmp:/tmp --bind /home/username/devel/quac/.test/ngs-data/test_project/analysis,/home/username/devel/quac/scratch/tmp/quac/logs,/home/username/devel/quac/scratch/tmp/quac/results/test_project_2samples_exome-no_priorQC/analysis,data/external/dependency_datasets/somalier,data/external/dependency_datasets/verifyBamID,data/external/dependency_datasets/reference_genome,/home/username/devel/quac/.test/configs/no_priorQC,/home/username/devel/quac/configs/quac_watch/exome_quac_watch_config.yaml,data/external/dependency_datasets/verifyBamID/exome docker://sylabsio/lolcow

The only thing I can think of is that something is preventing the singularity command from correctly binding relative paths, in particular data/external/dependency_datasets/somalier, since it's the one giving the error and the first relative bind path listed in your logged command. If that works fine and prints out the cow like

 ______________________________
< Mon Apr 10 15:14:40 CDT 2023 >
 ------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

then I have no idea what the issue is and hope that moving strictly to containers resolves whatever is coming up with the combo of conda + singularity 😂. I completely understand if your preference is to just wait until the PR switching to full containers is merged and test that setup directly.
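
If relative bind paths do turn out to be the culprit, one possible workaround is to make every bind source absolute before it reaches `--singularity-args`. The sketch below is illustrative only and is not QuaC's code; `absolutize_binds` is a hypothetical helper. It assumes the usual comma-separated `src[:dest[:opts]]` bind format, and that a bind with no explicit destination reuses the source path as the destination.

```python
import os


def absolutize_binds(bind_arg, base_dir):
    """Make the source of each comma-separated bind spec absolute.

    Each spec is "src", "src:dest", or "src:dest:opts". Only src is
    rewritten; an explicit dest and any options are left untouched.
    """
    fixed = []
    for spec in bind_arg.split(","):
        src, sep, rest = spec.partition(":")
        if not os.path.isabs(src):
            # Relative sources are interpreted relative to base_dir.
            src = os.path.normpath(os.path.join(base_dir, src))
        fixed.append(src + sep + rest)
    return ",".join(fixed)


if __name__ == "__main__":
    binds = "data/external/dependency_datasets/somalier,/home/username/devel/quac/scratch/tmp/quac/logs"
    print(absolutize_binds(binds, "/home/username/devel/quac"))
```

Here `base_dir` would be the directory the pipeline is launched from, since that is what the relative paths in the log are relative to.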

ManavalanG commented 1 year ago

@Redmar-van-den-Berg I would like to let you know that we have refactored QuaC to move away from creating conda environments within singularity containers. QuaC now simply depends on webserver-hosted container images, retrieves them automatically as part of the pipeline, and uses them for snakemake-initiated jobs (#70). I hope this makes it easier to execute the pipeline. Please let us know if you run into any issues :)

Redmar-van-den-Berg commented 1 year ago

The issue still happens for me, even if I run the latest version of the pipeline that does not use Conda.

I think it is related to the snakemake command attempting to bind mount a whole bunch of folders that do not exist:

--singularity-args '--cleanenv --bind /data/lumc/devel/quac/data/quac/tmp:/tmp --bind /data/lumc/devel/quac/configs/quac_watch/wgs_quac_watch_config.yaml,data/external/dependency_datasets/somalier,/data/lumc/devel/quac/.test/configs/no_priorQC,data/external/dependency_datasets/reference_genome,data/external/dependency_datasets/verifyBamID/exome,/data/lumc/devel/quac/data/quac/results/test_project_2samples_wgs-no_priorQC/analysis,/data/lumc/devel/quac/data/quac/logs,/data/lumc/devel/quac/.test/ngs-data/test_project/analysis,data/external/dependency_datasets/verifyBamID

ManavalanG commented 1 year ago

@Redmar-van-den-Berg Sorry you are running into issues. Our suspicion is that certain path(s) mounted to singularity do not exist on your system for some reason. To help debug this issue, we added a feature in the CLI wrapper script that verifies the input paths exist as expected (#72). We would appreciate it if you could pull the changes and then try out the pipeline. Hopefully this helps resolve the issue :)
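
For anyone who wants to check the bind list by hand, an existence check along these lines might look like the sketch below. This is not the implementation from #72; `missing_bind_sources` is a hypothetical name, and the comma-separated `src[:dest]` format is assumed.

```python
import os


def missing_bind_sources(bind_arg, base_dir="."):
    """Return the bind sources that do not exist on the host.

    bind_arg is a comma-separated list of "src" or "src:dest" specs.
    Relative sources are checked relative to base_dir.
    """
    missing = []
    for spec in bind_arg.split(","):
        src = spec.partition(":")[0]
        candidate = src if os.path.isabs(src) else os.path.join(base_dir, src)
        if not os.path.exists(candidate):
            missing.append(src)
    return missing


if __name__ == "__main__":
    binds = "data/external/dependency_datasets/somalier,data/external/dependency_datasets/verifyBamID"
    for path in missing_bind_sources(binds):
        print(f"Bind source not found: {path}")
```

An empty result means every source path exists, which would point away from missing directories as the cause.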

If you are still running into issues, could you provide the following info? It would greatly help us debug the issue.

PS: Please be sure to use the latest version of the docs. It can be chosen by clicking the version button at the bottom right of the docs website.