sanger-tol / treeval

Pipelines for the production of Treeval data
https://pipelines.tol.sanger.ac.uk/treeval

singularity pull failed #311

Closed gitforp closed 1 month ago

gitforp commented 1 month ago

Description of the bug

Hello, I encountered an error while running the sample data. I am a beginner and do not know how to resolve this issue. Could you please provide some suggestions? Best regards.

Command used and terminal output

I followed this guide for the operation:
cd ${TREEVAL_TEST_DATA}
curl https://tolit.cog.sanger.ac.uk/test-data/resources/treeval/TreeValTinyData.tar.gz | tar xzf -

sed -i "s|/home/runner/work/treeval/treeval|${TREEVAL_TEST_DATA}|" TreeValTinyData/gene_alignment_data/fungi/csv_data/LaetiporusSulphureus.gfLaeSulp1-data.csv
sed -i "s|/home/runner/work/treeval/treeval|${TREEVAL_TEST_DATA}|" assets/github_testing/TreeValTinyTest.yaml
nextflow run main.nf -profile test_github,singularity

Relevant files

Here is the log of the error section:

Error executing process > 'SANGERTOL_TREEVAL:TREEVAL:TELO_FINDER:FIND_TELOMERE_REGIONS (1)'

Caused by:
  Failed to pull singularity image
  command: singularity pull --name docker.io-library-gcc-10.4.0.img.pulling.1721706928541 docker://docker.io/library/gcc:10.4.0 > /dev/null
  status : 255
  hint : Try and increase singularity.pullTimeout in the config (current is "15d")
  message:
    INFO: Converting OCI blobs to SIF format
    INFO: Starting build...
    Copying blob sha256:bec9bd27e2ca4b14e1d084ebc29dbb593e2c399d0631c41481240738452c9d08
    Copying blob sha256:85d8cbbec380aac89db19d5f5ea119e6c3181b9f7171277e88d888a22588f323
    Copying blob sha256:f3f8721393bc605f2b915d80eb2ad6d5219db374f36bbd1fee99b99174a0a4ca
    Copying blob sha256:fa786a946ae67fa18e07eaf82fefee1777449f7db1a8fea5abec1aadbe99e2ef
    Copying blob sha256:6fdd0e5b72ccae203ec30d533c0bcd34200af90265e0531c66356812e529af32
    Copying blob sha256:34df401c391c7595044379e04e8ad4856a5a3974cdbf5a160f0a204d761e88aa
    Copying blob sha256:1103310eb9e4435003a57cd3a744d5ed65d9ffe46494a78a442feb76b083a7d4
    Copying blob sha256:ba55b4fbc92018b697878f2cb3ca4ab616d8c94c0f1cf771d65b028b36a61760
    FATAL: While making image from oci registry: error fetching image to cache: while building SIF from layers: conveyor failed to get: while fetching image: initializing source oci:/home/xia/.apptainer/cache/blob:1191805e24188d5a98067c2f6e1191c128a8610005343f8186472eee8184ef45: copying system image from manifest list: reading blob sha256:34df401c391c7595044379e04e8ad4856a5a3974cdbf5a160f0a204d761e88aa: Get "https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/34/34df401c391c7595044379e04e8ad4856a5a3974cdbf5a160f0a204d761e88aa/data?verify=1721709936-4yPmiaeISlBWOCXJX4aIskNEeIk%3D": dial tcp 65.49.26.97:443: i/o timeout

System information

Nextflow v24.04.3 sanger-tol/treeval v1.1.1

DLBPointon commented 1 month ago

Hi @gitforp

What are the specs of your machine and how are you running TreeVal? It could be that it is not getting enough memory to download and convert the images into SIF. If you are using something like SLURM or LSF then this should be trivial. How many times did you run this? I'd suggest running it again and if possible bumping the memory for the nextflow head job.
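If a pull only fails now and then, the rerun can be wrapped in a small retry loop. This is a minimal sketch, not something the pipeline provides: `retry` is a hypothetical helper, and the attempt count, delay and output file name in the example are arbitrary choices.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
    local max_attempts=$1 delay=$2
    shift 2
    local attempt=1
    while true; do
        # Stop as soon as the command succeeds.
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "retry: giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "retry: attempt $attempt failed, sleeping ${delay}s" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example (not executed here): pull the image named in the log by hand.
# The output name gcc-10.4.0.sif is an arbitrary choice for illustration.
# retry 5 30 singularity pull --name gcc-10.4.0.sif docker://docker.io/library/gcc:10.4.0
```

This only helps with intermittent failures; if the registry is never reachable from the machine, every attempt will time out in the same way.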

If this remains an issue then I suggest pre-downloading the containers. You can do this via:

nf-core download sanger-tol/treeval --revision 1.1.1 --compress none -d --force --outdir sanger-treeval --container-cache-utilisation amend --container-system singularity

As long as you have nf-core tools and singularity installed this should work.

You can then run the TreeVal pipeline as you normally would.

I hope this helps!

gitforp commented 1 month ago

Hi @DLBPointon

Thank you very much for your help! My machine has sufficient memory, and I believe the error is caused by a network connection problem. I ran the code you provided on another machine with a good network connection, and most of the downloads were successful. However, the downloads for depot.galaxyproject.org-docker.io-library-gcc-10.4.0.img, depot.galaxyproject.org-sanger-tol-cramfilter_bwamem2_minimap2_samtools_perl-0.001-c1.img, depot.galaxyproject.org-sanger-tol-fastk-1.0.1-c1.img, and depot.galaxyproject.org-sanger-tol-pretext-0.0.2-yy5-c3.img failed. I manually downloaded them one by one.

At this point, I believe all the necessary content has been downloaded.

Then, I ran the pipeline using the command: nextflow run treeval/ --input assets/treeval.yaml -profile singularity,sanger

Unfortunately, the program encountered an error again:

ERROR ~ Error executing process > 'SANGERTOL_TREEVAL:TREEVAL:REPEAT_DENSITY:WINDOWMASKER_MKCOUNTS (ihAphGlyc_1)'

Caused by:
  java.io.IOException: Cannot run program "bsub" (in directory "/data/xia/treeval/rundir/work/c6/3f014d1e80a0708abeb4ca5726181e"): error=2, No such file or directory

Command executed:
  bsub

Command exit status:

Command output:
  (empty)

Work dir:
  /data/xia/treeval/rundir/work/c6/3f014d1e80a0708abeb4ca5726181e

I don't know why this is happening, so I tried changing the code: nextflow run treeval/ --input assets/treeval.yaml -profile singularity

Fortunately, this time it ran successfully and gave the following feedback in the end:

-[sanger-tol/treeval] Pipeline completed successfully-
Completed at: 28-Jul-2024 06:20:00
Duration : 1h 29m 35s
CPU hours : 55.0
Succeeded : 2'649

I am unsure if running the pipeline with the modified code will impact the results. May I kindly ask for your thoughts on this matter? I look forward to your esteemed guidance.

Thank you once again for your assistance!

DLBPointon commented 1 month ago

Hi @gitforp

Apologies for the late reply, I haven't been able to work this week.

You have done the right thing! I'm unsure why those containers would have been difficult to download, though; the same command downloads everything for our CI/CD testing.
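If you ever want to double-check that a manual download left nothing out, something like the following could work. It is only a sketch: `missing_images` is a hypothetical helper, and it assumes the images sit in the directory pointed to by `NXF_SINGULARITY_CACHEDIR` under the file names you listed.

```shell
# missing_images CACHE_DIR IMAGE [IMAGE...]
# Hypothetical helper: prints each IMAGE that is absent or empty in CACHE_DIR
# and returns 1 if at least one is missing, 0 otherwise.
missing_images() {
    local cache_dir=$1
    shift
    local rc=0 image
    for image in "$@"; do
        # -s is true only for files that exist and are non-empty,
        # so a truncated zero-byte download is also reported.
        if [ ! -s "$cache_dir/$image" ]; then
            echo "$image"
            rc=1
        fi
    done
    return "$rc"
}

# Example (not executed here), using the four file names from this thread:
# missing_images "$NXF_SINGULARITY_CACHEDIR" \
#     depot.galaxyproject.org-docker.io-library-gcc-10.4.0.img \
#     depot.galaxyproject.org-sanger-tol-cramfilter_bwamem2_minimap2_samtools_perl-0.001-c1.img \
#     depot.galaxyproject.org-sanger-tol-fastk-1.0.1-c1.img \
#     depot.galaxyproject.org-sanger-tol-pretext-0.0.2-yy5-c3.img
```

A non-empty file is not proof that an image is intact, so this catches missing or zero-byte files only.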

The profiles change how the pipeline is executed; they do not change its output. The sanger profile is designed to make the pipeline more efficient on the LSF system that the Sanger HPC uses for compute work. This explains why your initial run failed with a bsub error: you likely don't use LSF. This may be something you can look into in the future for your own HPC system if you continue using Nextflow pipelines. You can find more profile configs here: https://nf-co.re/configs/.
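Before picking a profile, a quick check like this shows which scheduler submit commands are on your PATH. It is a rough sketch: `scheduler_available` is a hypothetical helper, and finding `bsub` (LSF) or `sbatch` (SLURM) does not by itself mean that a site-specific profile such as sanger suits your cluster.

```shell
# scheduler_available CMD
# Hypothetical helper: returns 0 if CMD can be found on PATH, non-zero otherwise.
scheduler_available() {
    command -v "$1" >/dev/null 2>&1
}

# bsub is the LSF submit command, sbatch the SLURM one.
for cmd in bsub sbatch; do
    if scheduler_available "$cmd"; then
        echo "$cmd found"
    else
        echo "$cmd not found"
    fi
done
```

If neither command is found, running with only the singularity profile, as you did, keeps all jobs on the local machine.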

Your output will be fine! Congrats on running the pipeline. If you need anything else, let us know!