jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

--restart is not running #882

Closed AntonioPuriel closed 2 months ago

AntonioPuriel commented 2 months ago

I'm trying to restart my run because the job's walltime expired, but when I restart it, nothing runs and the job is killed immediately.

#!/usr/bin/env bash
#PBS -N SqueezeMeta
#PBS -q omp
#PBS -l mem=250G
#PBS -l walltime=220:00:00
#PBS -l ncpus=36
#PBS -e myscript.err
#PBS -o myscript.out
#PBS -j oe
#PBS -M apurielh@gmail.com

cd /home3/datawork/apurielh/MetagenOIL/runpool090715_2/Project_MetagenOIL.489/Run_pool.7386/

# activate environment
. /appli/bioinfo/squeezemeta/1.6.3/env.sh

# First run:
SqueezeMeta.pl -m coassembly -p metagenomic_1 -s SampleID -f RawData -miniden 50 -t 36 -b 2 -taxbinmode c+s -map bowtie -binner concoct,maxbin,metabat2 --cleaning -cleaning_options MINLEN:20 >& SqueezeMeta.log 2>&1

SqueezeMeta.pl -p metagenomic_1 --restart -step 5

Attachments: syslog.txt, SqueezeMeta.txt

fpusan commented 2 months ago

Can you try restarting inside an interactive session? Does it work there? If not, what's the output?

AntonioPuriel commented 2 months ago

PBS Job Id: 158579.datarmor0
Job Name: SqueezeMeta
Execution terminated
Exit_status=0
resources_used.cpupercent=0
resources_used.cput=00:00:03
resources_used.mem=68908kb
resources_used.ncpus=36
resources_used.vmem=9824kb
resources_used.walltime=00:00:03

AntonioPuriel commented 2 months ago

Can't open samples file (-s) in metagenomic_1/data/00..samples. Please check that it is the correct file
(/appli/conda-env/bioinfo/squeezemeta-1.6.3) apurielh@r1i4n1:~>

fpusan commented 2 months ago

Are you launching it from the right directory?

AntonioPuriel commented 2 months ago

Can't find sample file /home3/datawork/apurielh/MetagenOIL/runpool090715_2/Project_MetagenOIL.489/Run_pool.7386/metagenomic_1/data/raw_fastq/poolHC_T0_ATCACG_L008_R1.fastq.gz for sample HC_T0 in the samples file. Please check

[Screenshot (209)]

fpusan commented 2 months ago

It seems that at least one of the fastq.gz files listed in the original samples file is no longer in the fastq directory that you provided with -f when you first ran SqueezeMeta, so SqueezeMeta complains when it tries to restart. Can you check whether the original fastq folder and files are still present in their original locations?
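The check requested here can be sketched in bash. This is a minimal sketch, not part of SqueezeMeta: the function name `check_samples` is hypothetical, and it assumes the samples file is tab-separated with the sample name in column 1 and the read file name in column 2.

```shell
#!/usr/bin/env bash
# Sketch: report read files named in a samples file that are absent
# from a given directory.
# Assumption: tab-separated samples file, file name in column 2.

check_samples() {
    local samples_file=$1 fastq_dir=$2 missing=0
    local sample file pair
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS=$'\t' read -r sample file pair || [ -n "$sample" ]; do
        [ -z "$file" ] && continue          # skip blank or malformed lines
        if [ ! -e "$fastq_dir/$file" ]; then
            echo "MISSING: $fastq_dir/$file (sample $sample)"
            missing=$((missing + 1))
        fi
    done < "$samples_file"
    [ "$missing" -eq 0 ]                    # non-zero exit status if any file is missing
}

# Example, using the names from this thread:
# check_samples SampleID RawData
# check_samples SampleID metagenomic_1/data/raw_fastq
```

Running it against both the original -f folder and the project's data/raw_fastq folder would show which of the two locations has lost files.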

AntonioPuriel commented 2 months ago

OK, the -f RawData folder contains 32 files.

[Screenshot (210)]

fpusan commented 2 months ago

Please paste terminal output directly instead of posting screenshots. Your terminal is cropped in the screenshot, so I cannot see the full paths. Just copy the full path of the file mentioned in the SqueezeMeta error and run ls /full/path/to/the/file.fastq.gz

AntonioPuriel commented 2 months ago

OK, sorry.

Project_MetagenOIL.489/Run_pool.7386% cd RawData/
total 244134144
-rw------- 1 apurielh bioinfo 8054989280 Jul 30 21:34 pool_Anox_ACTTGA_L002_R1.fastq.gz
-rw------- 1 apurielh bioinfo 8966725155 Jul 30 21:33 pool_Anox_ACTTGA_L002_R2.fastq.gz
-rw------- 1 apurielh bioinfo 7898851933 Jul 30 21:44 pool_Anox_ACTTGA_L003_R1.fastq.gz
-rw------- 1 apurielh bioinfo 8842014006 Jul 30 21:46 pool_Anox_ACTTGA_L003_R2.fastq.gz
-rw------- 1 apurielh bioinfo 6838219049 Jul 30 21:54 pool_Anox_ACTTGA_L007_R1.fastq.gz
-rw------- 1 apurielh bioinfo 7925382352 Jul 30 21:58 pool_Anox_ACTTGA_L007_R2.fastq.gz
-rw------- 1 apurielh bioinfo 5111247588 Jul 30 22:00 pool_Anox_ACTTGA_L008_R1.fastq.gz
-rw------- 1 apurielh bioinfo 5629723453 Jul 30 22:05 pool_Anox_ACTTGA_L008_R2.fastq.gz
-rw------- 1 apurielh bioinfo 9241913848 Jul 30 20:48 poolHC_T0_ATCACG_L002_R1.fastq.gz
-rw------- 1 apurielh bioinfo 9384442078 Jul 30 20:49 poolHC_T0_ATCACG_L002_R2.fastq.gz
-rw------- 1 apurielh bioinfo 9073797733 Jul 30 21:02 poolHC_T0_ATCACG_L003_R1.fastq.gz
-rw------- 1 apurielh bioinfo 9276819966 Jul 30 21:02 poolHC_T0_ATCACG_L003_R2.fastq.gz
-rw------- 1 apurielh bioinfo 7402944867 Jul 30 21:13 poolHC_T0_ATCACG_L007_R1.fastq.gz
-rw------- 1 apurielh bioinfo 7739752696 Jul 30 21:13 poolHC_T0_ATCACG_L007_R2.fastq.gz
-rw------- 1 apurielh bioinfo 5635895165 Jul 30 21:21 poolHC_T0_ATCACG_L008_R1.fastq.gz
-rw------- 1 apurielh bioinfo 5499022450 Jul 30 21:21 poolHC_T0_ATCACG_L008_R2.fastq.gz
-rw------- 1 apurielh bioinfo 7476639160 Jul 30 22:11 pool_Osci_TAGCTT_L002_R1.fastq.gz
-rw------- 1 apurielh bioinfo 8345005587 Jul 30 22:17 pool_Osci_TAGCTT_L002_R2.fastq.gz
-rw------- 1 apurielh bioinfo 7384479208 Jul 30 22:20 pool_Osci_TAGCTT_L003_R1.fastq.gz
-rw------- 1 apurielh bioinfo 8290026471 Jul 30 22:28 pool_Osci_TAGCTT_L003_R2.fastq.gz
-rw------- 1 apurielh bioinfo 8348790241 Jul 30 22:33 pool_Osci_TAGCTT_L007_R1.fastq.gz
-rw------- 1 apurielh bioinfo 9699279724 Jul 30 22:41 pool_Osci_TAGCTT_L007_R2.fastq.gz
-rw------- 1 apurielh bioinfo 6272814542 Jul 30 22:41 pool_Osci_TAGCTT_L008_R1.fastq.gz
-rw------- 1 apurielh bioinfo 6927008694 Jul 30 22:50 pool_Osci_TAGCTT_L008_R2.fastq.gz
-rw------- 1 apurielh bioinfo 7508679638 Jul 30 22:51 pool_Oxic_GGCTAC_L002_R1.fastq.gz
-rw------- 1 apurielh bioinfo 8375457459 Jul 30 23:02 pool_Oxic_GGCTAC_L002_R2.fastq.gz
-rw------- 1 apurielh bioinfo 7345678118 Jul 30 23:01 pool_Oxic_GGCTAC_L003_R1.fastq.gz
-rw------- 1 apurielh bioinfo 8234414240 Jul 30 23:15 pool_Oxic_GGCTAC_L003_R2.fastq.gz
-rw------- 1 apurielh bioinfo 8215241397 Jul 30 23:14 pool_Oxic_GGCTAC_L007_R1.fastq.gz
-rw------- 1 apurielh bioinfo 9656745036 Jul 30 23:29 pool_Oxic_GGCTAC_L007_R2.fastq.gz
-rw------- 1 apurielh bioinfo 7248678164 Jul 30 23:25 pool_Oxic_GGCTAC_L008_R1.fastq.gz
-rw------- 1 apurielh bioinfo 8139216212 Jul 30 23:38 pool_Oxic_GGCTAC_L008_R2.fastq.gz

Sorry, I don't understand this part: "Just copy the full path of the file mentioned in the SqueezeMeta error and run ls /full/path/to/the/file.fastq.gz"

fpusan commented 2 months ago

The error message starts with "Can't find sample file" followed by the path to a file, so something like: Can't find sample file /full/path/to/the/file.fastq.gz

I would like you to run the following command: ls /full/path/to/the/file.fastq.gz

AntonioPuriel commented 2 months ago

Can't find sample file /home3/datawork/apurielh/MetagenOIL/runpool090715_2/Project_MetagenOIL.489/Run_pool.7386/metagenomic_1/data/rapoolHC_T0_ATCACG_L008_R1.fastq.gz for sample HC_T0 in the samples file. Please check

OK. The file disappeared from the SqueezeMeta project folder, but it is still in the RawData folder.

ls /home3/datawork/apurielh/MetagenOIL/runpool090715_2/Project_MetagenOIL.489/Run_pool.7386/RawData/poolHC_T0_ATCACG_L008_R1.fastq.gz

fpusan commented 2 months ago

So the file is still present in the original location? Did the ls command shown above find the file?

AntonioPuriel commented 2 months ago

Yes, it is present in the original folder, but not in the SqueezeMeta project folder.

fpusan commented 2 months ago

Now I think I understand the issue. When the --cleaning flag is used, the filtered fastqs are stored in the project/data/raw_fastq folder. After restarting, step 10 complains that it cannot find the files there; you checked manually and they were indeed missing.

I suspect the files got deleted when trying to restart, but I have not been able to reproduce the issue on my own computer when restarting a previous project run with --cleaning, even when using the same SqueezeMeta version as you (1.6.3).

Can you try copying the original files to the project/data/raw_fastq folder and restarting? They are supposed to have the same names, so it should be fine. If they get deleted by SqueezeMeta, then step 10 should complain again.
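The copy-and-restart suggestion could look roughly like the sketch below. The helper name `restore_missing_fastq` is hypothetical and not part of SqueezeMeta; it assumes the reads end in `.fastq.gz` and only copies files that are absent from the destination, so nothing already in the project folder is overwritten. Note that the copies are the original reads, not the filtered ones that --cleaning had produced.

```shell
#!/usr/bin/env bash
# Sketch: copy read files that are missing from the project's raw_fastq
# folder back from the original input folder, without overwriting anything.

restore_missing_fastq() {
    local src_dir=$1 dest_dir=$2 f name
    mkdir -p "$dest_dir"
    for f in "$src_dir"/*.fastq.gz; do
        [ -e "$f" ] || continue             # the glob matched nothing
        name=$(basename "$f")
        if [ ! -e "$dest_dir/$name" ]; then
            cp -p "$f" "$dest_dir/$name"    # -p keeps timestamps and permissions
            echo "restored $name"
        fi
    done
}

# Example, using the folder names and restart command from this thread:
# restore_missing_fastq RawData metagenomic_1/data/raw_fastq
# SqueezeMeta.pl -p metagenomic_1 --restart -step 5
```

Listing the destination folder before and after the restart would show whether the files are being deleted again.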

fpusan commented 2 months ago

I see you closed this. Did you manage to find a solution?

AntonioPuriel commented 2 months ago

I closed it because I started the job again. I could not solve it.