jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

restarting from step 10 #833

Closed timyerg closed 1 month ago

timyerg commented 2 months ago

Hello!

Currently, I am running SqueezeMeta with 440 samples on the server. Since I have a lot of samples, my job was killed at the mapping step due to the walltime limit:

  Working with sample 117: sample117
  Getting raw reads
  Aligning to reference with bowtie

So, I restarted SM with the command:

SqueezeMeta.pl -p $PROJECT --restart

In the logs, I noticed that it stated:

Contig tax file /Results/Annotated_MAGs/intermediate/09.Annotated_MAGs.contiglog already found, skipping step 9
Mapping file /Results/Annotated_MAGs/intermediate/10.Annotated_MAGs.mapcount already found, skipping step 10

SM then proceeded to the next step, but processed only samples up to 117, skipping all the remaining samples.

I guess the script, as it is now, only checks for the existence of the output file but does not check the progress, "thinking" that all samples were processed instead of resuming from the first sample that was not yet processed.
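
Just to illustrate what I mean, here is a rough Python sketch of such a per-sample check (hypothetical code, not SqueezeMeta's actual Perl; it assumes the BAMs of the finished samples sit in the project's data/bam directory and are named `<project>.<sample>.bam`, as they are in my run):

```python
# Hypothetical sketch of a per-sample progress check (illustrative only,
# not SqueezeMeta's real implementation, which is written in Perl).
from pathlib import Path

def samples_left_to_map(project_dir, project_name, samples):
    """Return the samples whose BAM file is not yet present in data/bam."""
    bam_dir = Path(project_dir) / "data" / "bam"
    # Assumes BAMs are named <project>.<sample>.bam, as in my project.
    return [s for s in samples
            if not (bam_dir / f"{project_name}.{s}.bam").exists()]

# e.g. resume the mapping step only for:
# samples_left_to_map("/Results/Annotated_MAGs", "Annotated_MAGs",
#                     [f"sample{i}" for i in range(1, 441)])
```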

I will try to delete the intermediate files created by step 10 and relaunch it.

Best,

timyerg commented 2 months ago

Update: After deleting the intermediate files from step 10 and relaunching, it started again from step 10, sample 1. But I am sure it will hit the time limit again, since processing sample by sample will take more than a month. I will look into the SM scripts to fix this in my local installation, then try to restore the intermediate files and restart the job. However, I am not familiar with Perl, so it is unlikely that I will be able to fix it by myself.

fpusan commented 2 months ago

I will look into the script later to make sure it behaves correctly on restart. If you didn't remove the BAM files in the project/data/bam directory, then they won't need to be generated again for the 116 samples that worked OK (they will only be opened in order to count the reads mapping to each feature). So this should help a lot with moving the analysis forward, even if your walltime limit stays the same.
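
In case it helps to picture it, something like this Python/pysam sketch is roughly what that counting amounts to for an existing BAM (the actual pipeline is Perl and the details differ; the path below just follows the naming from your project, and the BAM is assumed to be coordinate-sorted and indexed):

```python
# Hypothetical sketch: count mapped reads per contig from an existing BAM.
# Not the real SqueezeMeta code; requires a sorted, indexed BAM (.bai present).
import pysam

def mapped_reads_per_contig(bam_path):
    counts = {}
    # pysam.idxstats returns one line per reference:
    # <contig name> <length> <mapped reads> <unmapped reads>
    for line in pysam.idxstats(bam_path).splitlines():
        name, _length, mapped, _unmapped = line.split("\t")
        if name != "*":  # the final "*" line aggregates unplaced reads
            counts[name] = int(mapped)
    return counts

# counts = mapped_reads_per_contig(
#     "/Results/Annotated_MAGs/data/bam/Annotated_MAGs.sample1.bam")
```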

timyerg commented 2 months ago

Ok, thank you! I did not remove them, so it should be much faster. Thank you for the response!

  Working with sample 1: sample1
  Getting raw reads
  Aligning to reference with bowtie
  BAM file already found in /Results/Annotated_MAGs/data/bam/Annotated_MAGs.sample1.bam, skipping
  Calculating contig coverage
fpusan commented 1 month ago

Hope it went well. Closing this; feel free to reopen if needed.