mjherre1 closed this issue 3 years ago
Hi! It is difficult to say what is causing the error based on the log files... I suppose it might have something to do with the available RAM, but I am not sure. To check whether the pipeline is properly configured and installed, you could run the analysis on a small subset:
cd /home/centos/USS/mjh_minION/
mkdir subset
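# randomly subsample 1000 reads from each fastq.gz file into subset/ (requires seqtk)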
for f in $(find /home/centos/USS/mjh_minION/Met_step -name "*.fastq.gz"); do
sn=$(basename "$f" .fastq.gz);
seqtk sample "$f" 1000 | gzip > "subset/${sn}.fastq.gz";
done
nohup /home/centos/USS/mjh_minION/Met_step/MetONTIIME.sh /home/centos/USS/mjh_minION/subset /home/centos/USS/mjh_minION/subset/sample-metadata.tsv /home/centos/USS/mjh_minION/Met_step/silva_132_99_16S_sequence.qza /home/centos/USS/mjh_minION/Met_step/silva_132_99_16S_taxonomy.qza 5 Vsearch 3 0.8 0.85 &
Simone
Hello Simone,
Thank you for your quick response and for your suggestion! I ran the analysis on a small subset with the commands you suggested, and the pipeline ran to completion, with data in the files in the collapsed feature table directory! It also produced a lot more files than my original run did, including table.qzv, taxonomy.qzv, and all the feature-table tsv files. I have attached the nohup.out file. Is it properly configured? In terms of RAM, we have 150GB available, and I believe the original run used about 35GB of that, but I am not completely sure.
Thank you very much for your help!! nohup.out.txt
Great, everything worked! So I suggest doing the analysis with 100k reads per sample (or less, depending on the minimum number of reads per sample), by changing 1000 to 100000 in the code above and the subset directory to subset_100k. You can see the number of reads for each sample by uploading the demux_summary.qzv file obtained with the full dataset to the QIIME 2 viewer (https://view.qiime2.org). The adjusted commands are sketched below.
Simone
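Putting the two changes together, the adjusted commands would look something like this (subset_100k is just a name for the new output folder; everything else is taken from the commands above):
cd /home/centos/USS/mjh_minION/
mkdir subset_100k
# subsample 100k reads per sample instead of 1000
for f in $(find /home/centos/USS/mjh_minION/Met_step -name "*.fastq.gz"); do
sn=$(basename "$f" .fastq.gz);
seqtk sample "$f" 100000 | gzip > "subset_100k/${sn}.fastq.gz";
done
nohup /home/centos/USS/mjh_minION/Met_step/MetONTIIME.sh /home/centos/USS/mjh_minION/subset_100k /home/centos/USS/mjh_minION/subset_100k/sample-metadata.tsv /home/centos/USS/mjh_minION/Met_step/silva_132_99_16S_sequence.qza /home/centos/USS/mjh_minION/Met_step/silva_132_99_16S_taxonomy.qza 5 Vsearch 3 0.8 0.85 &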
I am going to close the issue, as it was probably a memory issue! Please let me know if you succeed in running the analysis with 100k reads per sample! Best, Simone
Hi Simone,
Thank you for the suggestion! I am glad that it is installed properly. I will try running on a subset of 100k reads. The read count for each sample is pretty high (the minimum is 400k), so I concur that unfortunately it might be a memory issue. I will let you know if the 100k run succeeds. Thank you for your time and help! I appreciate it.
Cheers, Michelle
Hi Simone,
I believe the run with a subset of 100k reads completed successfully, with the output files! I have attached the nohup.out file here. Thank you for suggesting subsampling; it does appear it was a memory issue. Thank you again for your help and time!
Cheers,
Michelle
nohup.out.txt
Perfect! Ciao! Simone
Hi Simone,
Thank you very much for providing a great pipeline and sharing your scripts! I have tried the MetONTIIME pipeline a few times now and am running into errors that I have been trying to fix by reading through the other issues posted here. My sequences have already been basecalled with Guppy, and I followed your guidelines so that there is one file per sample (12 samples in total). The pipeline ran for about two weeks using Vsearch, but then reported an error in the nohup.out file. Over the course of the job it did output the following files: sequences.qza, table_tmp.qza, rep-seqs_tmp.qza, demux_summary.qzv, and a directory with empty collapsed feature tables. The error message suggests it might have something to do with the temporary directory. However, when I went to the temporary directory that I assigned outside of my working directory, I could not find the log file, so I am unable to share it with you. I will share the nohup.out file instead, along with the manifest and metadata files that the program generated, in case there are other issues with my command or files.
The command that I ran:
nohup ./MetONTIIME.sh /home/centos/USS/mjh_minION/Met_step /home/centos/USS/mjh_minION/Met_step/sample-metadata.tsv /home/centos/USS/mjh_minION/Met_step/silva_132_99_16S_sequence.qza /home/centos/USS/mjh_minION/Met_step/silva_132_99_16S_taxonomy.qza 5 Vsearch 3 0.8 0.85 &
Where “Met_step” contains the sequences as well as the script MetONTIIME.sh and the sequence and taxonomy files.
I am running on a cloud compute system that should have enough space to run this analysis (it uses about 35GB of the 116GB available). I also assigned a temporary directory outside of this working directory with export TMPDIR='/home/centos/USS/cw-temp' and checked it with echo $TMPDIR.
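For reference, the full sequence was roughly as follows (export and launch in the same shell session, so the nohup'd MetONTIIME.sh process inherits TMPDIR):
export TMPDIR=/home/centos/USS/cw-temp
echo $TMPDIR
nohup ./MetONTIIME.sh /home/centos/USS/mjh_minION/Met_step /home/centos/USS/mjh_minION/Met_step/sample-metadata.tsv /home/centos/USS/mjh_minION/Met_step/silva_132_99_16S_sequence.qza /home/centos/USS/mjh_minION/Met_step/silva_132_99_16S_taxonomy.qza 5 Vsearch 3 0.8 0.85 &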
Thank you for any guidance that you can offer, I really appreciate your time!!
manifest.txt nohup.out.txt