jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

sqm_hmm_reads core dump error #758

Closed Tetrajf closed 3 months ago

Tetrajf commented 7 months ago

Hi there

Thank you for the wonderful pipeline. I've been running it recently for many different projects. One of these projects targets eukaryotes, but the DNA wasn't enriched for eukaryotes, so we're not really identifying much that we are interested in. I'm trying to get sqm_hmm_reads.pl to work, but it gives me a "Segmentation fault (core dumped)" error. I installed SqueezeMeta in a conda environment. I tried running Short-Pair directly from its SourceForge repository with Python 2.7, but it gives many errors. For whatever reason, it has lots of issues with using os and sys. It will not accept any commands (for example, I had to modify the commands to point to the full path of DNA2Protein, but then it still errored on get_hmm etc., so all paths had to be hard-coded, and it still would not work). I see you have modified the Short-Pair script for Python 3, but unfortunately it still gives me errors. I don't believe the segmentation fault is due to a lack of memory: I'm running it with 192 GB of memory via Slurm on an HPC cluster, and when I tested the script with only a few hundred kb of FASTA sequences, it still gave a segmentation fault. I've also tested importing all the Python dependencies, and they all load perfectly.

Here is the content of the error file:

2023-11-28 11:39:21 URL:http://pfam-legacy.xfam.org/family/PF00091/hmm [11867/11867] -> "PF00091.hmm" [1]
2023-11-28 11:39:22 URL:http://pfam-legacy.xfam.org/family/PF00091/alignment/seed [11867/11867] -> "PF00091.seed" [1]
Segmentation fault (core dumped)
Segmentation fault (core dumped)

Error: Failed to open sequence file /scratch/sysuser/j/ops/ANNETTE/FASTAS_from_FASTQS/Protein/Cow.R1.fastatemp.1.frame1 for reading

Error: Failed to open sequence file /scratch/sysuser/j/ops/ANNETTE/FASTAS_from_FASTQS/Protein/Cow.R1.fastatemp.1.frame2 for reading

Error: Failed to open sequence file /scratch/sysuser/j/ops/ANNETTE/FASTAS_from_FASTQS/Protein/Cow.R1.fastatemp.1.frame3 for reading

Error: Failed to open sequence file /scratch/sysuser/j/ops/ANNETTE/FASTAS_from_FASTQS/Protein/Cow.R1.fastatemp.1.frame4 for reading

Error: Failed to open sequence file /scratch/sysuser/j/ops/ANNETTE/FASTAS_from_FASTQS/Protein/Cow.R1.fastatemp.1.frame5 for reading

Error: Failed to open sequence file /scratch/sysuser/j/ops/ANNETTE/FASTAS_from_FASTQS/Protein/Cow.R1.fastatemp.1.frame6 for reading

Error: Failed to open sequence file /scratch/sysuser/jonathan/ops/ANNETTE/FASTAS_from_FASTQS/Protein/Cow.R1.fastatemp.2.frame1 for reading

Error: Failed to open sequence file /scratch/sysuser/j/ops/ANNETTE/FASTAS_from_FASTQS/Protein/Cow.R1.fastatemp.2.frame2 for reading

Error: Failed to open sequence file /scratch/sysuser/j/ops/ANNETTE/FASTAS_from_FASTQS/Protein/Cow.R1.fastatemp.2.frame3 for reading

Error: Failed to open sequence file /scratch/sysuser/j/ops/ANNETTE/FASTAS_from_FASTQS/Protein/Cow.R1.fastatemp.2.frame4 for reading

Error: Failed to open sequence file /scratch/sysuser/j/ops/ANNETTE/FASTAS_from_FASTQS/Protein/Cow.R1.fastatemp.2.frame5 for reading

Error: Failed to open sequence file /scratch/sysuser/j/ops/ANNETTE/FASTAS_from_FASTQS/Protein/Cow.R1.fastatemp.2.frame6 for reading

Traceback (most recent call last):
  File "/scratch/sysuser/j/.conda/envs/SqueezeMeta2/SqueezeMeta/bin/Short-Pair/Short-Pair.py", line 887, in <module>
    control(options.fastaFile1, options.fastaFile2, options.hmmFile, options.seedFile, options.threshold, options.outputFile)
  File "/scratch/sysuser/j/.conda/envs/SqueezeMeta2/SqueezeMeta/bin/Short-Pair/Short-Pair.py", line 863, in control
    part1(fastaName, fastaFile1, fastaFile2, pattern1, pattern2, hmmFile, step4OutputFile1, step4OutputFile2, step5OutputFile)
  File "/scratch/sysuser/j/.conda/envs/SqueezeMeta2/SqueezeMeta/bin/Short-Pair/Short-Pair.py", line 816, in part1
    step3OutputList.append(step3(step3Input))
  File "/scratch/sysuser/j/.conda/envs/SqueezeMeta2/SqueezeMeta/bin/Short-Pair/Short-Pair.py", line 72, in step3
    ExtractHMMER(inputFile, outputFile)
  File "/scratch/sysuser/j/.conda/envs/SqueezeMeta2/SqueezeMeta/bin/Short-Pair/Short-Pair.py", line 49, in ExtractHMMER
    with open(inputFile, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/scratch/sysuser/j/ops/ANNETTE/FASTAS_from_FASTQS/Protein/../out1/Cow.R1.fastatemp.1.frame1.hmmer'
rm: cannot remove 'alldomains.allframe': No such file or directory
rm: cannot remove 'fragment_length*': No such file or directory
rm: cannot remove 'hmms.sav': No such file or directory
rm: cannot remove 'pfam.seed.sav': No such file or directory
rm: cannot remove 'HMMs': No such file or directory
rm: cannot remove 'faaSP': No such file or directory
rm: cannot remove 'fastaSP': No such file or directory
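For anyone reproducing this, the log suggests that the translated frame files were never written, so every later step fails on a missing input. A minimal Python sketch for listing which of those intermediate files are absent is given here. The naming pattern (`<fasta name>temp.<pair>.frame<1-6>` inside `Protein/`) is inferred only from the error messages above, and the function names are hypothetical, not part of Short-Pair or SqueezeMeta.

```python
from pathlib import Path


def expected_frame_files(protein_dir, fasta_name, pairs=(1, 2), frames=range(1, 7)):
    """Build the frame-file paths implied by the error log (assumed pattern)."""
    base = Path(protein_dir)
    return [
        base / f"{fasta_name}temp.{pair}.frame{frame}"
        for pair in pairs
        for frame in frames
    ]


def missing_files(paths):
    """Return the paths that do not exist as regular files."""
    return [path for path in paths if not path.is_file()]


if __name__ == "__main__":
    # Directory and file name taken from the log above; adjust to your run.
    candidates = expected_frame_files("Protein", "Cow.R1.fasta")
    for path in missing_files(candidates):
        print("missing:", path)
```

If all twelve files are reported missing, the failure is probably in the translation step that should create them, not in the later steps that try to read them.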

It seems as if it still will not create folders like Protein/, and perhaps it still does not run commands like DNA2Protein (I have tried creating the Protein folder myself before running).
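One way to narrow down which helper program is crashing is to run each one on its own and inspect how it exits. The sketch below is a generic Python check, not something taken from SqueezeMeta: `probe` and `describe_exit` are hypothetical names, and which commands and arguments to pass is left to the reader. It relies on the POSIX convention that `subprocess` reports a negative return code when the child is killed by a signal.

```python
import shutil
import signal
import subprocess


def describe_exit(returncode):
    """Translate a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by that signal.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status " + str(returncode)


def probe(command):
    """Run a command (list of strings) and report whether it was found and how it ended."""
    if shutil.which(command[0]) is None:
        return command[0] + ": not found on PATH"
    result = subprocess.run(
        command,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return command[0] + ": " + describe_exit(result.returncode)
```

For example, `print(probe(["hmmsearch", "-h"]))` would show whether HMMER's `hmmsearch` is reachable from the activated environment, assuming that is the HMMER binary the pipeline calls. A result of "killed by SIGSEGV" for a given helper would point to that binary, not to the Python script, as the source of the segmentation fault.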

My actual command in a Slurm script, after activating the conda environment, is:

sqm_hmm_reads.pl -pfam PF00091 -pair1 /scratch/sysuser/j/ops/ANNETTE/FASTAS_from_FASTQS/Cow.R1.fasta -pair2 /scratch/sysuser/j/ops/ANNETTE/FASTAS_from_FASTQS/Cow.R2.fasta -output ANNETTE_PF0091

I would deeply appreciate your assistance, thank you.

Kind Regards, JF

jtamames commented 7 months ago

Hello! I was checking that script recently, and I was not able to make it work with the latest versions. Could you perhaps try it with an older version, like 1.4? Best, J

Tetrajf commented 7 months ago

Hi J

Sure, I'll give that a try. Thank you!

Kind Regards, JF

fpusan commented 3 months ago

Closing due to lack of activity. Feel free to reopen.