ncbi / magicblast


truncated file error #59

Open jeremysutherland opened 3 weeks ago

jeremysutherland commented 3 weeks ago

Code:

magicblast -query A16_S4_4_S22_S23_R1.fastq.gz -query_mate A16_S4_4_S22_S23_R2.fastq.gz -db USDA1106_reference -num_threads 24 -infmt fastq | samtools sort -n | samtools view -bS > A16_S4_4_S22_S23.nsort.bam

Error suggests a truncation error in the sam file:

[W::sam_read1_sam] Parse error at line 4991269
samtools sort: truncated file. Aborting
[main_samview] fail to read the header from "-".

Generating the sam file without the pipe:


tail -n 1 A16_S4_4_S22_S23.sam
VH00707:155:2222VWGNX:2:1315:15697:42799    141 *   0   0   *   *   0   0   AGAGAGTGCAACATCCCGATTTTTAATATTATTCTACTAGTATTATTACACTACAAGAATTTATTTAATTAGTGACAAATTCAAAAT(magicblast)

Notice the '(magicblast)' text at the end of the file, which is also line 4991269.
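One way to locate lines like this without waiting for samtools to fail is to count fields. The sketch below assumes the SAM file is tab-separated, as the format requires, and flags any alignment line with fewer than the 11 mandatory columns. The helper name `check_sam_fields` is made up for illustration and is not part of magicblast or samtools.

```shell
# Hypothetical helper: report SAM alignment lines with fewer than the
# 11 mandatory tab-separated fields. Header lines (starting with '@') are skipped.
# Exits non-zero if at least one malformed line is found.
check_sam_fields() {
    awk -F'\t' '
        !/^@/ && NF < 11 { printf "line %d: %d fields\n", NR, NF; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}

# Usage (file name taken from the report above):
# check_sam_fields A16_S4_4_S22_S23.sam
```

A record that is cut off before its quality string, like the last line shown above, would be reported with 10 fields.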

Any suggestions?

boratyng commented 3 weeks ago

Hi @jeremysutherland, I am sorry you ran into problems. The truncated output may be the result of a crash, possibly because magicblast is running out of memory. My first suggestion would be to reduce the number of threads to fewer than 10, which will decrease magicblast's memory footprint.
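As a rough sketch of that suggestion, the run could be split into two steps so that a magicblast crash is not hidden behind the pipe: write the SAM output to a file with fewer threads, check the exit status, and only then sort. The function name `align_then_sort` and the default of 8 threads are assumptions for illustration. The magicblast and samtools options are the ones from the original command, plus `-out`, `-O` and `-o`.

```shell
# Hypothetical helper: align with a reduced thread count, write SAM to a file,
# and sort by read name only if magicblast exits cleanly.
align_then_sort() {
    r1=$1; r2=$2; db=$3; prefix=$4
    threads=${5:-8}   # assumed default, below the 10 suggested above

    if ! magicblast -query "$r1" -query_mate "$r2" -db "$db" \
            -num_threads "$threads" -infmt fastq -out "$prefix.sam"; then
        echo "magicblast failed; not sorting $prefix.sam" >&2
        return 1
    fi

    # Name-sort the complete SAM file and write BAM directly.
    samtools sort -n -O bam -o "$prefix.nsort.bam" "$prefix.sam"
}

# Usage (file names taken from the original command):
# align_then_sort A16_S4_4_S22_S23_R1.fastq.gz A16_S4_4_S22_S23_R2.fastq.gz \
#     USDA1106_reference A16_S4_4_S22_S23
```

Keeping the intermediate SAM file costs disk space, but it makes it clear which step failed.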

It would be helpful if you could answer a few questions:

  1. Which version of magicblast are you using (magicblast -version)?
  2. Was magicblast installed from the NCBI FTP site (https://ftp.ncbi.nlm.nih.gov/blast/executables/magicblast/LATEST/), Bioconda, or another source?
  3. What is the operating system?
  4. Is there a way you could share the reads and reference databases? I would like to try to recreate the problem.

Thanks!

jeremysutherland commented 3 weeks ago

1) magicblast -version
magicblast: 1.6.0
Package: magicblast 1.7.0, build Oct 26 2022 20:38:15

2) Installed via Bioconda

3)
NAME="Red Hat Enterprise Linux"
VERSION="8.9 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.9"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.9 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8::baseos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

4) Shared link with Fastqs and Ref. Database: https://www.dropbox.com/scl/fi/khv8nkn8uzadudi2o99lk/troubleshoot.zip?rlkey=7mjydqicn5ef1w46ganjm8k3v&st=6hcg6dr6&dl=0

boratyng commented 2 weeks ago

@jeremysutherland,

Thank you for the information and the data. I was not able to reproduce your problem. My best guess is that magicblast may have run out of memory during the run. Did you try using fewer threads? Did you see any error message from magicblast?
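One way to test the memory guess is to record peak memory use with GNU time, assuming it is installed at `/usr/bin/time` (it is a separate program from the shell's built-in `time`). The helper name `peak_rss_kb` and the log file name are made up for illustration. The helper only parses the report that `time -v` writes.

```shell
# Hypothetical helper: extract the peak resident set size (in kB) from a
# log written by GNU time's verbose mode (-v).
peak_rss_kb() {
    awk -F': ' '/Maximum resident set size/ { print $2 }' "$1"
}

# Usage, assuming GNU time is available (command taken from the original report,
# with SAM written to a file instead of a pipe):
# /usr/bin/time -v -o magicblast.time.log \
#     magicblast -query A16_S4_4_S22_S23_R1.fastq.gz \
#         -query_mate A16_S4_4_S22_S23_R2.fastq.gz \
#         -db USDA1106_reference -num_threads 24 -infmt fastq \
#         -out A16_S4_4_S22_S23.sam
# peak_rss_kb magicblast.time.log
```

If the reported peak is close to the memory limit of the job, that would support the out-of-memory explanation. On systems where it is readable, the kernel log (for example `dmesg | grep -i 'killed process'`) may also show whether the process was killed for that reason.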

jeremysutherland commented 2 weeks ago

Thanks for following up. I'm providing 64 GB of RAM. I can try reducing the number of threads and see if that solves the issue for me. I'll get back to you if I come up with a solution.