amplab / snap

Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data
https://www.microsoft.com/en-us/research/project/snap/
Apache License 2.0

Snap crashes during sorting step #164

Closed matthdsm closed 1 year ago

matthdsm commented 1 year ago

error

Loading index from directory... 13s.  3,100,314,541 bases, seed size 24.
Aligning.
sorting...read header failed
SNAP exited with exit code 1 from line 1293 of file SNAPLib/SortedDataWriter.cpp

version:

conda snap-aligner=2.0.2 on Linux

command:

snap-aligner paired ./snapaligner 1/FD2200256_DNA080266_1.fastp.fastq.gz 2/FD2200256_DNA080266_2.fastp.fastq.gz -o FD2200256_DNA080266.bam -t 18 -so -b- -sm 20 -I -hc- -S id -sa -R '@RG\tID:220623_A00785_0492_AH5TWVDRX2.2.2\tCN:CMGG\tPU:220623_A00785_0492_AH5TWVDRX2.2.AGCGCCAC-AAGACATT\tPL:ILLUMINA\tLB:CNV_LI_2022_088\tSM:FD2200256_DNA080266'

Info

Sorry, nothing much else to go on 🤷🏻. This error occurred during the alignment of some shallow WGS data. I was able to work around it by downgrading to version 2.0.1.

Please let me know if I can provide more info.

Cheers M

bolosky commented 1 year ago

I haven’t been able to repro this. Did it happen more than once?

Is there anything unusual about either your reads or the index you’re using (like it’s got a zillion contigs or something)? Do you have only a tiny number of reads? The code path is different depending on whether the reads all fit in memory, so maybe that makes a difference (but I’ve tried both and they’re both working for me).
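
To make the "different code path" remark concrete, here is a generic Python sketch of a sort that takes a fast path when everything fits in memory and otherwise spills sorted runs to temporary files and merges them. It is not SNAP's implementation; `sort_records`, `max_in_memory`, and the helper names are hypothetical and chosen only for illustration.

```python
import heapq
import pickle
import tempfile


def _spill(sorted_run):
    """Write one sorted run to a temporary file and rewind it for reading."""
    f = tempfile.TemporaryFile()
    for record in sorted_run:
        pickle.dump(record, f)
    f.seek(0)
    return f


def _read_run(f):
    """Yield records back from a spilled run until the file is exhausted."""
    while True:
        try:
            yield pickle.load(f)
        except EOFError:
            return


def sort_records(records, max_in_memory):
    """Sort records; return (sorted_list, path_taken).

    path_taken is "in-memory" when no run was spilled, else "merged".
    """
    buffer, runs = [], []
    for record in records:
        if len(buffer) == max_in_memory:
            # Buffer is full: sort it, spill it, and start a new run.
            runs.append(_spill(sorted(buffer)))
            buffer = []
        buffer.append(record)
    if not runs:
        # Fast path: the whole input fit in memory.
        return sorted(buffer), "in-memory"
    if buffer:
        runs.append(_spill(sorted(buffer)))
    # Slow path: k-way merge of the sorted runs.
    merged = list(heapq.merge(*(_read_run(f) for f in runs)))
    for f in runs:
        f.close()
    return merged, "merged"
```

A bug that lives only in one of the two branches shows up only for inputs of the matching size, which is why the number of reads matters when trying to reproduce a failure like this one.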

matthdsm commented 1 year ago

Hi, this occurred with multiple low-coverage samples (about 37M reads) and only with v2.0.2. We're using GRCh38 + decoy without alts, so I don't think the number of contigs is an issue?

Is there a verbose option I can try to get more output?

bolosky commented 1 year ago

This is weird, since nothing changed in that code path between 2.0.1 and 2.0.2. Or at least nothing obvious.

I created a new branch called issue164 with some instrumentation. Could you please build and run it and report the output? It should only be a few extra lines, but maybe it'll help me figure it out.

matthdsm commented 1 year ago

Hi Bill,

The branch says it's up to date with master. Did you push the changes?

bolosky commented 1 year ago

Oops. Try it now.

matthdsm commented 1 year ago

Was able to reproduce

Welcome to SNAP version 2.0.2.issue164.0.

Loading index from directory... 99s.  3,100,314,541 bases, seed size 24.
Aligning.
BAMFormat::writeHeader: headerActualSize 11913
sorting...read header failed, left 3909, headerSize 11913
AsyncFileDataReader (0x558509bc71b0) state:
        fileName FD2300072_DNA092919.bam.tmp, fileSize 11913, readOffset 12, endingOffset 11913
ReadBasedDataReader at 0x558509bc71b0 state:
        headerBufferSize 0, headerExtraSize 0, amountAdvancedThroughUnderlyingStoreByUs 0, nHeaderBuffersAllocated 0, hitEOFReadingHeader 0, bufferSize 8004
        nBuffers 2, headerBuffersOutstanding, 0, startedReadingHeader 0, extraBytes 0, overflowBytes 8000, nextBatchID 4, nextBufferForReader -1, nextBufferForConsumer 1, lastBufferForConsumer 0
SNAP exited with exit code 1 from line 1300 of file SNAPLib/SortedDataWriter.cpp

matthdsm commented 1 year ago

omitting -so fixes the issue, but I think that was a bit obvious looking at the error log 😅

bolosky commented 1 year ago

I got it to repro. It would only happen with a header size in a certain range and with an input small enough that it's all in memory when the sort starts.

None of the code in this pathway changed between 2.0.1 and 2.0.2, but I did change the default read length. That value was (incorrectly) being passed in as a parameter when reading the header back from the sort intermediate file to write it to the final BAM, and that caused the problem.
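
For readers following along, the numbers in the failing log are consistent with a header that needed more than one read buffer: headerSize 11913 minus bufferSize 8004 leaves exactly the 3909 bytes reported as "left". In the later, working run the two batches (7862 + 3767 bytes) add up to headerActualSize 11629. The Python sketch below is not SNAP's code. `read_header` and `read_header_single_buffer` are hypothetical names, and the sketch only illustrates, under that reading of the log, the difference between assuming the header fits in one buffer and looping until the whole header has been consumed.

```python
import io


def read_header_single_buffer(stream, header_size, buffer_size):
    """Failing pattern: assume the header fits in a single buffer."""
    data = stream.read(buffer_size)
    left = header_size - len(data)
    if left > 0:
        raise EOFError(
            f"read header failed, left {left}, headerSize {header_size}"
        )
    return data[:header_size]


def read_header(stream, header_size, buffer_size):
    """Read exactly header_size bytes, one buffer-sized batch at a time.

    Returns (header_bytes, batch_sizes).
    """
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    chunks = []
    remaining = header_size
    while remaining > 0:
        chunk = stream.read(min(buffer_size, remaining))
        if not chunk:
            # The underlying file ended before the header was complete.
            raise EOFError(
                f"read header failed, left {remaining}, headerSize {header_size}"
            )
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks), [len(c) for c in chunks]


if __name__ == "__main__":
    header = bytes(11913)  # same size as the header in the failing log
    _, sizes = read_header(io.BytesIO(header), 11913, 8004)
    print(sizes)  # [8004, 3909]
```

With a header no larger than the buffer, both functions succeed, which matches the observation that the failure only appeared for header sizes in a certain range.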

I fixed it (at least it seems to work now).

Let me know if it works for you (still in the issue164 branch, version 2.0.2.issue164.7) and I'll do more testing before putting it in dev.

matthdsm commented 1 year ago

Can confirm this fixed the issue (for me). Thanks a lot!

Welcome to SNAP version 2.0.2.issue164.7.

Loading index from directory... 119s.  3,100,314,541 bases, seed size 24.
Aligning.
BAMFormat::writeHeader: headerActualSize 11629
sorting...SortedDataFilterSupplier::mergeSort(): allocating new header reader
SortedDataFilterSupplier::mergeSort(): got 7862 bytes
SortedDataFilterSupplier::mergeSort(): advancing 7862 bytes
SortedDataFilterSupplier::mergeSort(): calling nextBatch()
SortedDataFilterSupplier::mergeSort(): got 3767 bytes
SortedDataFilterSupplier::mergeSort(): advancing 3767 bytes
sorted 14569942 reads in 18 blocks, 28 s
Total Reads    Aligned, MAPQ >= 10    Aligned, MAPQ < 10     Unaligned              Too Short/Too Many Ns  %Pairs    Reads/s   Time in Aligner (s)
14,569,942     11,759,373 (80.71%)    905,082 (6.21%)        362,059 (2.48%)        1,543,428 (10.59%)     74.94%    423,939   34

bolosky commented 1 year ago

I put a slightly improved version of the fix, with the instrumentation removed, in the dev branch as version 2.0.3.dev.1.

Let's leave this issue open until it makes it into master.

matthdsm commented 1 year ago

Hi! I can't find the branch you mentioned. I did however run a successful test with the latest changes in the issue164 branch.

Welcome to SNAP version 2.0.2.issue164.8.

Loading index from directory... 118s.  3,100,314,541 bases, seed size 24.
Aligning.

sorting...sorted 14569942 reads in 18 blocks, 34 s
Total Reads    Aligned, MAPQ >= 10    Aligned, MAPQ < 10     Unaligned              Too Short/Too Many Ns  %Pairs    Reads/s   Time in Aligner (s)
14,569,942     11,759,373 (80.71%)    905,082 (6.21%)        362,059 (2.48%)        1,543,428 (10.59%)     74.94%    301,880   48  

Thanks for the quick fix! Eagerly awaiting the new release 😄

bolosky commented 1 year ago

It's called "dev." It's the branch for staging changes that will go into master.

Anyway, the only difference between what's in dev and issue164.8 is the version number and one fixed typo, so you've effectively tried the latest version.

matthdsm commented 1 year ago

Ah my bad, I misread and was looking for a 2.0.3.dev.1 branch.

bolosky commented 1 year ago

This is in 2.0.3

matthdsm commented 1 year ago

Thank you!