Illumina / DRAGMAP

DRAGEN open-source mapper

Aligning to chm13v2 #33

Open knightjimr opened 2 years ago

knightjimr commented 2 years ago

I am able to get dragmap running to align reads to hg38, but when I try to align to the new chm13v2 human reference, the build looks like it works correctly, but the alignment output gives the results below, and then all of the reads are output as unmapped. It looks like the index reading is getting short-circuited (compared to the hg38 alignment output). If you need it, attached is the log of the build stdout (there was a stderr message saying "Suppressing decoy"). Both dragen-os commands matched your main page commands.

Should dragmap be able to run using this reference?

build.log

2022-04-22 09:45:55 [2af3bc142f40] Version: 1.2.1
2022-04-22 09:45:55 [2af3bc142f40] argc: 7 argv: dragen-os -r ref/ -1 ../Unaligned/NA12878-S1_AH5WLCDMXX_L002_R1_001.50m.fastq.gz -2 ../Unaligned/NA12878-S1_AH5WLCDMXX_L002_R2_001.50m.fastq.gz
decompHashTableCtxInit... 0.773 seconds
decompHashTableHeader...
Running dual fastq workflow on 20 threads. System supports 20 threads.
0 249 0 0 0 0 10000 1 40000 1 1000 0 0 0 6
0 250 0 0 0 0 10000 1 40000 1 1000 0 0 0 5
0 251 0 0 0 0 10000 1 40000 1 1000 0 0 0 4
0 252 0 0 0 0 10000 1 40000 1 1000 0 0 0 3
0 253 0 0 0 0 10000 1 40000 1 1000 0 0 0 2
0 254 0 0 0 0 10000 1 40000 1 1000 0 0 0 1
0 0 0 0 0 0 10000 1 40000 1 1000 0 0 0 0
Initial paired-end statistics detected for read group all, based on 0 high quality pairs for FR orientation
Quartiles (25 50 75) = 0 0 0
Mean = 0
Standard deviation = 10000
Rescue radius = 0
Effective rescue sigmas = 0
WARNING: Default rescue sigmas value of 2.5 was overridden by host software!
The user may wish to set a rescue sigmas value explicitly with --Aligner.rescue-sigmas
Boundaries for mean and standard deviation: low = 0, high = 0
Boundaries for proper pairs: low = 1, high = 40000
NOTE: DRAGEN's insert estimates include corrections for clipping (so they are not identical to TLEN)
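The two symptoms in this log (a decompHashTable stage with no timing after it, and insert-size statistics based on 0 pairs) can be checked for mechanically. The Python below is a minimal sketch built only from the log text quoted in this thread; the function name check_dragmap_log and the regular expressions are illustrative assumptions, not part of DRAGMAP, and other versions may print different messages.

```python
import re

# Patterns are assumptions derived from the log excerpts in this thread.
STAGE_RE = re.compile(r"(decompHashTable\w+)\.\.\.(?:\s+([\d.]+) seconds)?")
PAIRS_RE = re.compile(r"based on (\d+) high quality pairs")


def check_dragmap_log(text):
    """Return a list of human-readable problems found in a dragen-os log."""
    problems = []
    # A stage name that is not followed by "<number> seconds" suggests
    # that loading the hash table stopped at that stage.
    for name, seconds in STAGE_RE.findall(text):
        if not seconds:
            problems.append(f"{name} has no timing (load may have stopped here)")
    # The working hg38 log in this thread ends the load with this marker.
    if "decompHashTable" in text and "finished decompress" not in text:
        problems.append("'finished decompress' never appears")
    # Zero high-quality pairs means no reads were placed confidently.
    match = PAIRS_RE.search(text)
    if match and int(match.group(1)) == 0:
        problems.append("insert-size statistics are based on 0 pairs")
    return problems
```

Run against the chm13v2 log above, this reports all three problems; a log that lists a timing for every stage, contains "finished decompress", and reports a non-zero pair count yields an empty list. A truncated log excerpt will also be flagged, so the check is only meaningful on complete output.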

kokyriakidis commented 2 years ago

I am not able to map to chm13v2 either.

gconcepcion commented 2 years ago

I also just ran into this issue. The pipeline works fine with hg38, but if I switch to chm13v2 I get no reads mapped. I thought I was going nuts, so it's reassuring to see that I'm in good company.

tienpm7723 commented 2 years ago

@knightjimr Could you share your output for hg38? I have the same problem even when I run against hg38. This is my output:

decompHashTableCtxInit... 1.220 seconds
decompHashTableHeader... 0.002 seconds
decompHashTableLiterals... 3.367 seconds
decompHashTableExtIndex... 0.070 seconds
decompHashTableAutoHits...
Running fastq workflow on 88 threads. System supports 88 threads.
0 249 0 0 0 0 10000 1 40000 1 1000 0 0 0 6
0 250 0 0 0 0 10000 1 40000 1 1000 0 0 0 5
0 251 0 0 0 0 10000 1 40000 1 1000 0 0 0 4
0 252 0 0 0 0 10000 1 40000 1 1000 0 0 0 3
0 253 0 0 0 0 10000 1 40000 1 1000 0 0 0 2
0 254 0 0 0 0 10000 1 40000 1 1000 0 0 0 1

knightjimr commented 2 years ago

For hg38, I get the following initial log messages (I was using a file that only contained 50 million reads, hence the numbers). After your lines, there should be the "Initial paired-end statistic" lines that come up quickly, then a wait for the summary stats.

decompHashTableCtxInit... 0.777 seconds
decompHashTableHeader... 0.002 seconds
decompHashTableLiterals... 1.782 seconds
decompHashTableExtIndex... 0.033 seconds
decompHashTableAutoHits... 55.169 seconds
decompHashTableSetFlags... 6.004 seconds
finished decompress
Running dual fastq workflow on 10 threads. System supports 20 threads.
0 249 0 0 0 0 10000 1 40000 1 1000 0 0 0 6
0 250 0 0 0 0 10000 1 40000 1 1000 0 0 0 5
0 251 0 0 0 0 10000 1 40000 1 1000 0 0 0 4
0 252 0 0 0 0 10000 1 40000 1 1000 0 0 0 3
0 253 0 0 0 0 10000 1 40000 1 1000 0 0 0 2
0 254 0 0 0 0 10000 1 40000 1 1000 0 0 0 1
0 0 217 254 296 257.157 59.9087 1 533 107 407 84497 84930 0 0
Initial paired-end statistics detected for read group all, based on 84497 high quality pairs for FR orientation
Quartiles (25 50 75) = 217 254 296
Mean = 257.157
Standard deviation = 59.9087
Rescue radius = 149.772
Effective rescue sigmas = 2.5
Boundaries for mean and standard deviation: low = 59, high = 454
Boundaries for proper pairs: low = 1, high = 533
NOTE: DRAGEN's insert estimates include corrections for clipping (so they are not identical to TLEN)
MAPPING/ALIGNING SUMMARY,,Total input reads,50000000,100.00
MAPPING/ALIGNING SUMMARY,,Number of duplicate marked reads,0,0.00
MAPPING/ALIGNING SUMMARY,,Number of duplicate marked and mate reads removed,0,0.00
MAPPING/ALIGNING SUMMARY,,Number of unique reads (excl. duplicate marked reads),50000000,100.00
MAPPING/ALIGNING SUMMARY,,Reads with mate sequenced,50000000,100.00
MAPPING/ALIGNING SUMMARY,,Reads without mate sequenced,0,0.00
MAPPING/ALIGNING SUMMARY,,QC-failed reads,0,0.00
MAPPING/ALIGNING SUMMARY,,Mapped reads,49952500,99.90
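For comparing runs across references, the MAPPING/ALIGNING SUMMARY rows can be read into a dictionary and the mapped fraction computed directly. This is a rough sketch that assumes one comma-separated record per line with exactly the five fields seen in this log (section, an empty read-group field, metric name, count, percent); the helper names parse_mapping_summary and mapped_fraction are hypothetical and not part of DRAGMAP.

```python
def parse_mapping_summary(lines):
    """Map metric name -> (count, percent) for MAPPING/ALIGNING SUMMARY rows."""
    metrics = {}
    for line in lines:
        parts = line.strip().split(",")
        # Only accept the five-field shape seen in the log in this thread.
        if len(parts) != 5 or parts[0] != "MAPPING/ALIGNING SUMMARY":
            continue
        _, _read_group, name, count, percent = parts
        try:
            metrics[name] = (int(count), float(percent))
        except ValueError:
            continue  # skip rows whose values are not plain numbers
    return metrics


def mapped_fraction(metrics):
    """Fraction of input reads that were mapped, or None if rows are missing."""
    total = metrics.get("Total input reads")
    mapped = metrics.get("Mapped reads")
    if total is None or mapped is None or total[0] == 0:
        return None
    return mapped[0] / total[0]
```

For the hg38 run above, 49952500 mapped reads out of 50000000 gives a fraction of 0.99905, matching the reported 99.90%. A run in which every read comes out unmapped, as described for chm13v2, would give 0.0.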

obigbando commented 2 years ago

We ran into the same situation on release 1.3.1. What we did:

The same error occurs for chm13v2.0_noY, but not for GRCh38, which works fine.

mazin128 commented 2 years ago

I am facing the same issue when aligning to chm13v2.

Arlaz commented 2 years ago

Same issue here, can't map to chm13v2

TomofumiSaka commented 1 year ago

I found a fix for this bug. Please refer to this pull request.

https://github.com/Illumina/DRAGMAP/pull/55