Daniel-Liu-c0deb0t / UMICollapse

Accelerating the deduplication and collapsing process for reads with Unique Molecular Identifiers (UMI). Heavily optimized for scalability and orders of magnitude faster than a previous tool.
MIT License

umi collapse fails with fastq input #1

Closed gfialk closed 3 years ago

gfialk commented 4 years ago

umicollapse is failing when using FASTQ input. The command I'm running is:

umicollapse fastq -i in.fastq.gz -o out.fastq.gz

and I'm getting the following error:

Exception in thread "main" java.lang.ClassCastException: class umicollapse.util.FASTQRead cannot be cast to class umicollapse.util.SAMRead (umicollapse.util.FASTQRead and umicollapse.util.SAMRead are in unnamed module of loader 'app')
at umicollapse.merge.MapQualMerge.merge(MapQualMerge.java:9)
at umicollapse.main.DeduplicateFASTQ.deduplicateAndMerge(DeduplicateFASTQ.java:53)
at umicollapse.main.Main.main(Main.java:166)

Daniel-Liu-c0deb0t commented 4 years ago

That's because FASTQ reads do not have a mapping quality, so the default mapping quality merging scheme cannot be used. Try adding --merge avgqual to the command-line arguments.
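To illustrate why the cast in the stack trace fails, here is a minimal sketch, not the tool's actual code. The classes `Read`, `FastqRead`, `SamRead` and the `Merge` interface are hypothetical stand-ins, and the assumption that an average-quality merge keeps the read with the higher mean base quality is mine. A merge that needs a mapping quality has to cast to the SAM read type, which throws for FASTQ reads, while a merge based on average quality works on any read.

```java
public class MergeSketch {
    // Hypothetical base type: every read carries per-base quality scores.
    abstract static class Read {
        final byte[] quals;

        Read(byte[] quals) { this.quals = quals; }

        // Mean of the per-base quality scores (0.0 for an empty read).
        double avgQual() {
            if (quals.length == 0) return 0.0;
            long sum = 0;
            for (byte q : quals) sum += q;
            return (double) sum / quals.length;
        }
    }

    // A FASTQ read has no mapping quality.
    static class FastqRead extends Read {
        FastqRead(byte[] quals) { super(quals); }
    }

    // Only aligned (SAM/BAM) reads have a mapping quality.
    static class SamRead extends Read {
        final int mapQ;

        SamRead(byte[] quals, int mapQ) { super(quals); this.mapQ = mapQ; }
    }

    interface Merge { Read merge(Read a, Read b); }

    // Needs mapQ, so it casts; this throws ClassCastException for FastqRead.
    static final Merge MAP_QUAL =
        (a, b) -> ((SamRead) a).mapQ >= ((SamRead) b).mapQ ? a : b;

    // Uses only base qualities, so it works for both read types.
    static final Merge AVG_QUAL =
        (a, b) -> a.avgQual() >= b.avgQual() ? a : b;

    public static void main(String[] args) {
        Read a = new FastqRead(new byte[] {30, 30, 30});
        Read b = new FastqRead(new byte[] {40, 40, 40});
        System.out.println("avgqual keeps b: " + (AVG_QUAL.merge(a, b) == b));
        try {
            MAP_QUAL.merge(a, b);
        } catch (ClassCastException e) {
            System.out.println("mapping-quality merge needs SAM/BAM reads");
        }
    }
}
```

This is why passing --merge avgqual avoids the exception for FASTQ input.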

Daniel-Liu-c0deb0t commented 4 years ago

A better warning (or a different default setting) probably should be provided, so this needs to be worked on.

gfialk commented 4 years ago

Hi Daniel.

Thanks so much for your quick and helpful response. I was wondering if there is any way of getting the deduplication statistics when I collapse a FASTQ file with UMICollapse (something similar to the --output-stats option of umi-tools)?

Best, Gavriel

On Tue, 24 Mar 2020 at 2:49, Daniel Liu <notifications@github.com> wrote:

A better warning (or a different default setting) probably should be provided, so this is an open issue.


Daniel-Liu-c0deb0t commented 4 years ago

Currently, there is no way to do so. It should not be too hard to implement as an optional flag, as in UMI-tools. However, I chose not to implement it because collecting extra statistics would sacrifice some efficiency.
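In the meantime, a rough statistic can be computed outside the tool by counting records in the input and output files. The sketch below is only a workaround under stated assumptions: the class `FastqCount` and its method are hypothetical, the files are gzip-compressed, and every FASTQ record spans exactly four lines. It reports read counts and the fraction removed, not per-UMI statistics.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;

public class FastqCount {
    // Counts records in a gzipped FASTQ file, assuming four lines per record.
    static long countRecords(Path gz) throws IOException {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(gz)),
                StandardCharsets.US_ASCII))) {
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines / 4;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: FastqCount <in.fastq.gz> <out.fastq.gz>");
            return;
        }
        long before = countRecords(Paths.get(args[0]));
        long after = countRecords(Paths.get(args[1]));
        // Percentage of reads removed by deduplication.
        double removed = before == 0 ? 0.0 : 100.0 * (before - after) / before;
        System.out.printf("input=%d output=%d removed=%.2f%%%n", before, after, removed);
    }
}
```

It would be run on the same files passed to -i and -o, for example in.fastq.gz and out.fastq.gz from the command above.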

Daniel-Liu-c0deb0t commented 3 years ago

When deduplicating FASTQ files, merging using average quality scores is now the default, so this issue is resolved.

gfialk commented 3 years ago

Great! Thanks for the update

On 17 Oct 2020, at 9:00, Daniel Liu notifications@github.com wrote:

When deduplicating FASTQ files, merging using average quality scores is now the default, so this issue is resolved.
