BioInfoTools / BBMap

(Not Official) BBMap short read aligner, and other bioinformatic tools.

clumpify hanging #15

Open nick-youngblut opened 6 years ago

nick-youngblut commented 6 years ago

It appears that running clumpify in an SGE job without enough memory causes an "Exception in thread" error, but clumpify doesn't die. The process just hangs, waiting indefinitely for all threads to finish. Here's the full log from one run:

java -ea -Xmx60g -Xms60g -cp /ebio/abt3_projects/software/dev/llmgqc/.snakemake/conda/72fe9c49/opt/bbmap-37.78/current/ clump.Clumpify -Xmx60g in=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/R1.fq.gz in2=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/R2.fq.gz out=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/R1_dedup.fq.gz out2=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/R2_dedup.fq.gz overwrite=t usetmpdir=t tmpdir=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/ dedupe=t dupedist=2500 optical=t
Executing clump.Clumpify [-Xmx60g, in=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/R1.fq.gz, in2=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/R2.fq.gz, out=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/R1_dedup.fq.gz, out2=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/R2_dedup.fq.gz, overwrite=t, usetmpdir=t, tmpdir=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/, dedupe=t, dupedist=2500, optical=t]
Version 37.78 [-Xmx60g, in=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/R1.fq.gz, in2=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/R2.fq.gz, out=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/R1_dedup.fq.gz, out2=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/R2_dedup.fq.gz, overwrite=t, usetmpdir=t, tmpdir=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/, dedupe=t, dupedist=2500, optical=t]

Read Estimate:          10080760
Memory Estimate:        7691 MB
Memory Available:       48242 MB
Set groups to 1
Executing clump.KmerSort [in1=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/R1.fq.gz, in2=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/R2.fq.gz, out1=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/R1_dedup.fq.gz, out2=/tmp/global2/nyoungblut/LLMGQC_27982106400/NO08/0/1/R2_dedup.fq.gz, groups=1, ecco=false, rename=false, shortname=f, unpair=false, repair=false, namesort=false, ow=true, dedupe=t]

Making comparator.
Made a comparator with k=31, seed=1, border=1, hashes=4
Starting cris 0.
Fetching reads.
Making fetch threads.
Starting threads.
Waiting for threads.
Exception in thread "Thread-9" Exception in thread "Thread-15" Exception in thread "Thread-12" Exception in thread "Thread-11" java.lang.AssertionError: SRR1761740.1 1 length=100
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash_inner(KmerComparator.java:79)
    at clump.KmerComparator.hash(KmerComparator.java:70)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:815)
Exception in thread "Thread-10" Exception in thread "Thread-13" java.lang.AssertionError: SRR1761740.1201 1201 length=100
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash_inner(KmerComparator.java:79)
    at clump.KmerComparator.hash(KmerComparator.java:70)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:815)
Exception in thread "Thread-16" java.lang.AssertionError: SRR1761740.601 601 length=31
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash_inner(KmerComparator.java:79)
    at clump.KmerComparator.hash(KmerComparator.java:70)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:815)
java.lang.AssertionError: SRR1761740.1001 1001 length=29
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash_inner(KmerComparator.java:79)
    at clump.KmerComparator.hash(KmerComparator.java:70)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:815)
java.lang.AssertionError: SRR1761740.801 801 length=31
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash_inner(KmerComparator.java:79)
    at clump.KmerComparator.hash(KmerComparator.java:70)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:815)
Exception in thread "Thread-14" java.lang.AssertionError: SRR1761740.401 401 length=39
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash_inner(KmerComparator.java:79)
    at clump.KmerComparator.hash(KmerComparator.java:70)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:815)
java.lang.AssertionError: SRR1761740.1401 1401 length=100
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash_inner(KmerComparator.java:79)
    at clump.KmerComparator.hash(KmerComparator.java:70)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:815)
java.lang.AssertionError: SRR1761740.201 201 length=71
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash_inner(KmerComparator.java:79)
    at clump.KmerComparator.hash(KmerComparator.java:70)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:815)
Fetch time:     0.080 seconds.
Closing input stream.
Combining thread output.
Combine time:   0.000 seconds.
Exception in thread "main" java.lang.AssertionError: 0, 3200, true
    at clump.KmerSort.fetchReads(KmerSort.java:720)
    at clump.KmerSort.processInner(KmerSort.java:398)
    at clump.KmerSort.process(KmerSort.java:310)
    at clump.KmerSort.main(KmerSort.java:51)
    at clump.Clumpify.process(Clumpify.java:243)
    at clump.Clumpify.main(Clumpify.java:37)
Jtrachsel commented 6 years ago

I seem to be having a similar issue running clumpify.sh. This is happening on a compute node with 125 GB of RAM and 40 processors, using the SLURM workload manager. This is the logfile:

java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
java -ea -Xmx53714m -Xms53714m -cp /software/7/apps/bbtools/37.02/current/ clump.Clumpify in=SRR4434844_temp.fq.gz out=SRR4434844_clumped.fq.gz dedupe optical
Executing clump.Clumpify [in=SRR4434844_temp.fq.gz, out=SRR4434844_clumped.fq.gz, dedupe, optical]

Clumpify version 37.66
Read Estimate:          7231912
Memory Estimate:        5517 MB
Memory Available:       42171 MB
Set groups to 1
Executing clump.KmerSort [in1=SRR4434844_temp.fq.gz, in2=null, out1=SRR4434844_clumped.fq.gz, out2=null, groups=1, ecco=false, rename=false, shortname=f, unpair=false, repair=false, namesort=false, ow=true, dedupe]

Making comparator.
Made a comparator with k=31, seed=1, border=1, hashes=4
Starting cris 0.
Fetching reads.
Making fetch threads.
Starting threads.
Waiting for threads.
Exception in thread "Thread-23" Exception in thread "Thread-16" Exception in thread "Thread-19" Exception in thread "Thread-24" Exception in thread "Thread-33" Exception in thread "Thread-40" Exception in thread "Thread-11" Exception in thread "Thread-5" Exception in thread "Thread-44" Exception in thread "Thread-21" Exception in thread "Thread-7" Exception in thread "Thread-30" java.lang.AssertionError: SRR4434844.3401 3401 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.3001 3001 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.2701 2701 length=248
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.2301 2301 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.401 401 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.3901 3901 length=250
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.2501 2501 length=250
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.1301 1301 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-10" java.lang.AssertionError: SRR4434844.3701 3701 length=250
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-8" java.lang.AssertionError: SRR4434844.3501 3501 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-9" java.lang.AssertionError: SRR4434844.3801 3801 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-17" Exception in thread "Thread-27" java.lang.AssertionError: SRR4434844.2801 2801 length=250
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-39" Exception in thread "Thread-43" java.lang.AssertionError: SRR4434844.201 201 length=249
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-42" Exception in thread "Thread-20" Exception in thread "Thread-6" Exception in thread "Thread-41" Exception in thread "Thread-38" Exception in thread "Thread-35" Exception in thread "Thread-31" java.lang.AssertionError: SRR4434844.1201 1201 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-22" java.lang.AssertionError: SRR4434844.2201 2201 length=250
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-14" Exception in thread "Thread-28" Exception in thread "Thread-36" Exception in thread "Thread-32" Exception in thread "Thread-37" java.lang.AssertionError: SRR4434844.901 901 length=249
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-26" java.lang.AssertionError: SRR4434844.1701 1701 length=250
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-12" java.lang.AssertionError: SRR4434844.3301 3301 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-15" java.lang.AssertionError: SRR4434844.3101 3101 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-18" java.lang.AssertionError: SRR4434844.2601 2601 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-29" java.lang.AssertionError: SRR4434844.1801 1801 length=249
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-25" Exception in thread "Thread-34" java.lang.AssertionError: SRR4434844.1101 1101 length=149
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-13" java.lang.AssertionError: SRR4434844.2101 2101 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.3601 3601 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.1601 1601 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.101 101 length=102
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.3201 3201 length=185
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.2001 2001 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.1401 1401 length=250
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.1001 1001 length=250
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.1501 1501 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.2901 2901 length=164
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.801 801 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.701 701 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.601 601 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.1 1 length=196
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.2401 2401 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.301 301 length=251
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.501 501 length=46
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.1901 1901 length=250
    at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
    at clump.ReadKey.<init>(ReadKey.java:46)
    at clump.ReadKey.<init>(ReadKey.java:33)
    at clump.ReadKey.makeKey(ReadKey.java:23)
    at clump.KmerComparator.hash(KmerComparator.java:73)
    at clump.KmerComparator.hash(KmerComparator.java:66)
    at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Fetch time:     0.216 seconds.
Closing input stream.
Combining thread output.
Combine time:   0.000 seconds.
Exception in thread "main" java.lang.AssertionError: 0, 8000, false
    at clump.KmerSort.fetchReads(KmerSort.java:718)
    at clump.KmerSort.processInner(KmerSort.java:400)
    at clump.KmerSort.process(KmerSort.java:320)
    at clump.KmerSort.main(KmerSort.java:51)
    at clump.Clumpify.process(Clumpify.java:247)
    at clump.Clumpify.main(Clumpify.java:37)
slhogle commented 5 years ago

I spent the last hour or so dealing with nearly the same cryptic error message, and I think it has to do with how the FASTQ headers are parsed for the removal of optical duplicates.

From the manual: "optical=f  If true, mark or remove optical duplicates only. This means they are Illumina reads within a certain distance on the flowcell. Normal Illumina names needed. Also for tile-edge and well duplicates."

I was trying to remove duplicates in an NCBI SRA download that didn't have the original Illumina header names/formatting. Once I tried this on raw reads with Illumina formatting that I'd gotten directly from the sequencing center, the error message went away.

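I didn't really do any other testing, so I could be wrong here, but the explanation makes sense: Illumina headers contain the tile and x/y coordinates that optical deduplication relies on, and the SRA-style names in the logs above (e.g. "SRR1761740.1 1 length=100") don't. Hopefully this helps someone else.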
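
If it helps, here is a rough Python sketch of a pre-flight check for this. It is not how BBTools parses names (I haven't read FlowcellCoordinate.java); the two regexes are just my own heuristic for the common Illumina naming schemes, and the function name flowcell_coords and the Illumina-style example names below are made up for illustration. It only looks at the start of the header, so it will also say "no coordinates" for SRA names that carry the original Illumina name after the accession.

```python
import re

# Heuristic patterns (my assumption, not taken from BBTools):
# newer style: instrument:run:flowcell:lane:tile:x:y
_NEW_STYLE = re.compile(r"^@?[^\s:]+:\d+:[^\s:]+:(\d+):(\d+):(\d+):(\d+)(\s|$)")
# older style: instrument:lane:tile:x:y, optionally followed by #index or /read
_OLD_STYLE = re.compile(r"^@?[^\s:]+:(\d+):(\d+):(\d+):(\d+)([#/\s]|$)")


def flowcell_coords(header):
    """Return (lane, tile, x, y) as ints, or None if no coordinates are found."""
    text = header.strip()
    for pattern in (_NEW_STYLE, _OLD_STYLE):
        match = pattern.match(text)
        if match:
            # The first four groups are lane, tile, x, y in both patterns.
            return tuple(int(g) for g in match.groups()[:4])
    return None


if __name__ == "__main__":
    examples = [
        "@SRR1761740.1 1 length=100",                           # name from the log above
        "@M00123:45:000000000-A1B2C:1:1101:15589:1332 1:N:0:1",  # made-up newer-style name
        "@HWUSI-EAS100R:6:73:941:1973#0/1",                      # made-up older-style name
    ]
    for name in examples:
        print(name, "->", flowcell_coords(name))
```

Running this on the first few headers of a FASTQ file before adding optical to the clumpify command should tell you whether the names carry any coordinates at all. If it returns None for all of them, the assertion in the logs above is probably what you'd hit.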

millerh1 commented 4 years ago

I had the same error. It would be nice if there were a way to use clumpify on SRA reads for optical dedup.