Open nick-youngblut opened 6 years ago
I seem to be having a similar issue running clumpify.sh. This is happening on a compute node with 125 GB of RAM and 40 processors, managed by the SLURM workload manager. This is the logfile:
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
java -ea -Xmx53714m -Xms53714m -cp /software/7/apps/bbtools/37.02/current/ clump.Clumpify in=SRR4434844_temp.fq.gz out=SRR4434844_clumped.fq.gz dedupe optical
Executing clump.Clumpify [in=SRR4434844_temp.fq.gz, out=SRR4434844_clumped.fq.gz, dedupe, optical]
Clumpify version 37.66
Read Estimate: 7231912
Memory Estimate: 5517 MB
Memory Available: 42171 MB
Set groups to 1
Executing clump.KmerSort [in1=SRR4434844_temp.fq.gz, in2=null, out1=SRR4434844_clumped.fq.gz, out2=null, groups=1, ecco=false, rename=false, shortname=f, unpair=false, repair=false, namesort=false, ow=true, dedupe]
Making comparator.
Made a comparator with k=31, seed=1, border=1, hashes=4
Starting cris 0.
Fetching reads.
Making fetch threads.
Starting threads.
Waiting for threads.
Exception in thread "Thread-23" Exception in thread "Thread-16" Exception in thread "Thread-19" Exception in thread "Thread-24" Exception in thread "Thread-33" Exception in thread "Thread-40" Exception in thread "Thread-11" Exception in thread "Thread-5" Exception in thread "Thread-44" Exception in thread "Thread-21" Exception in thread "Thread-7" Exception in thread "Thread-30" java.lang.AssertionError: SRR4434844.3401 3401 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.3001 3001 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.2701 2701 length=248
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.2301 2301 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.401 401 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.3901 3901 length=250
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.2501 2501 length=250
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.1301 1301 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-10" java.lang.AssertionError: SRR4434844.3701 3701 length=250
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-8" java.lang.AssertionError: SRR4434844.3501 3501 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-9" java.lang.AssertionError: SRR4434844.3801 3801 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-17" Exception in thread "Thread-27" java.lang.AssertionError: SRR4434844.2801 2801 length=250
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-39" Exception in thread "Thread-43" java.lang.AssertionError: SRR4434844.201 201 length=249
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-42" Exception in thread "Thread-20" Exception in thread "Thread-6" Exception in thread "Thread-41" Exception in thread "Thread-38" Exception in thread "Thread-35" Exception in thread "Thread-31" java.lang.AssertionError: SRR4434844.1201 1201 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-22" java.lang.AssertionError: SRR4434844.2201 2201 length=250
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-14" Exception in thread "Thread-28" Exception in thread "Thread-36" Exception in thread "Thread-32" Exception in thread "Thread-37" java.lang.AssertionError: SRR4434844.901 901 length=249
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-26" java.lang.AssertionError: SRR4434844.1701 1701 length=250
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-12" java.lang.AssertionError: SRR4434844.3301 3301 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-15" java.lang.AssertionError: SRR4434844.3101 3101 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-18" java.lang.AssertionError: SRR4434844.2601 2601 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-29" java.lang.AssertionError: SRR4434844.1801 1801 length=249
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-25" Exception in thread "Thread-34" java.lang.AssertionError: SRR4434844.1101 1101 length=149
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Exception in thread "Thread-13" java.lang.AssertionError: SRR4434844.2101 2101 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.3601 3601 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.1601 1601 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.101 101 length=102
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.3201 3201 length=185
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.2001 2001 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.1401 1401 length=250
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.1001 1001 length=250
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.1501 1501 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.2901 2901 length=164
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.801 801 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.701 701 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.601 601 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.1 1 length=196
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.2401 2401 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.301 301 length=251
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.501 501 length=46
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
java.lang.AssertionError: SRR4434844.1901 1901 length=250
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:51)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash(KmerComparator.java:73)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread.run(KmerSort.java:816)
Fetch time: 0.216 seconds.
Closing input stream.
Combining thread output.
Combine time: 0.000 seconds.
Exception in thread "main" java.lang.AssertionError: 0, 8000, false
at clump.KmerSort.fetchReads(KmerSort.java:718)
at clump.KmerSort.processInner(KmerSort.java:400)
at clump.KmerSort.process(KmerSort.java:320)
at clump.KmerSort.main(KmerSort.java:51)
at clump.Clumpify.process(Clumpify.java:247)
at clump.Clumpify.main(Clumpify.java:37)
I spent about the last hour dealing with nearly the same cryptic error message, and I think it has to do with how FASTQ headers are parsed for the removal of optical duplicates.
From the manual:
optical=f If true, mark or remove optical duplicates only. This means they are Illumina reads within a certain distance on the flowcell. Normal Illumina names needed. Also for tile-edge and well duplicates.
I was trying to remove duplicates from an NCBI SRA download that didn't have the original Illumina header names/formatting. Once I tried this on raw reads with Illumina formatting that I'd gotten directly from the sequencing center, the error message went away.
I didn't really do any other testing, so I could be wrong here, but my explanation makes sense since Illumina headers contain tile coordinates. Hopefully this helps someone else.
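If it helps anyone check their files before running with the optical flag, here is a rough sketch that tests whether a read name carries flowcell coordinates. The two regular expressions and the function name `flowcell_coords` are my own assumptions about what "normal Illumina names" look like (the Casava 1.8+ style and the older `instrument:lane:tile:x:y` style); this is not the parser BBTools uses internally.

```python
import re

# Heuristic patterns only (assumed layouts, not taken from the BBTools source).
# Casava 1.8+ style: instrument:run:flowcell:lane:tile:x:y
CASAVA_18 = re.compile(r"^@?[^:\s]+:\d+:[^:\s]+:(\d+):(\d+):(\d+):(\d+)(\s|$)")
# Older style: instrument:lane:tile:x:y, optionally followed by #index/read
LEGACY = re.compile(r"^@?[^:\s]+:(\d+):(\d+):(\d+):(\d+)([#/\s]|$)")


def flowcell_coords(header):
    """Return (lane, tile, x, y) if the header looks like an Illumina read name, else None."""
    for pattern in (CASAVA_18, LEGACY):
        match = pattern.match(header)
        if match:
            # The first four groups are lane, tile, x, y in both patterns.
            return tuple(int(value) for value in match.groups()[:4])
    return None


if __name__ == "__main__":
    # A read name copied from the log above: no coordinates to parse.
    print(flowcell_coords("SRR4434844.3401 3401 length=251"))
    # A made-up Casava-style name, for illustration only.
    print(flowcell_coords("@M01234:12:000000000-ABCDE:1:1101:15589:1332 1:N:0:1"))
```

Running this over the first few header lines of a FASTQ file should show quickly whether the names have anything the optical-duplicate step could use. SRA-style names like the ones in the log return None.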
I had the same error -- it would be nice if there was a way to use clumpify on SRA reads for optical dedup
It appears that running clumpify in an SGE job without enough memory causes an "Exception in Thread" error, but clumpify doesn't die. The process just hangs, waiting indefinitely for all threads. Here's the full log from one run: