Closed jenzopr closed 6 years ago
Can you check the index values of your cell and UMI barcodes?
Your assumption is right, there is a problem with read1. I'm not sure what though.
how long is your read1? Are they all the same length? Did you run a trimming algo on read1?
Maybe check out your fastqc on read1.
The code is in the public folder in dropseqtools
private int scoreBaseQuality(final SAMRecord barcodedRead, final
List
if (q < this.BASE_QUALITY || s=='N')
numBasesBelowQuality++;
}
From what I understand, qual[i] is out of range, meaning maybe the quality line is missing some values.
Could you send a few lines of your read1?
2018-02-27 11:23 GMT+01:00 Jens Preußner notifications@github.com:
Hi Patrick,
we finally have our first dropSeq data sequenced and I tried to run in through the dropSeqPipe. Unfortunately, TagBamWithReadSequenceExtended terminates after a short while giving:
(dropSeqPipe) jens@KI-V0205:/data/dropTest/dropSeqPipe$ drop-seq-tools-wrapper.sh -m 60g -t ./tmp/ -p TagBamWithReadSequenceExtended SUMMARY=logs/test_CELL_barcode.txt BASE_RANGE=1-12 BASE_QUALITY=30 BARCODED_READ=1 DISCARD_READ=false TAG_NAME=XC NUM_BASES_BELOW_QUALITY=1 INPUT=data/test_unaligned.bam OUTPUT=data/test_BC_tagged_unmapped.bam
- java -Xmx60g -Djava.io.tmpdir=./tmp/ -jar /mnt/software/x86_64/packages/dropSeqTools/1.13/jar/dropseq.jar TagBamWithReadSequenceExtended SUMMARY=logs/test_CELL_barcode.txt BASE_RANGE=1-12 BASE_QUALITY=30 BARCODED_READ=1 DISCARD_READ=false TAG_NAME=XC NUM_BASES_BELOW_QUALITY=1 INPUT=data/test_unaligned.bam OUTPUT=data/test_BC_tagged_unmapped.bam [Tue Feb 27 11:17:01 CET 2018] org.broadinstitute.dropseqrna.utils.TagBamWithReadSequenceExtended INPUT=data/test_unaligned.bam OUTPUT=data/test_BC_tagged_unmapped.bam SUMMARY=logs/test_CELL_barcode.txt BASE_RANGE=1-12 BARCODED_READ=1 DISCARD_READ=false BASE_QUALITY=30 NUM_BASES_BELOW_QUALITY=1 TAG_NAME=XC TAG_BARCODED_READ=false HARD_CLIP_BASES=false VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json [Tue Feb 27 11:17:01 CET 2018] Executing as jens@KI-V0205 on Linux 3.16.0-5-amd64 amd64; OpenJDK 64-Bit Server VM 1.8.0_121-b15; Picard version: 1.13(7bed8f4_1513008033) [Tue Feb 27 11:17:05 CET 2018] org.broadinstitute.dropseqrna.utils.TagBamWithReadSequenceExtended done. Elapsed time: 0.07 minutes. Runtime.totalMemory()=1015021568 Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 10 at org.broadinstitute.dropseqrna.utils.TagBamWithReadSequenceExtended.scoreBaseQuality(TagBamWithReadSequenceExtended.java:251) at org.broadinstitute.dropseqrna.utils.TagBamWithReadSequenceExtended.processReadPair(TagBamWithReadSequenceExtended.java:220) at org.broadinstitute.dropseqrna.utils.TagBamWithReadSequenceExtended.doWork(TagBamWithReadSequenceExtended.java:154) at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:205) at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:94) at org.broadinstitute.dropseqrna.cmdline.DropSeqMain.main(DropSeqMain.java:42)
I assume that my fastq/bam is at some place malformed (e.g. an R1 read is too short), although I can't imagine. Is there any debug flag I can place to get more hints about what fails? Or can you quickly dig into TagBamWithReadSequenceExtended.java at line 251 and tell me whats happening there?
Best and thanks! Jens
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/Hoohm/dropSeqPipe/issues/22, or mute the thread https://github.com/notifications/unsubscribe-auth/ABNXaOliRCTsjxPThdXg6mgAqRM0h_HAks5tY9eJgaJpZM4SUr4s .
--
Roelli Patrick Division of Animal Physiology and Immunology TUM School of Life Sciences Weihenstephan Technische Universität München Weihenstephaner Berg 3 85354 Freising GermanyBCF, Swiss Institute of BioinformaticsBâtiment Génopode, Quartier SorgeUniversity of Lausanne1015 LausanneSwitzerland
We identified bcl2fastq
to be causing problems when giving a Nextera adapter sequence to trim away. It will also trim R1 reads, so our R1 read length distribution is highly variable:
1 10
11 13
69 14
9 15
24 16
48 17
111 18
62328 19
187397 20
1 8
Filtering reads shorter than 20 bp before putting it into dropSeqPipe helped. Thanks for your help!
Sure no problem. Quick question, once you filtered out the reads from R1 with less than 20bp, how do you reconcile it with R2?
Hello,I met a block in my analysis of dropSeq data when I used TagBamWithReadSequenceExtended, and my tools is Drop-seq_tools-2.2.0,and my command is "tuyx@ubuntu:~/Drop_Seq$ /home/tuyx/Drop_Seq/Drop-seq_tools-2.2.0/TagBamWithReadSequenceExtended INPUT=/home/tuyx/Drop_Seq/3-somite_Rep_1.ubam OUTPUT=3-somite_Rep_1_unaligned_tagged_Cell.bam SUMMARY=3-somite_Rep_1_unaligned_tagged_Cell.bam_summary.txt BASE_RANGE=1-12 BASE_QUALITY=10 BARCODED_READ=1 DISCARD_READ=False TAG_NAME=XC NUM_BASES_BELOW_QUALITY=1" and my error is" 15:12:52.051 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/tuyx/Drop_Seq/Drop-seq_tools-2.2.0/jar/lib/picard-2.18.14.jar!/com/intel/gkl/native/libgkl_compression.so 15:12:52.084 WARN NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (No such file or directory) 15:12:52.085 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/tuyx/Drop_Seq/Drop-seq_tools-2.2.0/jar/lib/picard-2.18.14.jar!/com/intel/gkl/native/libgkl_compression.so 15:12:52.086 WARN NativeLibraryLoader - Unable to load libgkl_compression.so from native/libgkl_compression.so (No such file or directory) [Tue Jul 02 15:12:52 CST 2019] TagBamWithReadSequenceExtended INPUT=/home/tuyx/Drop_Seq/3-somite_Rep_1.ubam OUTPUT=3-somite_Rep_1_unaligned_tagged_Cell.bam SUMMARY=3-somite_Rep_1_unaligned_tagged_Cell.bam_summary.txt BASE_RANGE=1-12 BARCODED_READ=1 DISCARD_READ=false BASE_QUALITY=10 NUM_BASES_BELOW_QUALITY=1 TAG_NAME=XC TAG_BARCODED_READ=false HARD_CLIP_BASES=false TAG_QUALITY=XQ VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false [Tue Jul 02 15:12:52 CST 2019] Executing as tuyx@ubuntu on Linux 4.4.0-148-generic amd64; OpenJDK 64-Bit Server VM 1.8.0_181-8u181-b13-1ubuntu0.16.04.1-b13; Deflater: Jdk; Inflater: Jdk; Provider GCS is not available; Picard version: 2.2.0(4709ee9_1553016809) [Tue Jul 02 15:12:52 CST 2019] org.broadinstitute.dropseqrna.utils.TagBamWithReadSequenceExtended done. Elapsed time: 0.00 minutes. Runtime.totalMemory()=1011351552 Exception in thread "main" htsjdk.samtools.SAMException: Cannot read non-existent file: file:///home/tuyx/Drop_Seq/3-somite_Rep_1.ubam at htsjdk.samtools.util.IOUtil.assertFileIsReadable(IOUtil.java:430) at htsjdk.samtools.util.IOUtil.assertFileIsReadable(IOUtil.java:417) at org.broadinstitute.dropseqrna.utils.TagBamWithReadSequenceExtended.doWork(TagBamWithReadSequenceExtended.java:100) at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:295) at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103) at org.broadinstitute.dropseqrna.cmdline.DropSeqMain.main(DropSeqMain.java:42) ". Can you help me ?
What's a ubam file??
Hi Patrick,
we finally have our first dropSeq data sequenced and I tried to run in through the dropSeqPipe. Unfortunately, TagBamWithReadSequenceExtended terminates after a short while giving:
I assume that my fastq/bam is at some place malformed (e.g. an R1 read is too short), although I can't imagine. Is there any debug flag I can place to get more hints about what fails? Or can you quickly dig into TagBamWithReadSequenceExtended.java at line 251 and tell me whats happening there?
Best and thanks! Jens
Edit: I can share the data/test_unaligned.bam file, if needed!