Closed 5Tony closed 2 years ago
with -B A2G you define a base substitution that will stratify reads into:
See attached Image -> you can call variants on tagged and untagged reads.
Out of curiosity - what are you trying to achieve?
The exception/error/bug is thrown by htsjdk - it says that: "Multiple SAMRecord with read name SRR6112594.981532"
How do you generate the BAM index: samtools index
You could manually process the reads and remove:
with samtools view -F <FLAGS> <input.bam> ...
Check Flags for details.
I want to get the result file after adding the parameter -B A2G, but the program often breaks in running, but it may run normally after excluding the parameter -B A2G, which is distressing me. I've read the manual, but I can't find a suitable solution, so I'm taking the liberty to ask you. Can you provide a sample command to add the -B A2G parameter?
I downloaded GSE104379 from GEO database,
I went through fasterq-dump to fastq, fasterq-dump -e 16 -p --split-files *.sra
Mapped using STAR,
STAR --runThreadN 6 --alignIntronMin 20 --alignIntronMax 1000000 --alignMatesGapMax 1000000 --alignSJDBoverhangMin 1 --alignSJoverhangMin 8 --alignSoftClipAtReferenceEnds Yes --chimJunctionOverhangMin 15 --chimMainSegmentMultNmax 1 --chimOutType Junctions SeparateSAMold WithinBAM SoftClip --chimSegmentMin 15 --genomeDir /home/data/project/genome/genomeDir_hg38v39_r149 --genomeLoad NoSharedMemory --limitSjdbInsertNsj 1200000 --outFileNamePrefix /home/data/result/STAR/SRR6112594 --outFilterIntronMotifs None --outFilterMatchNminOverLread 0.33 --outFilterMismatchNmax 999 --outFilterMismatchNoverLmax 0.1 --outFilterMultimapNmax 20 --outFilterScoreMinOverLread 0.33 --outFilterType BySJout --outSAMattributes NH HI AS XS nM NM ch --outSAMstrandField intronMotif --outSAMtype BAM Unsorted --outSAMunmapped Within --quantMode TranscriptomeSAM GeneCounts --twopassMode Basic --readFilesIn /home/data/result/trim_galore/SRR6112594/SRR6112594_1_val_1.fq /home/data/result/trim_galore/SRR6112594/SRR6112594_2_val_2.fq
samtools sort -l 0 -@ 25 -m 1G SRR6112594Aligned.out.bam -o SRR6112594Aligned_sorted.out.bam
samtools index -b SRR6112594Aligned_sorted.out.bam
samtools calmd -@10 SRR6112594Aligned_sorted.out.bam GRCh38.primary_assembly.genome.fa > \ SRR6112594Aligned_sorted.out.MD.bam
samtools sort -l 0 -@ 25 -m 1G SRR6112594Aligned_sorted.out.MD.bam -o SRR6112594Aligned_sorted_2.out.MD.bam
samtools index -b SRR6112594Aligned_sorted_2.out.MD.bam
java -jar JACUSA_v2.0.2-RC.jar call-1 -R /home/data/reference/GRCh38.primary_assembly.genome.fa -a I,S,B,M,Y -B A2G -p 10 -q 10 -P UNSTRANDED -r /home/data/t050326/result/test1/test12.txt -s -w 7500 -W 15000 /home/data/result/MD/SRR6112594Aligned_sorted2.out.MD.bam
INFO 00:00:02 Thread 9: Working on contig chr1:3737194-3752193 INFO 00:00:02 Thread 1: Working on contig chr1:3780675-3795674 INFO 00:00:02 Thread 1: Working on contig chr1:3812106-3827105 INFO 00:00:02 Thread 1: Working on contig chr1:3836232-3851231 INFO 00:00:02 Thread 9: Working on contig chr1:4007396-4022395 INFO 00:00:02 Thread 9: Working on contig chr1:4039783-4054782 ERROR 00:00:02 Problem with read: SRR6112594.981532 in /home/data/t050326/result/MD/SRR6112594Aligned_sorted2.out.MD.bam htsjdk.samtools.SAMFormatException: Multiple SAMRecord with read name SRR6112594.981532 for first end. at htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter.queryMate(SamReader.java:451) at lib.record.Record.getMate(Record.java:60) at lib.data.storage.readsubstitution.BaseSubRecordProcessor.processPE(BaseSubRecordProcessor.java:179) at lib.data.storage.readsubstitution.BaseSubRecordProcessor.process(BaseSubRecordProcessor.java:238) at lib.data.storage.container.UnstrandedCacheContainter.process(UnstrandedCacheContainter.java:50) at lib.data.assembler.SiteDataAssembler.buildCache(SiteDataAssembler.java:56) at lib.util.ReplicateContainer.createIterators(ReplicateContainer.java:49) at lib.util.ReplicateContainer.updateIterators(ReplicateContainer.java:55) at lib.util.ConditionContainer.lambda$1(ConditionContainer.java:34) at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384) at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647) at lib.util.ConditionContainer.updateActiveWinCoord(ConditionContainer.java:34) at lib.worker.AbstractWorker.hasNext(AbstractWorker.java:121) at lib.worker.AbstractWorker.processReady(AbstractWorker.java:196) at lib.worker.AbstractWorker.run(AbstractWorker.java:213) INFO 00:00:02 Thread 9: Working on contig chr1:5786107-5801106 INFO 00:00:02 Thread 1: Working on contig chr1:5883045-5898044
I checked the SRR6112594Aligned_sorted2.out.MD.bam using samtools and saw the following, I am again trying to find more ways to solve this problem, thank you. samtools tview -p chr1:1403914 SRR6112594Aligned_sorted2.out.MD.bam /home/data/t050326/reference/GRCh38.primary_assembly.genome.fa
Hi,I am a beginning bioinformatics student,I took the liberty of coming to ask you a question,I don't understand the use of the -B parameter.
cmd help:
Here is the code I used:
But there is a bug in the operation:
Here's some of the code and java version I used:
I've changed several sets of data and all have similar bugs. I've been searching for a long time and haven't found out how to solve this, so I'm taking the liberty to ask you, and I hope you can solve the doubts in my mind. @Piechotta