Open zhangyafeng1 opened 2 years ago
This is the case in BAM
This is my code , I'm not sure what caused the error,,Thanks for your help !!
Step1: /usr/bin/java fgbio-1.3.0.jar GroupReadsByUmi \ -i merge.extra_umi.bam \ -o group.bam \ -t RX -T MI -m 20 --strategy=paired -e 1 \ -f familysize.txt
Step2: /usr/bin/java -Xmx4g -jar fgbio-1.3.0.jar SortBam \ -i group.bam \ -s TemplateCoordinate \ -o group.sorted.bam
Step3: /usr/bin/java -Xmx4g -jar fgbio-1.3.0.jar CallMolecularConsensusReads \ -i group.sorted.bam \ --error-rate-pre-umi=45 --error-rate-post-umi=30 --min-input-base-quality=30 --min-consensus-base-quality=30\ --min-reads=1 \ -o unmapped.consensus.bam
Step4: /usr/bin/java -Xmx4g -jar picard.jar SamToFastq \ I=unmapped.consensus.bam \ F=consensus.1.fq \ F2=consensus.2.fq
Hi @zhangyafeng1
I'm using a very similar workflow that you show here and I get the same error. Indeed I also find more than two reads having the same name in the output bam file from GroupReadsByUmi
. Have you already found a solution?
Thanks in advance your help!
Could you share the input BAM to GroupReadsByUmi for us to reproduce and debug? If it’s too large, try picking the first hundred or so reads and verify that you get the same error. Thank you!
@zhangyafeng1 sorry this slipped through the cracks and @Gregory94 thanks for any data you could provide to help debug!
Hi @nh13, thanks for your quick reply! I think I already found the solution. CallMolecularConsensusReads
requires the bam files to be sorted using SortBam -s TemplateCoordinate
(like @zhangyafeng1 shows in the third comment and also as is recommended by CallMolecularConsensusReads
when you call this function). But CallMolecularConsensusReads
does not, by default, consider the reads sorted by TemplateCoordinate. This you have to explicitly specify using CallMolecularConsensusReads -S TemplateCoordinate
. I thought this was the default setting because it is recommended by the tool, hence my confusion. For me the problem is now solved, I hope this also works for @zhangyafeng1
@Gregory94 in our next release, we're going to be more explicit about the sorts going in and out, so this is good timing.
@zhangyafeng1 can you send a test case?
I am running into similar issue. I am trying to run mitochondrial variant calling step given below:
picard SamToFastq INPUT=sample1_rev_sorted.bam FASTQ=/dev/stdout INTERLEAVE=true NON_PF=true VALIDATION_STRINGENCY=LENIENT | bwa mem -M -K 100000000 -p -v 3 -t 2 -Y resources/Homo_sapiens_assembly38.chrM.fasta - > sample1_temp.bam
Sorting sam
picard SamToFastq INPUT=sample1_rev_sorted.bam FASTQ=/dev/stdout INTERLEAVE=true NON_PF=true VALIDATION_STRINGENCY=LENIENT | bwa mem -M -K 100000000 -p -v 3 -t 2 -Y resources/Homo_sapiens_assembly38.chrM.fasta - > sample1_temp.bam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
INFO 2023-02-09 18:55:54 SamToFastq
********** NOTE: Picard's command line syntax is changing.
**********
********** For more information, please see:
**********
https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)
**********
********** The command line looks like this in the new syntax:
**********
********** SamToFastq -INPUT sample1_rev_sorted.bam -FASTQ /dev/stdout -INTERLEAVE true -NON_PF true -VALIDATION_STRINGENCY LENIENT
**********
18:55:55.286 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/rohit/miniconda3/envs/cancerpipe/share/picard-2.27.4-0/picard.jar!/com/intel/gkl/native/libgkl_compression.so
[Thu Feb 09 18:55:55 IST 2023] SamToFastq INPUT=sample1_rev_sorted.bam FASTQ=/dev/stdout INTERLEAVE=true INCLUDE_NON_PF_READS=true VALIDATION_STRINGENCY=LENIENT OUTPUT_PER_RG=false COMPRESS_OUTPUTS_PER_RG=false RG_TAG=PU RE_REVERSE=true CLIPPING_MIN_LENGTH=0 READ1_TRIM=0 READ2_TRIM=0 INCLUDE_NON_PRIMARY_ALIGNMENTS=false VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Thu Feb 09 18:55:55 IST 2023] Executing as rohit@DESKTOP-27J4F4O on Linux 5.15.79.1-microsoft-standard-WSL2 amd64; OpenJDK 64-Bit Server VM 1.8.0_332-b09; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.27.4-SNAPSHOT
[Thu Feb 09 18:55:55 IST 2023] picard.sam.SamToFastq done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=514850816
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" picard.PicardException: Illegal mate state: M01368:16:000000000-A3M99:1:1112:14644:9150
at picard.sam.SamToFastq.assertPairedMates(SamToFastq.java:438)
at picard.sam.SamToFastq.handleRecord(SamToFastq.java:309)
at picard.sam.SamToFastq.doWork(SamToFastq.java:205)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:308)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)
[M::process] read 2361 sequences (528144 bp)...
[M::process] 1 single-end sequences; 2360 paired-end sequences
[M::mem_process_seqs] Processed 1 reads in 0.001 CPU sec, 0.002 real sec
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 1180, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (205, 277, 369)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 697)
[M::mem_pestat] mean and std.dev: (291.13, 122.19)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 861)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 2360 reads in 0.349 CPU sec, 0.188 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -M -K 100000000 -p -v 3 -t 2 -Y resources/Homo_sapiens_assembly38.chrM.fasta -
[main] Real time: 2.730 sec; CPU: 0.454 sec
@Rohit-Satyam can you please open a new issue and provide how you sorted it and a test case?
hello: After I finish CallMolecularConsensusReads, when I use SamToFastq, the following error appears, indicating that there is an illegal mate state: A:10000006/A, could you tell me the reason? thank you very much !
Exception in thread "main" picard.PicardException: Illegal mate state: A:10000006/A at picard.sam.SamToFastq.assertPairedMates(SamToFastq.java:376) at picard.sam.SamToFastq.doWork(SamToFastq.java:184) at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:209) at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95) at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)