PapenfussLab / gridss

GRIDSS: the Genomic Rearrangement IDentification Software Suite

Gridss run error: SAM validation error #271

Closed TendoLiu closed 5 years ago

TendoLiu commented 5 years ago

20:22:55.968 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/bgfs/soesterreich/pan_data/soesterreich/Tendo/InstalledTools/gridss/gridss.jar!/com/intel/gkl/native/libgkl_compression.so
[Fri Oct 25 20:22:56 UTC 2019] CollectGridssMetrics GRIDSS_PROGRAM=[] THRESHOLD_COVERAGE=50000 INPUT=/bgfs/soesterreich/pan_data/soesterreich/Tendo/Projects/Project2_cfDNA_fusion_capture/SecondPanel/0_Illumina_UMI_Error_Corrected_bam/Sample1.collapsed.bam ASSUME_SORTED=true STOP_AFTER=10000000 OUTPUT=/bgfs/soesterreich/pan_data/soesterreich/Tendo/Projects/Project2_cfDNA_fusion_capture/SecondPanel/1_SVcalling/gridss/tempFile/Sample1.collapsed.bam.gridss.working/tmp.Sample1.collapsed.bam FILE_EXTENSION=null PROGRAM=[CollectInsertSizeMetrics] TMP_DIR=[/bgfs/soesterreich/pan_data/soesterreich/Tendo/Projects/Project2_cfDNA_fusion_capture/SecondPanel/1_SVcalling/gridss/tempFile/Sample1.collapsed.bam.gridss.working] METRIC_ACCUMULATION_LEVEL=[ALL_READS] INCLUDE_UNPAIRED=false VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Fri Oct 25 20:22:56 UTC 2019] Executing as tiantong@n446.htc.sam.pitt.edu on Linux 3.10.0-1062.4.1.el7.x86_64 amd64; OpenJDK 64-Bit Server VM 1.8.0_222-8u222-b10-1ubuntu1~18.04.1-b10; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.6.3-gridss
[Fri Oct 25 20:22:57 UTC 2019] gridss.analysis.CollectGridssMetrics done. Elapsed time: 0.03 minutes.
Runtime.totalMemory()=2043478016
Exception in thread "main" htsjdk.samtools.SAMFormatException: SAM validation error: ERROR: Record 31762, Read name NS500211:808:HW27KAFXY:1:21106:7501:9434:GAACGAG+GACAAC, The unaligned mate start position is 178927892, should be 0
    at htsjdk.samtools.SAMUtils.processValidationErrors(SAMUtils.java:446)
    at htsjdk.samtools.BAMFileReader$BAMFileIterator.decode(BAMFileReader.java:868)
    at htsjdk.samtools.BAMFileReader$BAMFileIterator.access$1900(BAMFileReader.java:777)
    at htsjdk.samtools.BAMFileReader$BAMFileIterator$AsyncBamDecoder.transform(BAMFileReader.java:916)
    at htsjdk.samtools.BAMFileReader$BAMFileIterator$AsyncBamDecoder.transform(BAMFileReader.java:899)
    at htsjdk.samtools.util.AsyncReadTaskRunner.processNextBatch(AsyncReadTaskRunner.java:228)
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
    at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

What is the error supposed to mean? "SAM validation error: ERROR: Record 31762, Read name NS500211:808:HW27KAFXY:1:21106:7501:9434:GAACGAG+GACAAC, The unaligned mate start position is 178927892, should be 0"

I have used this BAM successfully with other SV calling algorithms.

Thanks.

d-cameron commented 5 years ago

Your input file does not conform to the SAM/BAM specifications. You can either fix the input file so that it conforms to the specifications (recommended), or add --picardoptions VALIDATION_STRINGENCY=LENIENT to ignore the error. Note that not all errors can be ignored.
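For the "fix the input file" route, here is a minimal sketch in Python. It assumes the only problem is a non-zero PNEXT on records whose RNEXT is *, and it works on SAM text rather than on the BAM directly. The function names `zero_unaligned_mate_start` and `fix_stream` and the example record are made up for illustration; they are not part of GRIDSS or htsjdk.

```python
from typing import Iterable, Iterator


def zero_unaligned_mate_start(sam_line: str) -> str:
    """Return the SAM line with PNEXT set to 0 when RNEXT is '*'.

    Header lines (starting with '@') and lines with fewer than the 11
    mandatory SAM fields are returned unchanged.
    """
    if sam_line.startswith("@"):
        return sam_line
    body = sam_line.rstrip("\n")
    newline = sam_line[len(body):]      # preserve the original line ending
    fields = body.split("\t")
    # Mandatory columns are 0-based here: index 6 is RNEXT, index 7 is PNEXT.
    if len(fields) >= 11 and fields[6] == "*" and fields[7] != "0":
        fields[7] = "0"
    return "\t".join(fields) + newline


def fix_stream(lines: Iterable[str]) -> Iterator[str]:
    """Apply the fix lazily to every line of a SAM text stream."""
    for line in lines:
        yield zero_unaligned_mate_start(line)


# Illustrative record (all values except the PNEXT are invented).
example = "read1\t73\tchr2\t100\t60\t5M\t*\t178927892\t0\tACGTA\tIIIII\n"
print(zero_unaligned_mate_start(example), end="")
```

One way to use this would be to feed `fix_stream` the output of `samtools view -h` (for example via `sys.stdin`) and convert the result back to BAM with `samtools view -b`. Whether zeroing PNEXT alone clears every validation error for a given file is not something this sketch checks, so re-validating the output is advisable.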

d-cameron commented 5 years ago

What is the error supposed to mean?

The record has an RNEXT of * and a PNEXT of 178927892. Whilst nonsensical, this does not technically violate the SAM specs, as section 1.4.7 states: "If RNEXT is ‘*’, no assumptions can be made on PNEXT and bit 0x20".
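To see which records are affected, a small Python sketch along these lines could scan SAM text for the RNEXT/PNEXT combination described above. The helper name `unaligned_mate_start_problem` is hypothetical, and every field in the example record other than the read name and PNEXT is invented. The check mirrors the condition as the error message describes it, which is an assumption about htsjdk's behaviour and not a copy of its code.

```python
from typing import Optional


def unaligned_mate_start_problem(sam_line: str) -> Optional[str]:
    """Describe the problem if RNEXT is '*' while PNEXT is non-zero.

    Returns None for header lines, truncated lines, and records that
    do not show this combination.
    """
    if sam_line.startswith("@"):
        return None
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 11:                # fewer than the mandatory columns
        return None
    qname, rnext, pnext = fields[0], fields[6], int(fields[7])
    if rnext == "*" and pnext != 0:
        return f"{qname}: RNEXT is '*' but PNEXT is {pnext}"
    return None


# Read name and PNEXT come from the error message; the rest is made up.
record = (
    "NS500211:808:HW27KAFXY:1:21106:7501:9434:GAACGAG+GACAAC"
    "\t73\tchr2\t100\t60\t5M\t*\t178927892\t0\tACGTA\tIIIII"
)
print(unaligned_mate_start_problem(record))
```

Running this over the text output of `samtools view` would list the reads that trigger the strict validation, which may help when reporting the issue upstream.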

Please raise this issue with htsjdk (https://github.com/samtools/htsjdk) as that is the SAM parsing library that GRIDSS uses and that is where the error is originating from.