LiuLabUB / HMMRATAC

HMMRATAC peak caller for ATAC-seq data
GNU General Public License v3.0

BR: SAM validation error: SAMRecord not found in header #70

Status: Closed (closed by saketkc 2 years ago)

saketkc commented 4 years ago

Describe the bug: HMMRATAC's SAM reader fails with the following exception:

Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: Record 1, Read name A00836:480:HJ2W5DSXY:2:1125:22037:4241, RG ID on SAMRecord not found in header: 1042128_1044472__OR-100-381:1042128_1044472:1:HJ2W5DSXY:2-59BAA11
        at net.sf.samtools.SAMUtils.processValidationErrors(SAMUtils.java:448)
        at net.sf.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:541)
        at net.sf.samtools.BAMFileReader$BAMFileIndexIterator.<init>(BAMFileReader.java:648)
        at net.sf.samtools.BAMFileReader.createIndexIterator(BAMFileReader.java:598)
        at net.sf.samtools.BAMFileReader.query(BAMFileReader.java:352)
        at net.sf.samtools.SAMFileReader.query(SAMFileReader.java:363)
        at HMMR_ATAC.pullLargeLengths.read(pullLargeLengths.java:112)
        at HMMR_ATAC.pullLargeLengths.<init>(pullLargeLengths.java:61)
        at HMMR_ATAC.Main_HMMR_Driver.main(Main_HMMR_Driver.java:219)

The offending read pair, with its RG tag:

A00836:480:HJ2W5DSXY:2:1125:22037:4241  99      chr12   9999    18      50M     =       10241   291     CGCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACC   ::F,FFFFFFF:FFF:FFFF:FFFFFF:,FFFF:FFFFFFF:F:F:F:FF      NM:i:1  MD:Z:0T49       MC:Z:49M        AS:i:49 XS:i:50      XA:Z:chr3,+9999,50M,0;chr7,+9999,50M,1;chr1,-248946198,48M2S,0;chr11,+175542,50M,1;chr11,+175744,39M1I10M,1;    CR:Z:TGGTCCTAGCTTAGCC        CY:Z:FFFFFF::FFFFFFFF   CB:Z:CCATAAATCATCAGTA-1 BC:Z:ACTGGAGC   QT:Z:,F:F,FFF   GP:i:517850464  MP:i:517850755  MQ:i:18      RG:Z:1042128_1044472__OR-100-381:1042128_1044472:1:HJ2W5DSXY:2-59BAA11
A00836:480:HJ2W5DSXY:2:1125:22037:4241  147     chr12   10241   18      49M     =       9999    -291    CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTACCCCTAACCC    FF,:F:FFF:FFFFF,::F:,FF::F::F:FF::::FF,,:FFF,,FFF       NM:i:0  MD:Z:49 MC:Z:50M        AS:i:49 XS:i:45 XA:Z:chr1,+248945694,49M,1;chr3,-10527,49M,1;chr7,-10001,49M,1;chr4,-10118,47M2S,1;  CR:Z:TGGTCCTAGCTTAGCC   CY:Z:FFFFFF::FFFFFFFF   CB:Z:CCATAAATCATCAGTA-1      BC:Z:ACTGGAGC   QT:Z:,F:F,FFF   GP:i:517850755  MP:i:517850464  MQ:i:18 RG:Z:1042128_1044472__OR-100-381:1042128_1044472:1:HJ2W5DSXY:2-59BAA11

SAM RG header:

@RG ID:1042128_1044472__OR-100-381:1042128_1044472:1:HJ2W5DSXY:1    SM:1042128_1044472__OR-100-381  LB:1042128_1044472.1    PU:1042128_1044472__OR-100-381:1042128_1044472:1:HJ2W5DSXY:1    PL:ILLUMINA
@RG ID:1042128_1044472__OR-100-381:1042128_1044472:1:HJ2W5DSXY:2    SM:1042128_1044472__OR-100-381  LB:1042128_1044472.1    PU:1042128_1044472__OR-100-381:1042128_1044472:1:HJ2W5DSXY:2    PL:ILLUMINA
@RG ID:1042128_1044472__OR-100-381:1042128_1044472:1:HJ2W5DSXY:3    SM:1042128_1044472__OR-100-381  LB:1042128_1044472.1    PU:1042128_1044472__OR-100-381:1042128_1044472:1:HJ2W5DSXY:3    PL:ILLUMINA
@RG ID:1042128_1044472__OR-100-381:1042128_1044472:1:HJ2W5DSXY:4    SM:1042128_1044472__OR-100-381  LB:1042128_1044472.1    PU:1042128_1044472__OR-100-381:1042128_1044472:1:HJ2W5DSXY:4    PL:ILLUMINA
@PG PN:bwa  ID:bwa  VN:0.7.17-r1188 CL:bwa mem -p -t 4 -M -R @RG\tID:1042128_1044472__OR-100-381:1042128_1044472:1:HJ2W5DSXY:1\tSM:1042128_1044472__OR-100-381\tLB:1042128_1044472.1\tPU:1042128_1044472__OR-100-381:1042128_1044472:1:HJ2W5DSXY:1\tPL:ILLUMINA /mnt/test/refdata/cs_references/GRCh38/fasta/genome.fa /mnt/freezer-qa/freezer/cellranger-arc-pd/43e193b7c738a7c657396d1922c3c27e86e258a5/atac_gex_OR-100-381/OR-100-381/SC_ATAC_GEX_COUNTER_CS/SC_ATAC_GEX_COUNTER/_ATAC_MATRIX_COMPUTER/_ALIGNER/TRIM_READS/fork0/chnk0-u8c472ba8f2/files/read1.fastq
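
Comparing the two excerpts above, the reads carry an RG value ending in `HJ2W5DSXY:2-59BAA11`, while the closest `@RG` header line declares an ID ending in `HJ2W5DSXY:2`, so the validator cannot match the tag to any declared read group. A minimal Python sketch for finding such mismatches in SAM text (for example, the output of `samtools view -h`) is given below. The function names `header_read_groups` and `undeclared_read_groups` are hypothetical helpers written for this illustration and are not part of HMMRATAC or samtools. The example record is trimmed from the one in this report, with SEQ and QUAL replaced by `*` and most optional tags dropped.

```python
def header_read_groups(header_lines):
    """Collect the ID of every @RG line in a SAM header."""
    ids = set()
    for line in header_lines:
        if not line.startswith("@RG"):
            continue
        # Fields are tab-separated in real SAM; split() also tolerates
        # pasted text where tabs were converted to spaces.
        for field in line.split()[1:]:
            if field.startswith("ID:"):
                ids.add(field[3:])
    return ids


def undeclared_read_groups(header_lines, record_lines):
    """Map each RG value missing from the header to the first read using it."""
    declared = header_read_groups(header_lines)
    missing = {}
    for line in record_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.split()
        # The first 11 columns are mandatory; optional tags follow.
        for tag in fields[11:]:
            if tag.startswith("RG:Z:"):
                rg = tag[5:]
                if rg not in declared:
                    missing.setdefault(rg, fields[0])
    return missing


# Illustration only: one header line and one trimmed record from this report.
prefix = "1042128_1044472__OR-100-381:1042128_1044472:1:HJ2W5DSXY"
header = [
    "@RG\tID:" + prefix + ":2\tSM:1042128_1044472__OR-100-381\tPL:ILLUMINA",
]
records = [
    "A00836:480:HJ2W5DSXY:2:1125:22037:4241\t99\tchr12\t9999\t18\t50M\t=\t10241\t291\t*\t*\t"
    "NM:i:1\tRG:Z:" + prefix + ":2-59BAA11",
]

missing = undeclared_read_groups(header, records)
for rg, read_name in missing.items():
    print(f"RG '{rg}' (first seen on read {read_name}) is not declared in the header")
```

A non-empty result means the BAM would need either matching `@RG` header lines or corrected RG tags before a strictly validating reader will accept it. This sketch only diagnoses the mismatch and does not modify the file.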


EvanTarbell commented 4 years ago

Quick question: did you use deepTools' alignmentSieve function to shift the reads? If so, I can direct you to issue #47.

jitsedesmet commented 2 years ago

@saketkc Are you still interested in finding a solution to this problem? It may be fixed by the solution described in #96.