zstephens / neat-genreads

NEAT read simulation tools

Error indexing golden.bam #40

Closed afzm closed 6 years ago

afzm commented 6 years ago

When I try to build the index of the golden.bam I encounter an error:

java -jar $PICARD BuildBamIndex I=golden.bam O=golden.bam.bai
Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing SAM header. @RG line missing SM tag. Line:
@RG ID:NEAT; File /home/sequentia1/Alvaro/todo/ggg/masked_2n_control_golden.bam; Line number 9
    at htsjdk.samtools.SAMTextHeaderCodec.reportErrorParsingLine(SAMTextHeaderCodec.java:265)
    at htsjdk.samtools.SAMTextHeaderCodec.access$200(SAMTextHeaderCodec.java:43)
    at htsjdk.samtools.SAMTextHeaderCodec$ParsedHeaderLine.requireTag(SAMTextHeaderCodec.java:346)
    at htsjdk.samtools.SAMTextHeaderCodec.parseRGLine(SAMTextHeaderCodec.java:175)
    at htsjdk.samtools.SAMTextHeaderCodec.decode(SAMTextHeaderCodec.java:108)
    at htsjdk.samtools.BAMFileReader.readHeader(BAMFileReader.java:667)
    at htsjdk.samtools.BAMFileReader.<init>(BAMFileReader.java:298)
    at htsjdk.samtools.BAMFileReader.<init>(BAMFileReader.java:176)
    at htsjdk.samtools.SamReaderFactory$SamReaderFactoryImpl.open(SamReaderFactory.java:396)
    at htsjdk.samtools.SamReaderFactory$SamReaderFactoryImpl.open(SamReaderFactory.java:208)
    at picard.sam.BuildBamIndex.doWork(BuildBamIndex.java:137)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:282)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:98)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:108)

When I try to correct it with AddOrReplaceReadGroups, it fails with the same header-parsing error:

java -jar $PICARD AddOrReplaceReadGroups I=golden.bam O=golden_RG.bam RGID=4 RGLB=lib1 RGPL=illumina RGPU=unit1 RGSM=20 

Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing SAM header. @RG line missing SM tag. Line:
@RG ID:NEAT; File /home/sequentia1/Alvaro/todo/ggg/masked_2n_control_golden.bam; Line number 9
    at htsjdk.samtools.SAMTextHeaderCodec.reportErrorParsingLine(SAMTextHeaderCodec.java:265)
    at htsjdk.samtools.SAMTextHeaderCodec.access$200(SAMTextHeaderCodec.java:43)
    at htsjdk.samtools.SAMTextHeaderCodec$ParsedHeaderLine.requireTag(SAMTextHeaderCodec.java:346)
    at htsjdk.samtools.SAMTextHeaderCodec.parseRGLine(SAMTextHeaderCodec.java:175)
    at htsjdk.samtools.SAMTextHeaderCodec.decode(SAMTextHeaderCodec.java:108)
    at htsjdk.samtools.BAMFileReader.readHeader(BAMFileReader.java:667)
    at htsjdk.samtools.BAMFileReader.<init>(BAMFileReader.java:298)
    at htsjdk.samtools.BAMFileReader.<init>(BAMFileReader.java:176)
    at htsjdk.samtools.SamReaderFactory$SamReaderFactoryImpl.open(SamReaderFactory.java:396)
    at picard.sam.AddOrReplaceReadGroups.doWork(AddOrReplaceReadGroups.java:152)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:282)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:98)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:108)

zstephens commented 6 years ago

Greetings. I added the missing fields to the golden BAM header output and verified that Picard can build the index for some test data. Hopefully this is now fixed.

Cheers.
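
For golden BAM files that were generated before this fix, regenerating them with the updated code is the simplest route. If that is not practical, one possible workaround is to patch the header so the @RG line carries an SM tag. The sketch below is not from the thread: the function name add_sm_tag, the sample name NEAT, and the output file names are hypothetical, and the commented usage assumes samtools is installed. It works on the header text, where fields are tab-separated.

```shell
# Append an SM tag to every @RG header line that does not already have one.
# Reads SAM header text on stdin and writes the patched header to stdout.
# The function name and the default sample name are illustrative assumptions.
add_sm_tag() {
    sample="${1:-NEAT}"
    awk -v sm="$sample" 'BEGIN { FS = OFS = "\t" }
        # Only touch @RG lines that lack an SM field.
        /^@RG/ && $0 !~ /\tSM:/ { $0 = $0 OFS "SM:" sm }
        { print }'
}

# Possible usage (assumes samtools is on PATH; file names are placeholders):
#   samtools view -H golden.bam | add_sm_tag NEAT > fixed_header.sam
#   samtools reheader fixed_header.sam golden.bam > golden_fixed.bam
#   java -jar $PICARD BuildBamIndex I=golden_fixed.bam O=golden_fixed.bam.bai
```

Patching the header first is needed because, as the second stack trace shows, Picard rejects the malformed @RG line while opening the file, before AddOrReplaceReadGroups gets a chance to replace it.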