Closed rjsicko closed 7 years ago
Mismatched reference sequence names are a common problem. For example, the manifest file refers to chromosome 1 as chr1 while your BAM file refers to it as 1. I just added a check in clipprimer.pl that compares the reference names in the SAM @SQ header lines against those in the BEDPE file: ea387ba. In your case a warning will now be printed; please give it a try.
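To illustrate the kind of comparison described above, here is a minimal sketch in Python. It is not the code from ea387ba or from clipprimer.pl; the function names and sample lines are made up for illustration, and it assumes a tab-separated BEDPE with chromosome names in columns 1 and 4.

```python
def sq_names(sam_header_lines):
    """Collect reference names from the SN: tag of @SQ header lines."""
    names = set()
    for line in sam_header_lines:
        if not line.startswith("@SQ"):
            continue
        for field in line.rstrip("\n").split("\t")[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


def bedpe_names(bedpe_lines):
    """Collect chromosome names from BEDPE columns 1 and 4."""
    names = set()
    for line in bedpe_lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 6:
            continue  # not a complete BEDPE record
        names.update((cols[0], cols[3]))
    return names


def missing_references(sam_header_lines, bedpe_lines):
    """Return BEDPE reference names that are absent from the SAM header."""
    return sorted(bedpe_names(bedpe_lines) - sq_names(sam_header_lines))


# Hypothetical sample data: header without "chr", BEDPE with a mix of both.
header = [
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:1\tLN:249250621",
    "@SQ\tSN:2\tLN:243199373",
]
bedpe = [
    "chr1\t100\t120\tchr1\t300\t320\ttarget A+B",
    "2\t500\t520\t2\t700\t720\ttarget C",
]

missing = missing_references(header, bedpe)
if missing:
    print("Warning: BEDPE reference names not found in SAM header:", ", ".join(missing))
```

The header lines could be obtained with something like `samtools view -H`, and a non-empty result would be the cue to print a warning before any clipping is attempted.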
As far as I can tell, the current version of manifest2bedpe.pl properly parses and handles target names containing spaces and '+'. Could you provide the sample manifest lines of concern?
You're right, manifest2bedpe.pl parsed my target names with spaces and '+' fine. I just wasn't sure whether the spaces and '+' in the name field of the BEDPE file were causing issues with clipprimer.pl (they weren't). Thanks again.
Thanks for the script to convert a TruSeq manifest to a BEDPE file. I can confirm that the script and bamclipper worked with my custom TruSeq files.
I did have to remove the chr prefix from the BEDPE file because I aligned to GRCh37 instead of hg19. I initially didn't catch the chr vs. no-chr mismatch: bamclipper ran and output a "clipped" BAM file, but the primers weren't clipped. I suggest adding a check that the supplied primer file and the aligned BAM either both use chr or both don't, or harmonizing chr vs. no chr internally in bamclipper.
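For anyone hitting the same mismatch, the prefix removal described above can be sketched roughly as follows in Python. This is only an illustration, not part of bamclipper; the function name is hypothetical and it assumes a tab-separated BEDPE with chromosome names in columns 1 and 4.

```python
def strip_chr(bedpe_line):
    """Drop a leading 'chr' from the two chromosome columns of one BEDPE line."""
    cols = bedpe_line.rstrip("\n").split("\t")
    for i in (0, 3):  # BEDPE chrom1 and chrom2 columns
        if i < len(cols) and cols[i].startswith("chr"):
            cols[i] = cols[i][3:]
    return "\t".join(cols)


# Hypothetical record; the name field (with space and '+') is left untouched.
line = "chr1\t100\t120\tchr1\t300\t320\ttarget A+B"
print(strip_chr(line))
```

Note that a plain prefix strip turns chrM into M, whereas GRCh37-style references usually name the mitochondrial sequence MT, so mitochondrial targets may need separate handling.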
While debugging why my primers weren't clipped, I noticed that my manifest file has spaces and '+' in its target names. I initially thought this might be the issue, so I modified 'manifest2bedpe.pl' by adding
in the conversion for loop.
Thanks again for the program!