NBISweden / AGAT

Another Gtf/Gff Analysis Toolkit
GNU General Public License v3.0

Parsing of the X CIGAR code in bam files #422

Closed sanyalab closed 9 months ago

sanyalab commented 9 months ago

Hello,

Can agat_convert_minimap2_bam2gff.pl be modified to process the X CIGAR code found in some BAM files? I have aligned IsoSeq data to the genome with minimap2, and the resulting BAM file uses the =/X CIGAR codes. After conversion to GFF, I cannot see the percent identity and coverage of the IsoSeq mRNA on the genome, as I can with GMAP output.

Can the code be modified to handle the X CIGAR class?

Thanks,
Abhijit
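For readers wondering what "percent identity and coverage" could look like when derived from an =/X CIGAR string, here is a minimal Python sketch. It is not AGAT code and not how GMAP computes its values; the function names and both definitions are assumptions made for illustration. Identity is taken as `=` columns over all aligned columns (`=`, `X`, `I`, `D`), with `N` (introns) and clipping excluded, and coverage as aligned query bases (`=`, `X`, `I`) over the full query length including soft- and hard-clipped bases.

```python
import re

# CIGAR operators defined by the SAM specification.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operator) pairs."""
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    # Reject strings containing anything the pattern did not consume.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"unparsable CIGAR: {cigar!r}")
    return ops


def identity_and_coverage(cigar):
    """Return (percent identity, percent query coverage) for an =/X CIGAR.

    Hypothetical helper: the definitions used here are one common choice,
    not the ones used by AGAT or GMAP.
    """
    counts = dict.fromkeys("MIDNSHP=X", 0)
    for n, op in parse_cigar(cigar):
        counts[op] += n
    if counts["M"]:
        # M hides matches and mismatches, so identity cannot be derived.
        raise ValueError("CIGAR uses M; =/X operators are required")
    aligned_cols = counts["="] + counts["X"] + counts["I"] + counts["D"]
    query_aligned = counts["="] + counts["X"] + counts["I"]
    query_len = query_aligned + counts["S"] + counts["H"]
    identity = 100.0 * counts["="] / aligned_cols if aligned_cols else 0.0
    coverage = 100.0 * query_aligned / query_len if query_len else 0.0
    return identity, coverage


if __name__ == "__main__":
    ident, cov = identity_and_coverage("10S88=2X500N100=")
    print(f"identity={ident:.2f}% coverage={cov:.2f}%")
```

With the example CIGAR `10S88=2X500N100=`, there are 188 matching columns out of 190 aligned columns, and 190 of the 200 query bases are aligned, so the sketch reports an identity of about 98.95% and a coverage of 95%.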

Juke34 commented 9 months ago

Sure, but I would need a detailed example to make sure I implement it properly.

sanyalab commented 9 months ago

Hi Jacques,

Thank you for replying and agreeing to implement this. I am working on a different project for now, so I am closing this request temporarily. I will re-open it once I have data, examples and objectives to share. Thanks

-Abhijit

jdmontenegro commented 3 months ago

Hi @Juke34, I'd like to reopen this thread because I just ran into the same error. In short, I generated BAM files of transcripts mapped to a reference genome with minimap2, but some of the CIGAR strings contain an "X". According to https://www.drive5.com/usearch/manual/cigar.html, it represents a mismatch between the query and the reference, and the script currently cannot parse it. Any suggestions on how to work around this would be more than welcome.
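One possible workaround, sketched below in Python, is to collapse the `=` and `X` operators back into `M` before conversion. This assumes the converter already handles `M`, which the thread does not confirm, and `collapse_eqx` is a hypothetical helper rather than part of AGAT. Note that this discards the match/mismatch distinction, so identity can no longer be computed from the rewritten CIGAR. minimap2 only writes `=`/`X` when run with its `--eqx` option, so re-running the alignment without that option may be a simpler route where that is feasible.

```python
import re


def collapse_eqx(cigar):
    """Rewrite an =/X CIGAR so matches and mismatches become M.

    Adjacent runs with the same operator are merged, so "88=2X10="
    becomes "100M". Hypothetical helper, not part of AGAT.
    """
    merged = []
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        # Both '=' (match) and 'X' (mismatch) are alignment columns, i.e. M.
        op = "M" if op in "=X" else op
        if merged and merged[-1][1] == op:
            merged[-1][0] += int(length)
        else:
            merged.append([int(length), op])
    return "".join(f"{n}{op}" for n, op in merged)


if __name__ == "__main__":
    print(collapse_eqx("10S88=2X10=500N100="))
```

For the example above, the three runs `88=`, `2X` and `10=` merge into `100M`, giving `10S100M500N100M`. Applying the rewrite to every record of a BAM file would additionally need a BAM reader and writer, which is left out of this sketch.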