lindenb / jvarkit

Java utilities for Bioinformatics
https://jvarkit.readthedocs.io/

tview / ttview equivalent for protein-peptides bam #188

Open avilella opened 3 years ago

avilella commented 3 years ago

Subject of the issue

I have about 5000 peptides from a proteomics experiment and a list of protein references that they may align to, sometimes imperfectly. I used the NCBI BLAST SAM output option, a bit of data munging, and samtools to create a SAM/BAM file from these alignments. The contents of the BAM file look good, but when I display them in samtools tview, the amino acids in each SAM record are converted into A, C, G, T, plus ambiguity codes.

I then remembered that jvarkit has a BAM ttview module that produces output similar to samtools tview, in some respects richer and with more flexibility.

Expected behaviour

Should I expect this to work? E.g. same basic behaviour as with DNA alphabet reference and NGS input, but with a protein alphabet and peptide input.

If this is beyond what jvarkit would be interested in supporting, that's perfectly understandable. There may be a non-SAM-based approach that is a better alternative to shoehorning proteins into the existing SAM/BAM toolkits.

Thanks in advance,

lindenb commented 3 years ago

> E.g. same basic behaviour as with DNA alphabet reference and NGS input, but with a protein alphabet and peptide input.

Hmm, I doubt it will work. jvarkit uses the htsjdk library, which is quite strict about the BAM format.
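
For context, the A/C/G/T-plus-ambiguity-codes display reported in this issue is consistent with how the BAM format stores sequences: as I understand the SAM/BAM specification, each base in the binary `seq` field is packed into 4 bits using the 16-symbol alphabet `=ACMGRSVTWYHKDBN`, and any other character is mapped to `N`. The sketch below only illustrates that mapping; it is not code from samtools, htsjdk, or jvarkit, which may behave differently (for example by rejecting the record outright). The helper name `bam_roundtrip` and the peptide string are made up for illustration.

```python
# 4-bit base codes that the SAM/BAM specification defines for the BAM seq field.
BAM_SEQ_ALPHABET = "=ACMGRSVTWYHKDBN"


def bam_roundtrip(seq: str) -> str:
    """Hypothetical helper: encode a sequence to BAM 4-bit codes and decode it back.

    Characters outside the 16-symbol alphabet fall back to 'N', which is
    what the specification prescribes for unknown characters.
    """
    fallback = BAM_SEQ_ALPHABET.index("N")
    codes = []
    for ch in seq.upper():
        code = BAM_SEQ_ALPHABET.find(ch)  # -1 when the symbol is not representable
        codes.append(code if code >= 0 else fallback)
    return "".join(BAM_SEQ_ALPHABET[c] for c in codes)


if __name__ == "__main__":
    peptide = "MKVLEFPQ"  # made-up peptide, not taken from the issue
    # M, K and V happen to be IUPAC nucleotide codes and survive;
    # L, E, F, P and Q have no 4-bit code and collapse to N.
    print(peptide, "->", bam_roundtrip(peptide))
```

Under this encoding only the amino acid letters that coincide with nucleotide or IUPAC ambiguity codes survive a trip through BAM, so a protein alignment would lose information even if a viewer accepted the file. Text SAM allows arbitrary letters in SEQ, which may explain why the records looked fine before conversion.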