aineniamh / snipit

snipit: summarise snps relative to your reference sequence
GNU General Public License v3.0
141 stars 28 forks source link

Support for amino acids, correct ambiguous handling #33

Closed ammaraziz closed 5 months ago

ammaraziz commented 7 months ago

Hi @aineniamh

This pull request adds support for plotting amino acid changes. In addition, I've corrected what I think was a bug in snipit not plotting ambiguous bases by default.

Finally, added a color scheme ugene for proteins - based on the color scheme from UGENE GUI software (OSS).

First pull request, let me know if you want anything changed.

Thanks!

aineniamh commented 7 months ago

Hi @ammaraziz, this is fantastic- thanks a million! AA support has been something on my to-do list for a long time, but just hadn't gotten around to it!

The only thing I'd say is the behaviour originally for snipit was intended to only plot sites with ambiguous bases if there was a non-ambiguous SNP at that site in another sequence. This was because originally written for SC2, if amplicon dropouts were in any sequence (which is common), you'd get long tracts of Ns in the snipit plot but at ortherwise consistent sites.

I think the new behaviour would allow all sites, even just an N to add a column to the plot right? Perhaps this should be the behaviour, or else a configurable option to toggle that on or off. I'd be interested to hear your thoughts on this!

ammaraziz commented 7 months ago

@aineniamh Thank you for your kind words.

The only thing I'd say is the behaviour originally for snipit was intended to only plot sites with ambiguous bases if there was a non-ambiguous SNP at that site in another sequence. This was because originally written for SC2, if amplicon dropouts were in any sequence (which is common), you'd get long tracts of Ns in the snipit plot but at ortherwise consistent sites.

I think the new behaviour would allow all sites, even just an N to add a column to the plot right? Perhaps this should be the behaviour, or else a configurable option to toggle that on or off. I'd be interested to hear your thoughts on this!

The code now makes more sense, I was confused by it's purpose. That's correct, the new behavior will now show all N. I agree the functionality shouldn't change in such a huge way.

Give me a few days and I will submit a patch to this pull request.

What do you think about including a new flag --ambig-mode with three options:

let me know!

aineniamh commented 7 months ago

That sounds perfect yeah! I'd be happy to merge that in if you have the time to do it!

ammaraziz commented 5 months ago

@aineniamh I finally got time to make the changes. I also added another color scheme (classic_extended) to handle NT ambig bases.

Let me know what you think.

The readme will need to be updated, which I'm happy to do another pull request.

aineniamh commented 5 months ago

This is fantastic @ammaraziz, thanks for much for your contributions! I'll check out the changes today and merge in!

aineniamh commented 5 months ago

Huh, I've merged in the commits, but there is one commit you made that reverts everything to original. Is this intentional? 😆 I've checked out commit ebba22d and am playing around with this, as it's the last commit prior to the reversion!

aineniamh commented 5 months ago

I like the "snipping complete" print out 😁

ammaraziz commented 5 months ago

Ooops! Sorry for the confusion, I must have pushed that revision to the wrong branch.

Thanks for accepting! Happy to see it go live <3