Villen-Lab / pyAscore

A Python package for fast post-translational modification localization, powered by Cython.
https://pyascore.readthedocs.io/
MIT License

Extract ascores for all potential PTM sites #4

Closed: RalfG closed this issue 2 years ago

RalfG commented 2 years ago

Thanks for this modern implementation of Ascore!

I would like to get scores for all potential PTM sites, but only the best score seems to be reported, even though the alternative sites are listed in the output. For instance:

Scan  LocalizedSequence  PepScore           Ascores    AltSites
3767  ALLSLRS[80]HK      23.64276885986328  12.159486  4

Additionally, when more than one alternate site is present, the AltSites column seems to repeat the index of the first alternate site instead of listing each alternate site:

Scan  LocalizedSequence  PepScore           Ascores    AltSites
4190  ALLSLHS[80]SK      35.06369400024414  7.7721076  4,4

Am I missing an option to report all scores, or could a simple change to the (Python) code allow me to parse all scores?
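Rows affected by the repeated-index behavior described above can be flagged while parsing the output. The following is a minimal sketch, assuming whitespace-separated columns with the header shown in the examples and a comma-separated AltSites field; the helper names `parse_rows` and `has_repeated_alt_sites` are hypothetical and not part of pyAscore.

```python
def parse_rows(text):
    """Parse whitespace-separated output rows into dicts keyed by the header.

    Assumes no field contains spaces, as in the rows quoted in this issue.
    """
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        row = dict(zip(header, ln.split()))
        # AltSites is assumed to be a comma-separated list of integer positions.
        row["AltSites"] = [int(s) for s in row["AltSites"].split(",") if s]
        rows.append(row)
    return rows


def has_repeated_alt_sites(row):
    """Return True when the same alternate-site index is listed more than once."""
    sites = row["AltSites"]
    return len(sites) != len(set(sites))


example = """Scan LocalizedSequence PepScore Ascores AltSites
3767 ALLSLRS[80]HK 23.64276885986328 12.159486 4
4190 ALLSLHS[80]SK 35.06369400024414 7.7721076 4,4
"""

rows = parse_rows(example)
flagged = [r["Scan"] for r in rows if has_repeated_alt_sites(r)]
print(flagged)
```

A repeated index is only a symptom check: it identifies rows where the listed alternates cannot all be distinct, but it does not recover the correct positions.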

AnthonyOfSeattle commented 2 years ago

Hey! No problem.

I didn't originally set up the code to report localization scores for all potential sites, since the number of usable site-determining peaks tends to drop off pretty quickly in my experience. As a lab, we usually focus only on the best localization and then use its score to filter for the most confident results anyway.

Making a user-friendly option to get Ascores for all alternatives is likely a bit out of scope for now, but I can check the code real quick to see whether the PepScore for every localization is accessible. If you had that, you could walk down the list in Python and get the Ascore for any localization you want using the scoring methods available in the Python interface. I will report back in a day or so.

On the AltSites thing: this bug was pointed out to me by a lab member last week, and you caught me right before I fixed it. I hope to have that patched this week too.

Sorry for the delay, but thanks for the feedback!

RalfG commented 2 years ago

Thanks for the quick reply!

I definitely agree. From a usability perspective, the current approach makes perfect sense, and it is also the behavior I have seen in some other localization tools. However, if I could get Ascores for all sites through the Python interface, even if the code needs some small modifications, that would be great!

Looking forward to your reply.

AnthonyOfSeattle commented 2 years ago

Ok, looks like I don't expose the methods that would help you right now. I am happy to link the Python code to the relevant C++ methods, though. I will open a couple of PRs to address this.

First I will go ahead and fix the alternative site problem since that looked relatively easy.

AnthonyOfSeattle commented 2 years ago

Alright, this should do it. After running the initial score method (which is necessary to calculate all the initial information), all the internal PepScore containers can be accessed through the pep_scores attribute. You can iterate through these, decide which permutations of the modification of interest should be compared, and then pass the containers to the calculate_ambiguity function, which compares two permutations based on their site-determining peaks (i.e. the Ascore).
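The iteration pattern described above can be sketched as follows. This is an illustration only and does not call pyAscore: the container layout (dicts with hypothetical `label` and `score` keys) and the `toy_compare` function are stand-ins, since the exact structure of pep_scores and the signature of calculate_ambiguity are not shown in this thread. In real use, the comparison callable would presumably be calculate_ambiguity, assumed here to take two containers.

```python
def ambiguity_against_best(pep_scores, compare, score_key="score"):
    """Compare the best-scoring permutation against every alternative.

    pep_scores: list of dict-like containers (layout is an assumption).
    compare:    callable taking (reference, alternative) and returning a number.
    Returns a list of (alternative label, comparison value) pairs.
    """
    if not pep_scores:
        return []
    # Rank permutations from highest to lowest score.
    ranked = sorted(pep_scores, key=lambda p: p[score_key], reverse=True)
    best, others = ranked[0], ranked[1:]
    return [(other["label"], compare(best, other)) for other in others]


def toy_compare(ref, other):
    """Stand-in comparison: plain score difference, NOT the real Ascore."""
    return ref["score"] - other["score"]


# Mock containers with made-up labels and scores, for illustration only.
mock_pep_scores = [
    {"label": "site_A", "score": 27.5},
    {"label": "site_B", "score": 35.0},
    {"label": "site_C", "score": 12.0},
]

results = ambiguity_against_best(mock_pep_scores, toy_compare)
for label, value in results:
    print(label, value)
```

Passing the comparison in as a callable keeps the loop independent of the scoring details, so the same walk works whether alternatives are compared against the best permutation or filtered first to a subset of interest.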

Hope this helps!

RalfG commented 2 years ago

Thank you so much for implementing this! I'll try it out tomorrow.

RalfG commented 2 years ago

Hi @AnthonyOfSeattle, I could successfully use the methods you have implemented! I do have three more questions:

AnthonyOfSeattle commented 2 years ago

Glad to hear it!

I will answer in turn: