wfondrie / depthcharge

A deep learning toolkit for mass spectrometry
https://wfondrie.github.io/depthcharge/
Apache License 2.0
59 stars 18 forks source link

Add option to dekonize peptide into AA list #22

Closed melihyilmaz closed 1 year ago

melihyilmaz commented 1 year ago

Added option for detokenize to return peptide sequence as a list of amino acids rather than only in the string format. This was needed in the context of Casanovo for cases where multiple N-terminal modifications are predicted back to back (e.g. -17.027+42.011) and the downstream regex logic fails to differentiate these PTMs. Refer to below discussion for more details: https://github.com/Noble-Lab/casanovo/pull/87#discussion_r1023220461

melihyilmaz commented 1 year ago

@wfondrie Should I add a unit test for detokenize or is failure due to something else?

wfondrie commented 1 year ago

The above previous errors were due to a failure to upload test coverage to CodeCov, which hopefully I've fixed now 🤞

Should I add a unit test for ...

The answer to this question is always yes 😉

codecov-commenter commented 1 year ago

Codecov Report

Merging #22 (16b9098) into main (23683c0) will not change coverage. The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main      #22   +/-   ##
=======================================
  Coverage   87.00%   87.00%           
=======================================
  Files          19       19           
  Lines         854      854           
=======================================
  Hits          743      743           
  Misses        111      111           
Impacted Files Coverage Δ
depthcharge/components/transformers.py 98.16% <100.00%> (ø)

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

wfondrie commented 1 year ago

Is there a good reason to every return a string? It seems like we should just make it always return a list. It seems easy enough for the user to join the string if they desire. 🤔

I'd ok with the breaking change for now, since I think Casanovo is currently the only dependent. What do you think @melihyilmaz and @bittremieux?

bittremieux commented 1 year ago

I concur.

melihyilmaz commented 1 year ago

Makes sense we can only return a list, I added it as an option earlier to avoid breaking things down.