wfondrie / mokapot

Fast and flexible semi-supervised learning for peptide detection in Python
https://mokapot.readthedocs.io
Apache License 2.0

More informative error when few proteins map #29

Closed jspaezp closed 3 years ago

jspaezp commented 3 years ago

Hello there,

First of all, nice work! I love mokapot so far.

I would like to know whether mokapot could provide a more informative error when peptides cannot be matched to proteins. Currently, the ValueError is raised instead of the logged message that shows the number and some of the identities of the affected peptides.

I think a good starting point would be either to log the warning before the ValueError is thrown, or to have the error message include the exact percentage of peptides mapped and some of their sequences.

https://github.com/wfondrie/mokapot/blob/1eef5f34165656e9a141adb0815111c47e9aee26/mokapot/picked_protein.py#L89-L102
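
To make the suggestion concrete, here is a minimal sketch of the idea. It is not mokapot's actual code: the function `check_peptide_mapping`, its parameters, and the dictionary-based peptide-to-protein mapping are all hypothetical, chosen only for illustration. It logs the details first, then raises a ValueError that carries the same details if too few peptides were matched.

```python
import logging

LOGGER = logging.getLogger(__name__)


def check_peptide_mapping(peptides, peptide_to_protein, min_fraction=0.9, n_examples=5):
    """Hypothetical helper (not part of mokapot) illustrating the proposal.

    `peptides` is an iterable of peptide sequences and `peptide_to_protein`
    is a dict mapping peptide sequences to proteins; both are assumptions.
    Returns the fraction of peptides that were matched.
    """
    peptides = list(peptides)
    n_total = len(peptides)
    if n_total == 0:
        raise ValueError("No peptides were provided.")

    unmatched = [pep for pep in peptides if pep not in peptide_to_protein]
    n_matched = n_total - len(unmatched)
    fraction = n_matched / n_total
    if not unmatched:
        return fraction

    # Build the details once so the log and the exception agree.
    examples = ", ".join(unmatched[:n_examples])
    detail = (
        f"{fraction:.1%} of peptides ({n_matched} of {n_total}) were matched "
        f"to proteins. Examples of unmatched peptides: {examples}"
    )

    # Log before raising, so the details appear even if the error is caught.
    LOGGER.warning(detail)

    if fraction < min_fraction:
        raise ValueError(
            f"Fewer than {min_fraction:.0%} of all peptides could be matched "
            f"to proteins. {detail}"
        )
    return fraction
```

With this shape, the traceback itself would report the matched percentage and a few unmatched sequences, which should make it easier to spot a mismatch between the search results and the FASTA file.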

How it looks right now:

[INFO]  - Found 927355 PSMs from unique spectra.
[INFO]  - Found 47800 unique peptides.
Traceback (most recent call last):
...
...
...
ValueError: Fewer than 90% of all peptides could be matched to proteins. Verify that

Let me know what you think, or if you want me to write a PR.

Kindest wishes, Sebastian

wfondrie commented 3 years ago

Hi Sebastian,

I'm glad that you've been enjoying mokapot so far. I agree that adding more information to this error would be an excellent improvement. If you want to submit a PR, I'd happily review it when you're ready. Otherwise, I'll add it in the near future.

Best, Will