Open fburic opened 4 years ago
If you provide a PR, we are happy to merge it. Bonus points if you have a working test case :)
Sorry to chip in. Originally, we designed the Prosit pipeline so that it never changes any user input. We deliberately didn't skip sequences, because the input and output files would then have different lengths. I believe that providing malformed input data should throw an error and not a warning. But maybe this premise has changed in the Prosit pipeline.
@tkschmidt @LLautenbacher @fburic
I see, that makes sense. If the premise hasn't changed, then I'd only suggest making the error more explicit. In my case, it took me quite some time to figure out what was wrong (and I'm familiar with Python). Then I just decided it was more efficient (and acceptable) for me to fix it like above :)
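To illustrate what a more explicit error could look like, here is a minimal sketch. It is not Prosit's code: `validate_sequence`, the `row` argument, and the `SUPPORTED` alphabet (assumed here to be the 20 standard amino acids, without modifications) are hypothetical names chosen for the example.

```python
# Hypothetical alphabet: the 20 standard amino acids.
# Prosit's real alphabet is defined in its own code and may differ.
SUPPORTED = set("ACDEFGHIKLMNPQRSTVWY")


def validate_sequence(seq, row=None, supported=SUPPORTED):
    """Return seq unchanged, or raise a descriptive ValueError."""
    unsupported = sorted(set(seq) - supported)
    if unsupported:
        where = f" (input row {row})" if row is not None else ""
        raise ValueError(
            f"Sequence {seq!r}{where} contains unsupported amino acids: "
            f"{', '.join(unsupported)}"
        )
    return seq
```

This keeps the "never change user input" premise: nothing is dropped, but the message names the offending sequence and residues instead of surfacing a bare KeyError.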
From a usability perspective, I would suggest that sequences with unsupported amino acids be skipped and warnings be issued. The current behavior simply has the program crash with a KeyError. The note on the main page about these amino acids is clear but having the user sanitize their input seems like much more effort than handling this through Prosit.
Here is my quick and dirty patch (just to get the tool running for me). Probably not the best way to handle this, but it does rely on `utils.peptide_parser()` to avoid duplicating logic.

PS Good work on developing Prosit! :smile:
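The patch itself is not reproduced in this thread. As a rough sketch of the skip-and-warn idea only, and not the actual patch, something like the following could work. The function name `filter_supported` and the `SUPPORTED` alphabet (assumed to be the 20 standard amino acids) are hypothetical, and the sketch checks single characters rather than calling `utils.peptide_parser()`.

```python
import warnings

# Hypothetical alphabet: the 20 standard amino acids.
# Prosit's real alphabet is defined in its own code and may differ.
SUPPORTED = set("ACDEFGHIKLMNPQRSTVWY")


def filter_supported(sequences, supported=SUPPORTED):
    """Keep sequences made only of supported residues; warn about the rest."""
    kept = []
    for seq in sequences:
        unsupported = sorted(set(seq) - supported)
        if unsupported:
            warnings.warn(
                f"Skipping {seq!r}: unsupported amino acids {unsupported}"
            )
            continue
        kept.append(seq)
    return kept
```

Note that the output is shorter than the input whenever a sequence is skipped, so downstream code cannot assume the two line up row by row.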