topdownproteomics / sdk

Software solution for common top-down proteomics tasks
http://www.topdownproteomics.org/
MIT License
9 stars 4 forks source link

Handle 4 best practice proforma sequences #33

Closed rfellers closed 6 years ago

rfellers commented 6 years ago

Adds some unit tests and relaxes the requirements for tags (from enum to string). I'm wanting feedback here: my plan is to relax the key requirements in this step and handle the higher level logic during validation and conversion. For example, a tag of [fake:123] would pass this stage and then throw an exception during a conversion to a proteoform. There was also the practical issue of PSI-MOD not working as an enum value. There is also a unit test called HandleExtraTagSpaces() that explicitly tests for extra space handling; I'm curious to see what people think. This PR addresses #22 and #27

acesnik commented 6 years ago

I like how you're relaxing the parsing, waiting until later for stricter validation. The space handling looks good (ditching spaces before the key, keeping all the rest). I might recommend storing the whitespace following the value in a separate property, so that it can be reproduced, but so that the user doesn't need to stick .TrimRight() before using the value string every time.

I'll take a look at those best practice tests tomorrow afternoon.