Closed jorvis closed 10 years ago
My suggestion is to fall back to 1.2.1, or give the option to do that, rather than automatically discarding the number. It is less granular, but still a valid EC number.
I looked into the code here, and the constructor for a bioannotation.ECAnnotation object already checks the format of the passed EC number:
re_pattern = re.compile('(((([0-9\-]+)\.[0-9\-]+)\.[0-9\-]+)\.[a-z0-9\-]+)')
In that regex I noticed that I was already specifically allowing alphabetic characters in the last position, but I didn't remember why. So I looked into the release notes at ExPASy and found this:
ENZYME now includes entries with preliminary EC numbers. Preliminary EC numbers include an 'n' as part of the fourth (serial) digit (e.g. EC 3.5.1.n3).
Therefore, the current implementation is correct, and no changes need to be made.
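For anyone who wants to check this themselves, here is a minimal sketch that runs the same pattern against a few EC strings. The pattern is written as a raw string to avoid invalid-escape warnings on newer Python versions. The use of `fullmatch` is an assumption, since the constructor's actual matching call is not shown in this thread, and `is_well_formed` is a hypothetical helper name, not part of the library.

```python
import re

# Same pattern as quoted above, written as a raw string.
EC_PATTERN = re.compile(r'(((([0-9\-]+)\.[0-9\-]+)\.[0-9\-]+)\.[a-z0-9\-]+)')


def is_well_formed(ec_number):
    """Hypothetical helper: True if the whole string matches the pattern."""
    return EC_PATTERN.fullmatch(ec_number) is not None


# Preliminary numbers such as 1.2.1.n2 pass; a three-field value does not.
for candidate in ("1.2.1.n2", "3.5.1.n3", "1.2.1.-", "1.2.1"):
    print(candidate, is_well_formed(candidate))

# The nested groups also expose the less specific prefixes of a match.
match = EC_PATTERN.fullmatch("1.2.1.n2")
print(match.group(2))  # the first three fields
```

Because the last field accepts lowercase letters, preliminary numbers pass the check, and the second capture group holds the three-field prefix if a coarser value is ever wanted.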
There are sources in public HMM and BLAST libraries which assert EC numbers that are malformed, such as "1.2.1.n2". Given how these are used, I think the proper thing to do is to warn when the user attempts to add a malformed EC number, but not throw an exception.
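A rough sketch of the warn-but-don't-raise behavior, using Python's standard `warnings` module. `ECAnnotationSketch` is a hypothetical stand-in for illustration only, not the real bioannotation.ECAnnotation class, and the format check is an assumption based on the four-field pattern quoted in this thread.

```python
import re
import warnings

# Assumed format check: four dot-separated fields, letters allowed in the last.
EC_PATTERN = re.compile(r'[0-9\-]+\.[0-9\-]+\.[0-9\-]+\.[a-z0-9\-]+')


class ECAnnotationSketch:
    """Hypothetical stand-in for an EC annotation object."""

    def __init__(self, number):
        if EC_PATTERN.fullmatch(number) is None:
            # Warn rather than raise, so bulk imports from noisy sources continue.
            warnings.warn(
                "EC number %r does not match the expected format" % number,
                UserWarning,
            )
        # The value is stored either way; the caller decides what to do with it.
        self.number = number


annotation = ECAnnotationSketch("1.2")  # emits a UserWarning, still constructs
print(annotation.number)
```

With this approach the caller can still promote the warning to an error with `warnings.simplefilter("error")` when strict checking is wanted. Note that under this pattern "1.2.1.n2" itself would not trigger the warning.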