arthurdejong / python-stdnum

A Python library to provide functions to handle, parse and validate standard numbers.
https://arthurdejong.org/python-stdnum/
GNU Lesser General Public License v2.1

Add European Community (EC) Number #422

Closed: weberdak closed this 7 months ago

weberdak commented 7 months ago

Added support for European Community (EC) Numbers used to identify chemical substances in some EU regulatory lists. This number uses a checksum validation method very similar to the CAS Registry Number, hence the code follows the same format and structure as the stdnum.casrn module.
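For readers unfamiliar with the checksum mentioned above, here is a minimal sketch. It assumes the commonly documented rule for EC numbers of the form NNN-NNN-R: the first six digits are weighted 1 to 6, summed, and reduced modulo 11. The function names are hypothetical and this is not the code from the PR.

```python
import re

# Layout NNN-NNN-R: two groups of three digits and one check digit.
_EC_RE = re.compile(r'([0-9]{3})-([0-9]{3})-([0-9])')


def ec_check_digit(digits):
    """Return the check digit for the first six digits of an EC number.

    Assumption: weights 1..6 applied left to right, sum taken modulo 11.
    """
    if not re.fullmatch(r'[0-9]{6}', digits):
        raise ValueError('expected exactly six digits')
    remainder = sum(
        weight * int(digit) for weight, digit in enumerate(digits, start=1)
    ) % 11
    if remainder == 10:
        # A remainder of 10 cannot be written as a single check digit.
        raise ValueError('no single-digit check digit for these digits')
    return str(remainder)


def ec_is_valid(number):
    """Return True if number is hyphenated as NNN-NNN-R with a matching check digit."""
    match = _EC_RE.fullmatch(number)
    if not match:
        return False
    first, second, check = match.groups()
    try:
        return ec_check_digit(first + second) == check
    except ValueError:
        return False
```

With this sketch, the example number used later in this thread, "200-001-8", passes: 2×1 + 1×6 = 8, and 8 mod 11 = 8.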

A complete test suite is included in the test folder and runs successfully with: doctest.testfile("test_eu_ecnumber.doctest", optionflags=doctest.IGNORE_EXCEPTION_DETAIL)
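As an illustration of what the IGNORE_EXCEPTION_DETAIL flag does in that call, the sketch below writes a small, made-up doctest to a temporary file and runs it with doctest.testfile(). The sample content and the run_sample() helper are hypothetical; the real tests are in the PR's doctest file.

```python
import doctest
import os
import tempfile

# Hypothetical doctest content: the expected exception message is
# deliberately different from the one Python actually produces.
SAMPLE = """
>>> int('200-001-8')
Traceback (most recent call last):
    ...
ValueError: the message here does not need to match
"""


def run_sample(optionflags):
    """Write SAMPLE to a temporary file and run it with doctest.testfile()."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'sample.doctest')
        with open(path, 'w') as handle:
            handle.write(SAMPLE)
        # module_relative=False lets us pass a plain filesystem path.
        return doctest.testfile(
            path, module_relative=False, optionflags=optionflags)
```

Calling run_sample(doctest.IGNORE_EXCEPTION_DETAIL) should report no failures, because only the exception type is compared. Calling run_sample(0) should report one failure, because the message text is then compared as well.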

arthurdejong commented 7 months ago

Hi @weberdak,

Thanks for providing this implementation. I have a question regarding the use of dashes in the number.

In python-stdnum there is generally a compact() function that would return a minimal representation of the number and a format() function that returns a format usually meant for presentation to users. I would expect the compact() function to strip the dashes and format() to introduce them.

Would it be incorrect to have the compact() and validate() functions return "2000018" instead of "200-001-8"?
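The split described above, where compact() strips the dashes and format() reintroduces them, could look roughly like the following. This is a sketch of the proposed convention only, with hypothetical function names (compact_ec, format_ec), and it performs no validation.

```python
def compact_ec(number):
    """Return the minimal representation: separators and spaces removed."""
    return number.replace('-', '').replace(' ', '')


def format_ec(number):
    """Return the presentation form NNN-NNN-R for a (possibly compact) number."""
    digits = compact_ec(number)
    # Regroup as three digits, three digits, one check digit.
    return '-'.join((digits[:3], digits[3:6], digits[6:]))
```

Under this convention compact_ec("200-001-8") gives "2000018" and format_ec("2000018") gives "200-001-8".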

weberdak commented 7 months ago

Hello @arthurdejong,

I'm not quite sure I understand correctly, but I believe it would be incorrect to have the validate() function remove hyphens. I have not seen these values used without hyphens, which I think is to prevent them from being mistaken for CAS RNs, e.g., XXX-YYY-C compared to XXXXXXX-YY-C.

I can see your point, however, with the compact() function. Logically, I would think a format() function would be the more appropriate place to add hyphens. My choice here was mostly to be consistent with the stdnum/casrn.py module. Note that this module also has the validate() function reintroduce hyphens, which is nice behavior in my experience.
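To make the ambiguity concrete, the sketch below reintroduces hyphens into an unhyphenated string using the two layouts mentioned in this comment: EC numbers group as three, three and one digits, while CAS RNs end in a two-digit group followed by a check digit. The helper names are hypothetical and no checksum is verified here.

```python
def hyphenate_ec(number):
    """Return the EC layout NNN-NNN-R, inserting hyphens if they are missing."""
    number = number.replace(' ', '')
    if '-' not in number:
        number = '-'.join((number[:3], number[3:6], number[6:]))
    return number


def hyphenate_cas(number):
    """Return the CAS RN layout N...N-NN-R, inserting hyphens if missing."""
    number = number.replace(' ', '')
    if '-' not in number:
        # Split from the right: check digit, two-digit group, remainder.
        number = '-'.join((number[:-3], number[-3:-1], number[-1:]))
    return number
```

The same seven digits "2000018" come out as "200-001-8" from hyphenate_ec() and as "2000-01-8" from hyphenate_cas(), which is why keeping the hyphens in the returned value removes any doubt about which scheme is meant.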

arthurdejong commented 7 months ago

Hi @weberdak,

I've merged your PR as 2535bbf. I've kept the behaviour where the compact() and validate() functions return the number including dashes.

Thanks!