arthurdejong / python-stdnum

A Python library to provide functions to handle, parse and validate standard numbers.
https://arthurdejong.org/python-stdnum/
GNU Lesser General Public License v2.1

Unverified Validation Rules for German IDNR #286

myleskilos closed this issue 2 years ago

myleskilos commented 2 years ago

Hello! Thank you for such a complete package of number format validators! In reviewing the code for German IDNRs, I was looking at the validate function and noticed that lines 71-78 contain logic requiring exactly one digit to occur 2 or 3 times in the IDNR.

I checked OECD documentation and the links in the comments at the top of the file on lines 31 and 32, but could not find any additional supporting documentation that outlines this as a formatting rule.

https://github.com/arthurdejong/python-stdnum/blob/dcf47300aa522ad4a76c9062e10f7ef4e627e204/stdnum/de/idnr.py#L71-L78

arthurdejong commented 2 years ago

Thanks for pointing this out. The implementation was based on https://github.com/holvi/python-stdnum/pull/12 and the current Wikipedia page still seems to list those requirements (though my German is pretty bad, so I might be misunderstanding something).

That Wikipedia page leads to "Informationen zur Berechnung gültiger Prüfziffern" (information on calculating valid check digits), which contains (in section 1, point 5):

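Unter den Positionen 1 bis 10 müssen entweder genau zwei Ziffern oder genau drei Ziffern gleich sein. (In English: "Among positions 1 to 10, either exactly two digits or exactly three digits must be the same.")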

Which is, I think, exactly what was implemented.
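To make the reading of point 5 concrete, here is a minimal sketch of such a check. It is not the library's code; `has_one_repeated_digit` is a hypothetical helper, and it assumes the rule means that exactly one digit occurs two or three times in the first 10 positions while every other digit occurs at most once.

```python
from collections import Counter


def has_one_repeated_digit(number: str) -> bool:
    """Hypothetical helper: check point 5 on positions 1 to 10.

    Assumption: exactly one digit occurs two or three times and
    all other digits occur at most once.
    """
    # Count how often each digit occurs in the first 10 positions.
    counts = Counter(number[:10])
    # Keep only the counts of digits that occur more than once.
    repeated = [c for c in counts.values() if c > 1]
    return len(repeated) == 1 and repeated[0] in (2, 3)
```

Under this reading, a string whose first 10 digits are all different fails the check, as does one in which two different digits each appear twice.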

Interestingly there is also (point 6):

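Existieren drei gleiche Ziffern an den Positionen 1 bis 10, dürfen diese gleichen Ziffern niemals an direkt aufeinander folgenden Stellen stehen. (In English: "If three identical digits exist in positions 1 to 10, these identical digits must never be in directly consecutive positions.")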

which has not been implemented.
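For illustration only, a check for point 6 could look roughly like the sketch below. `has_three_in_a_row` is a hypothetical name, and the sketch assumes the rule only forbids all three identical digits standing directly next to each other; a stricter reading, where no two of the three may be adjacent, would need a different test.

```python
def has_three_in_a_row(number: str) -> bool:
    """Hypothetical helper: detect a violation of point 6.

    Assumption: the rule is violated only when the same digit
    fills three directly consecutive positions within 1 to 10.
    """
    digits = number[:10]
    # Slide a window of three positions over the first 10 digits.
    return any(
        digits[i] == digits[i + 1] == digits[i + 2]
        for i in range(len(digits) - 2)
    )
```

A validator following this reading would reject a number when this function returns True.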

Note, however, that the document appears to be a draft version, and I couldn't quickly find a final version.

If you can provide valid numbers that are incorrectly marked as invalid, I'd be happy to fix the implementation.

myleskilos commented 2 years ago

Thanks for the additional research / checking. Looks like it's implemented correctly, other than point 6, which you mentioned.

Will close this out.