arthurdejong / python-stdnum

A Python library to provide functions to handle, parse and validate standard numbers.
https://arthurdejong.org/python-stdnum/
GNU Lesser General Public License v2.1

Handelsregister number length #143

Closed kkaiser closed 5 years ago

kkaiser commented 5 years ago

The minimum length for German trade register numbers should be 1, so the regex should look like:

_number_re = r'(?P<nr>[1-9][0-9]{0,5})(\s*(?P<x>[A-Z]{2}))?'

Furthermore, you can assume that the first digit can't be 0.

I would also restrict the suffix to exactly two letters. I looked through all register numbers and found only suffixes of length 2. You can access all the data here: https://offeneregister.de/
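To make the effect of the proposed pattern concrete, here is a minimal sketch that applies it with `re.fullmatch`. The helper name `parse_number` is hypothetical and is not part of python-stdnum; the pattern is copied from above and covers only the number and suffix part of a register number.

```python
import re

# Pattern exactly as proposed in this issue (number plus optional suffix).
_number_re = r'(?P<nr>[1-9][0-9]{0,5})(\s*(?P<x>[A-Z]{2}))?'


def parse_number(value):
    """Hypothetical helper: return (nr, suffix) or None if there is no match."""
    match = re.fullmatch(_number_re, value)
    if match is None:
        return None
    return match.group('nr'), match.group('x')


print(parse_number('7'))         # single digit is accepted
print(parse_number('12345 FL'))  # number with a two-letter suffix
print(parse_number('012'))       # leading zero is rejected
print(parse_number('123 B'))     # one-letter suffix is rejected
```

With this pattern a one-digit number is accepted, while a leading zero, more than six digits, or a suffix that is not exactly two letters from A to Z is rejected.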

Here is a count of the suffixes:

Counter({'FL': 25271,
         'HU': 4880,
         'SL': 3288,
         'NI': 4299,
         'PI': 22153,
         'EL': 4753,
         'ME': 5285,
         'IZ': 5327,
         'NM': 4887,
         'KI': 36513,
         'NO': 6705,
         'PL': 3391,
         'RD': 5495,
         'HL': 31010,
         'OL': 3159,
         'OD': 2096,
         'RE': 4481,
         'GE': 1392,
         'BS': 2929,
         'RZ': 1160,
         'SB': 2937,
         'AH': 5515,
         'KA': 1425,
         'SE': 3438,
         'EC': 2720,
         'BB': 2227,
         'EU': 2138,
         'MÖ': 1219})
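As a rough sketch of how such a count can be checked against the proposed suffix rule, the snippet below tallies suffix lengths and tests each suffix against `[A-Z]{2}`. It uses only four of the entries listed above as sample data; the variable names are illustrative assumptions.

```python
import re
from collections import Counter

# A small subset of the counts listed above, used as sample data.
suffix_counts = Counter({'FL': 25271, 'KI': 36513, 'HL': 31010, 'MÖ': 1219})

# Tally how many register numbers have a suffix of each length.
length_counts = Counter()
for suffix, count in suffix_counts.items():
    length_counts[len(suffix)] += count

# Suffixes that the character class [A-Z]{2} would not accept.
rejected = [s for s in suffix_counts if not re.fullmatch(r'[A-Z]{2}', s)]

print(dict(length_counts))
print(rejected)
```

All sampled suffixes have length 2, but `MÖ` falls outside `[A-Z]`, so a pattern limited to that character class would not match it.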

Here are some examples of numbers with length 1:

Let me know if you need additional resources.

arthurdejong commented 5 years ago

There appear to be some numbers going round with just a one-letter suffix, e.g. https://acework.io/imprint/ which is also listed in OffeneRegister.de.
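For illustration only, a pattern that also tolerates one-letter suffixes could look like the sketch below. This is an assumption about one possible relaxation, not the change made in the library; `relaxed_re` and `split_number` are hypothetical names.

```python
import re

# Hypothetical relaxed pattern: the suffix may be one or two uppercase letters.
relaxed_re = re.compile(r'(?P<nr>[1-9][0-9]{0,5})(\s*(?P<x>[A-Z]{1,2}))?')


def split_number(value):
    """Hypothetical helper: return (nr, suffix) or None if there is no match."""
    match = relaxed_re.fullmatch(value)
    return (match.group('nr'), match.group('x')) if match else None


print(split_number('123 B'))    # one-letter suffix is accepted
print(split_number('123 FL'))   # two-letter suffix is accepted
print(split_number('123 ABC'))  # three letters are still rejected
```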

Anyway, I included some fixes in db89d38. Thanks for reporting this.