arthurdejong / python-stdnum

A Python library to provide functions to handle, parse and validate standard numbers.
https://arthurdejong.org/python-stdnum/
GNU Lesser General Public License v2.1

Check for the Formatting of the Number as well #244

Closed Rajmehta123 closed 1 year ago

Rajmehta123 commented 3 years ago

This is the best library I have found for identifying candidate numbers. I have a question, which is also a feature request that could enhance the library.

Question: Can the module also support verifying the formatting in a strict manner?

Solution/Example: Suppose I am looking for a German Steuernummer, which has to follow the standard format. For example, the German Steuernummer is formatted as FFF/UUU/BBB P or FFFUUUBBBP, but it never contains other punctuation such as ., - or _.

For example, 918/082/00356 is a number for Bayern. It can be written as 918/082/00356 or 91808200356, but not as 918-082-00356. That is what I mean by strict formatting.

When I use the module as follows:
from stdnum import get_cc_module
mod = get_cc_module('de', 'stnr')
mod.validate('91-808200356')
it returns '91808200356', so the number is accepted as valid.

That means it strips all punctuation, and with it the number's formatting, before validating. Could the formatting be preserved, so that the number validates only if the formatting condition is met?

Thank you. Appreciate any efforts.

arthurdejong commented 3 years ago

Hi @Rajmehta123,

The German Steuernummer implementation currently strips all kinds of separators. It does seem that the / is the only one (apart from the space) that is actually used much. The other separators were added in the original pull request. They can probably be removed, but I don't know the exact background of why they were added.

The python-stdnum library also generally doesn't check the validity of the provided formatting, mostly because for most modules the positions of the separators are inconsistent at best.

Some modules also provide a format() function that can be used to insert separators in the right positions, e.g.

>>> from stdnum import get_cc_module
>>> mod = get_cc_module('de', 'stnr')
>>> mod.validate('91-808200356')
'91808200356'
>>> mod.format('91808200356')
'918/082/00356'

The idea is that the validate function returns a format that is appropriate for storing the number (e.g. in a database). The format() function is meant for presentation purposes.
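A caller who does need strict formatting could layer it on top of these two functions. The following is only a sketch and not part of python-stdnum: `validate_strict` is a hypothetical helper, and it assumes that the only acceptable spellings are the compact form returned by `validate()` and the presentation form returned by `format()`.

```python
def validate_strict(number, validate_func, format_func):
    """Validate a number and also require one of two exact spellings.

    Hypothetical helper, not part of python-stdnum. validate_func and
    format_func are expected to behave like a module's validate() and
    format(), e.g. those of get_cc_module('de', 'stnr').
    """
    # Let the lenient validator do the real checking first; it is
    # expected to raise on invalid input and return the compact form.
    compact = validate_func(number)
    # Assumption: only the compact form and the formatted form are
    # acceptable inputs. Anything else (e.g. dashes) is rejected.
    if number not in (compact, format_func(compact)):
        raise ValueError('number is valid but not in an accepted format')
    return compact
```

With the module from the example above this would be called as `validate_strict('918/082/00356', mod.validate, mod.format)`. An input such as '91-808200356' would then be rejected even though `mod.validate()` accepts it.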

arthurdejong commented 1 year ago

Closing this due to lack of activity. There are currently no plans to implement more strict format validation across the supported numbers (I really like Postel's Law).