arthurdejong / python-stdnum

A Python library to provide functions to handle, parse and validate standard numbers.
https://arthurdejong.org/python-stdnum/
GNU Lesser General Public License v2.1
495 stars 205 forks source link

Duplicated mod97 operations #289

Closed Majsvaffla closed 2 years ago

Majsvaffla commented 2 years ago

While using the iban module of python-stdnum to convert Swedish national bank account numbers into IBAN, I have found that the conversion is not always correct. Most numbers work fine but some of them has been reported invalid by a Swedish bank. Specifically, the iban module calculated the check digits "01" for a number that is supposed to have "98" as check digits. Both check digits are valid for this specific number according to the algorithm, However, as stated on Wikipedia, only check digits in range "02" to "98" are used.

The current implementation of the ISO 7064 Mod 97, 10 algorithm is incorrect as it performs the modulo operation twice (on different levels). Removing one of the modulo operations solely doesn't fix the problem.

The IBAN algorithm says to append two trailing zeroes before calculating the checksum. I get that the _mod_9710 module does that by multiplying by 100. However, I think it would be better if the iban module passed full (naive) numbers to the _mod_9710 module. The operation could be moved to the checksum function but that breaks the validate function instead.

The test suite in the project where I'm using the iban module passes with these changes. The conversion is tested mostly with real-world examples of Swedish national bank account numbers.

Majsvaffla commented 2 years ago

Sorry for the initially failing checks. I have now amended a fix into the last commit to pass all checks.

arthurdejong commented 2 years ago

Hi @Majsvaffla, thanks for your contribution and clear explanation.

I'd rather not change the behaviour of the calc_check_digit() functions. These functions are described as returning the digits that need to be appended to the number to build up a valid number. Changing this will break third-party code that relies on the current behaviour. For some formats they already support calculating the check digits regardless of whether they were already present or not but that is not possible for the generic (not format specific) algorithms.

Since ISO 7064 is not available publicly I don't know whether check digit values "98" are considered "better" than "01" but in the context of IBAN they most likely are (thanks for the Wikipedia reference).

In e2a2774 I've changed the Mod 97, 10 calc_check_digits() function to return check digits in the 02-98 range instead of the 00-97 range that was used before. If you have example Swedish bank account numbers that were converted incorrectly before I'd be happy to add them to the tests.

Btw, if anyone wants to provide me with an electronic version of ISO 7064 there might be other improvements that could be made. Small rant: While I see the value that standardisation organisations like ISO bring I find it ridiculous that you have to pay more than € 100 for such a document because it makes it much harder for everyone to implement the same standard.

Majsvaffla commented 2 years ago

I agree it seems fair to not change the return behaviour :) Great changes in e2a2774! I'm looking forward to the next release.

This is a made up Swedish bank account number in IBAN format that python-stdnum rendered with 01 as check digits: SE9880000821490000000009

It's not a real-world-example, however, but the first one (counting up from SEXX80000821490000000001 that python-stdnum rendered with check digits 01. It's the only one we kept in out test suite.