Arg0s1080 / mrz

Machine Readable Zone generator and checker for official travel documents sizes 1, 2, 3, MRVA and MRVB (Passports, Visas, national id cards and other travel documents)
GNU General Public License v3.0

Two letter country code #32

Open GrofGraf opened 3 years ago

GrofGraf commented 3 years ago

I believe there is an error in TD1CodeChecker, at least for some IDs.

Two-letter country and nationality codes do not pass the check, while three-letter country codes do.

It should also be possible to check two-letter country codes, because some IDs have a two-letter country and nationality code.

Arg0s1080 commented 3 years ago

Hi!

I'm not sure I understand you.

If I remember correctly, all country codes have 3 letters, with two exceptions: United Kingdom "GB" and Germany "D", and both work correctly.

For example:

from mrz.checker.td1 import TD1CodeChecker

print(TD1CodeChecker("ID<GB0000000000000000000<<<<<<\n"
                     "8001014F2501017<GB<<<<<<<<<<<4\n"
                     "SAMPLE<SAMPLE<<SAMPLE<SAMPLE<<"))

print(TD1CodeChecker("ID<<D0000000000000000000<<<<<<\n"
                     "8001014F2501017<<D<<<<<<<<<<<4\n"
                     "SAMPLE<SAMPLE<<SAMPLE<SAMPLE<<"))

Output

True
True

Can you give an example?

Thanks

GrofGraf commented 3 years ago

Thank you for the quick response. I get an error with a Slovenian ID, which has the two-letter country code SI. Slovenian passports work as expected, as they have the three-letter country code SVN.

Slovenian ID

Arg0s1080 commented 3 years ago

Sorry, I read your message days ago but I forgot to reply to you.

I was planning to give you the standard answer: "The ICAO specifications say that all countries use 3-letter codes, so this is a special case and outside the scope of the project", and then quote the specs.

But reviewing the specs, I read:

The following are the two- and three-letter codes for entities specified and regularly updated in [ISO 3166-1], with extensions for certain States and organizations being identified by an asterisk. The current version of the codes may be obtained from the [ISO 3166] maintenance agency - [ISO 3166/MA], ISO’s focal point for country codes.

Specs are specs, so it's within the scope of this project.

The good news is that I had already planned for something similar to your request:

def country(string, dictionary=countries.english):
    if check_string(string) and string.upper() in dictionary.values():
        return string.upper().ljust(3, "<")
    elif full_capitalize(string) in dictionary.keys():
        return dictionary[full_capitalize(string)].ljust(3, "<")
    else:
        raise CountryError(cause=string)

(See dictionary kwarg)
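
To illustrate how the dictionary kwarg could let a caller accept a two-letter code such as SI, here is a minimal, self-contained sketch. The tiny `COUNTRIES` mapping and the simplified stand-ins for `check_string`, `full_capitalize` and `CountryError` are assumptions made for illustration, not the library's actual implementations.

```python
class CountryError(ValueError):
    """Hypothetical stand-in for the library's CountryError."""


# Hypothetical, tiny stand-in for the countries.english mapping (name -> code).
COUNTRIES = {"Slovenia": "SVN", "Germany": "D", "Spain": "ESP"}


def country(string, dictionary=COUNTRIES):
    """Return a 3-character MRZ country field, right-padded with '<'."""
    code = string.upper()
    # First try the input as a code (simplified check: letters only).
    if code.isalpha() and code in dictionary.values():
        return code.ljust(3, "<")
    # Otherwise try it as a country name (simplified capitalisation).
    name = string.title()
    if name in dictionary:
        return dictionary[name].ljust(3, "<")
    raise CountryError(string)


# A caller-supplied dictionary can carry extra codes without changing the default.
extended = dict(COUNTRIES, Slovenia2="SI")
print(country("SI", extended))
print(country("Germany"))
```

With the default mapping, `country("SI")` raises `CountryError`, which mirrors the behaviour reported in this issue.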

The bad news is that I have very little free time right now. However, all of this is pending.

As this requires a lot of changes, I promise to provide a temporary special case in a few days (a special case for Slovenian TD1s).

BR

GrofGraf commented 3 years ago

Thank you for the response. Take your time; I will make a workaround based on your response in the meantime.

Thanks again for supporting this project and keep up the good work.

Best regards.

mjl commented 3 years ago

I just stumbled over the same issue (on the checker side).

The quick and dirty fix is to add

"Slovenia2": "SI",

to base/countries.py.

However, this might be a bit of a moving target, so I'd suggest ultimately not flagging unrecognised countries as errors at all and just passing them through. Or perhaps add a verify_country=True argument to the checker, so that the country check can be disabled if you don't really need it?
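
A rough sketch of what such a switch might look like, written as a standalone function rather than as a change to the real checker. The function name `check_country`, the `KNOWN_CODES` set and the exact semantics of `verify_country` are all assumptions for illustration; nothing here is part of the mrz API.

```python
import re

# Hypothetical subset of recognised codes; the real list lives in the library.
KNOWN_CODES = {"SVN", "ESP", "GBR", "D"}


def check_country(field, verify_country=True):
    """Check a 3-character MRZ country field.

    The field must always be well formed (1-3 letters, right-padded with '<').
    The lookup against KNOWN_CODES is skipped when verify_country is False.
    """
    if len(field) != 3 or not re.fullmatch(r"[A-Z]{1,3}<{0,2}", field):
        return False
    if not verify_country:
        return True
    return field.rstrip("<") in KNOWN_CODES


print(check_country("SI<"))                        # unknown code is rejected
print(check_country("SI<", verify_country=False))  # format-only check passes
```

Keeping the format check unconditional means that a malformed field such as "<SI" is still rejected even when the country lookup is turned off.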

mjl commented 3 years ago

The even easier quick hack is to add

# Patch up MRZ module for special cases
from mrz.base.countries import english

english["Slovenia2"] = "SI"

to your application; that way one doesn't have to touch the original mrz distribution files.
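
If more than one alias is needed, the same patch could be wrapped in a small helper so that it is applied once and never overwrites an existing entry. This is only a sketch: `add_country_aliases` is a hypothetical name, and a plain dict stands in for `mrz.base.countries.english` so that the example runs without the package installed.

```python
def add_country_aliases(dictionary, aliases):
    """Add name -> code aliases to a countries mapping, in place.

    Names that already exist are left untouched, so calling this twice is safe.
    Returns the list of names that were actually added.
    """
    added = []
    for name, code in aliases.items():
        if name not in dictionary:
            dictionary[name] = code.upper()
            added.append(name)
    return added


# Stand-in for mrz.base.countries.english (assumption: a name -> code dict).
english = {"Slovenia": "SVN"}
print(add_country_aliases(english, {"Slovenia2": "SI"}))
print(add_country_aliases(english, {"Slovenia2": "SI"}))
```

In an application, the imported `english` mapping would be passed in place of the stand-in dict.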