Arg0s1080 / mrz

Machine Readable Zone generator and checker for official travel documents sizes 1, 2, 3, MRVA and MRVB (Passports, Visas, national id cards and other travel documents)
GNU General Public License v3.0

Bug in parsing passport with only one name #1

Closed tahajahangir closed 5 years ago

tahajahangir commented 5 years ago

Some passports (TD3 MRZ) have only one name, like:

P<IND<<AHMADI<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
K2578285<7IND5601240F2202288<<<<<<<<<<<<<<<4

or

P<PAKZAHRA<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
DD51243520PAK8501019F27032135440067474356<98

TD3CodeChecker(value) for this input fails with this exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "lib/python3.7/site-packages/mrz/checker/td3.py", line 123, in __init__
    self.result = self._all_hashes() & self._all_fields()
  File "lib/python3.7/site-packages/mrz/checker/_fields.py", line 223, in _all_fields
    self.optional_data &
  File "lib/python3.7/site-packages/mrz/checker/_fields.py", line 92, in identifier
    if check.begin_by(primary, "<") or check.begin_by(secondary, "<"):
  File "lib/python3.7/site-packages/mrz/base/string_checkers.py", line 119, in begin_by
    if string[0] != char:
IndexError: string index out of range

Arg0s1080 commented 5 years ago

Yes, you're right. That should not happen.

As ICAO 9303-3 3.4 says: "The name of the holder is usually represented in two parts, the primary identifier and the secondary identifier."

So I think the right thing to do when only one identifier is present is to report it as a warning rather than fail.
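A minimal sketch of that idea, not the library's actual code: `split_identifiers` is a hypothetical helper that splits a TD3 name field (characters 6 to 44 of the first line) on the `<<` separator and collects a warning instead of raising when either identifier is empty. It never indexes into a possibly empty string, which is what triggered the `IndexError` above.

```python
FILLER = "<"


def split_identifiers(name_field):
    """Split an MRZ name field into (primary, secondary, warnings).

    Hypothetical helper for illustration; not part of the mrz package.
    """
    # The first "<<" separates the primary from the secondary identifier.
    primary, _, secondary = name_field.partition(FILLER * 2)

    # Drop padding fillers, then turn single fillers into spaces.
    primary = primary.strip(FILLER).replace(FILLER, " ")
    secondary = secondary.strip(FILLER).replace(FILLER, " ")

    notes = []
    if not primary or not secondary:
        notes.append("only one identifier found in name field")
    return primary, secondary, notes


# The two passports from the report, plus a two-part name for comparison.
one_name_a = "P<IND<<AHMADI<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<"
one_name_b = "P<PAKZAHRA<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<"
two_names = "P<UTOERIKSSON<<ANNA<MARIA<<<<<<<<<<<<<<<<<<<"

for line in (one_name_a, one_name_b, two_names):
    print(split_identifiers(line[5:44]))
```

Whether an empty primary identifier should be a warning or an error is a design choice; the sketch treats both cases the same way.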

Also, thanks to your report, I discovered that something similar happens in mrz.generator. When the primary identifier is not given, the output is incorrect. For example:

from mrz.generator.td3 import TD3CodeGenerator

print(TD3CodeGenerator("P",           # Document type
                       "Utopia",      # Country
                       "",            # Surname(s)
                       "Anna María",  # Given name(s)
                       "L898902C3",   # Passport number
                       "UTO",         # Nationality
                       "740812",      # Birth date
                       "F",           # Sex
                       "120415",      # Expiry date
                       "ZE184226B"))  # Id number

This produces the following incorrect output:

P<UTO<<ANNA<MARIA<<<<<<<<<<<<<<<<<<<<<<<<<<<
L898902C36UTO7408122F1204159ZE184226B<<<<<10

I'll work on a fix next week.

Thank you very much for your feedback

tahajahangir commented 5 years ago

Thanks. Also, the first example MRZ is an instance of another false positive: the optional data hash check fails when the optional data is empty (all <) and the optional data check digit is < (instead of 0). I think this may be an error according to the specs, but it exists in real-world documents.
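To make the second problem concrete, here is a sketch of the MRZ check digit computation (weights 7, 3, 1 repeated, sum modulo 10) together with a lenient comparison. The `lenient` flag and both function names are assumptions for illustration, not the mrz package's API.

```python
def check_digit(field):
    """Compute the MRZ check digit: weights 7, 3, 1 repeated, modulo 10."""
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if ch.isdigit():
            value = int(ch)
        elif ch == "<":
            value = 0                       # filler counts as zero
        else:
            value = ord(ch) - ord("A") + 10  # A=10 ... Z=35 (uppercase assumed)
        total += value * weights[i % 3]
    return total % 10


def optional_data_check_ok(optional_data, digit_char, lenient=True):
    """Compare the printed check character with the computed digit.

    With lenient=True (a hypothetical option), a "<" check character is
    accepted when the optional data consists only of fillers.
    """
    if lenient and digit_char == "<" and set(optional_data) <= {"<"}:
        return True
    return digit_char == str(check_digit(optional_data))


# Second line of the first passport in this report.
line2 = "K2578285<7IND5601240F2202288<<<<<<<<<<<<<<<4"
optional_data, printed = line2[28:42], line2[42]

print(check_digit(optional_data))                                    # 0
print(optional_data_check_ok(optional_data, printed, lenient=False)) # False
print(optional_data_check_ok(optional_data, printed))                # True
```

A strict comparison rejects this line because the computed digit is 0 while the printed character is `<`; the lenient variant accepts it.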

Arg0s1080 commented 5 years ago

@tahajahangir, hi again!

The _fields code was very simple at first; then, little by little, I added more checks. That's why it had several errors. I have rewritten it almost from scratch. I've tested it again and I think it works fine now, but if you find something wrong, I would appreciate it if you let me know.

Thank you so much!

PS: I have created a new issue for the second problem you reported.