Open Hc747 opened 3 years ago
Please let me know if you're okay with merging this @Arg0s1080. Hope all is well on your end!
Hi!
Hello
I promise to try to review it this weekend
Thank you - no rush! Let me know if there's anything you'd like changed!
Bump @Arg0s1080 :p
Hi again:
Sorry, I should have said "next weekend".
I don't know if I understand your problem well, but...
The approach you propose is not valid.
ICAO specs say:
9303-3
4.6 Convention for Writing the Name of the Holder
[...]
The primary identifier, using the Latin character transliteration (if applicable), shall be written in the MRZ as specified in the form factor specific Parts 4 to 7 of Doc 9303. The primary identifier shall be followed by two filler characters (<<). The secondary identifier, using the Latin character transliteration (if applicable), shall be written starting in the character position immediately following the two filler characters.
If the primary or secondary identifiers have more than one name component, each component shall be separated by a single filler character (<).
Filler characters (<) should be inserted immediately following the final secondary identifier (or following the primary identifier in the case of a name having only a primary identifier) through to the last character position in the machine readable line.
So, following your sample, its structure should be:
P<XXXAA<<BBBBBB<CCCCCC<DD<<<<<<<<<<<<<<<<<<<
instead of:
P<XXXAA<<BBBBBB<<CCCCC<DD<<<<<<<<<<<<<<<<<<<
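As a rough illustration of the convention quoted above, here is a minimal sketch of how the name field could be assembled. The helper name `build_td3_name_field` is hypothetical and is not part of the mrz library. It assumes the names are already transliterated to Latin characters, and it simply cuts off over-long names, which is cruder than the truncation rules in Doc 9303.

```python
def build_td3_name_field(primary, secondary, length=39):
    """Hypothetical helper: build the 39-character TD3 name field.

    Layout: primary identifier, two fillers (<<), secondary identifier,
    then fillers (<) through to the last character position.
    """
    def join_components(name):
        # Components of one identifier are separated by a single filler.
        return "<".join(name.upper().split())

    field = join_components(primary)
    if secondary.strip():
        field += "<<" + join_components(secondary)
    # Pad with fillers; naive cut-off if the name is too long (assumption).
    return field.ljust(length, "<")[:length]


# Document code "P<" + issuing state "XXX" + name field = 44 characters.
line1 = "P<" + "XXX" + build_td3_name_field("AA", "BBBBBB CCCCCC DD")
print(line1)
```

With only a primary identifier, the same helper pads straight after it, with no `<<` separator.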
For example:
More than 2 identifiers:
Primary: AA
Secondary: BBBBBB
Tertiary: CCCCC DD
#!/usr/bin/python3
# -*- coding: UTF-8 -*-
from mrz.checker.td3 import TD3CodeChecker

# First line uses "<<" twice, so the checker sees more than two identifiers.
check = TD3CodeChecker("P<XXXAA<<BBBBBB<<CCCCC<DD<<<<<<<<<<<<<<<<<<<\n"
                       "ZE000509<9XXX8501019F2301147<<<<<<<<<<<<<<08")
print("Result:")
print(bool(check))
print()
print("Detected errors:")
errors = check.report.errors
if len(errors) > 0:
    print(check.report.errors)
else:
    print("None")
Output:
Result:
False
Detected errors:
['more than two identifiers', 'false identifier']
If we repair the full name using only 2 identifiers:
Primary: AA
Secondary: BBBBBB CCCCCC DD
from mrz.checker.td3 import TD3CodeChecker

# First line now has a single "<<" between primary and secondary identifiers.
check = TD3CodeChecker("P<XXXAA<<BBBBBB<CCCCCC<DD<<<<<<<<<<<<<<<<<<<\n"
                       "ZE000509<9XXX8501019F2301147<<<<<<<<<<<<<<08")
print("Result:")
print(bool(check))
print()
print("Detected errors:")
errors = check.report.errors
if len(errors) > 0:
    print(check.report.errors)
else:
    print("None")
Output:
Result:
True
Detected errors:
None
Sorry for the delay and BR
PS: If I have misunderstood something, tell me.
@Arg0s1080 I don't believe you've misunderstood anything! :) Strange however, because I've received an official passport document that does not adhere to this standard and therefore cannot be parsed by this library. The document was in the format specified in the original post of this PR.
It's pretty weird. ICAO specs are quite flexible and leave many things at the discretion of the issuing State, but others are very strict. It's also not rare to find organizations that do not meet the specs.
Out of curiosity, may I know what country it is?
You can modify the code however you want, but the correct approach would be to add a "special case" by creating a class that overrides "the official ones" (India had a similar problem... I think I remember that there were identifiers that started with << or something like that).
Due to a design problem, the class name must have this format: TD1, TD2, TD3 or Passport, then your own name, then CodeGenerator (or CodeChecker, as in the second example).
For example: TD1MyNewClassCodeGenerator or PassportOtherClassNameCodeChecker
Thanks very much for the feedback and solution; will go with that approach. The document was an Indonesian passport document.
Addresses use case where the first line of a valid TD3 MRZ is structured as so:
P<XXXAA<<BBBBBB<<CCCCC<DD<<<<<<<<<<<<<<<<<<<
Whereby the 'A' component is the primary identifier (surname) and the 'B', 'C' and 'D' components are the secondary identifier (name).
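To make the intended reading of that layout concrete, here is a minimal sketch that treats everything after the first `<<` as the secondary identifier and then rebuilds a line in the layout Doc 9303 describes. The function names `split_name_field` and `normalize_line1` are hypothetical. This is a pre-processing workaround written for illustration, not the code in this PR and not the "special case" class approach suggested in the discussion, and it does not touch the mrz library.

```python
import re


def split_name_field(name_field):
    """Hypothetical helper: the first '<<' ends the primary identifier."""
    stripped = name_field.rstrip("<")  # drop trailing fillers
    primary, _, rest = stripped.partition("<<")
    # Any run of fillers in the remainder counts as one separator.
    secondary = [part for part in re.split(r"<+", rest) if part]
    return [part for part in primary.split("<") if part], secondary


def normalize_line1(line1):
    """Rebuild line 1 with single fillers inside the secondary identifier."""
    prefix, name_field = line1[:5], line1[5:]  # e.g. "P<XXX" + name field
    primary, secondary = split_name_field(name_field)
    rebuilt = "<".join(primary)
    if secondary:
        rebuilt += "<<" + "<".join(secondary)
    # Pad back to the original field width with fillers.
    return prefix + rebuilt.ljust(len(name_field), "<")


# The first line from this PR description, written so its length is 44.
line = "P<XXXAA<<BBBBBB<<CCCCC<DD" + "<" * 19
primary, secondary = split_name_field(line[5:])
normalized = normalize_line1(line)
print(primary, secondary)
print(normalized)
```

The normalized line keeps its original length, so it could be passed to a checker together with the unchanged second line. Whether silently rewriting a nonconforming document is acceptable depends on the use case.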