Closed Arg0s1080 closed 5 years ago
In this case mrz.checker works correctly. However, after looking at several Indian travel documents (passports, type A visas and type B visas), I have verified that what @tahajahangir describes can happen. For example:
Passports with errors in the identifier and the optional data hash:
However, on Indian visas the identifier is printed correctly: Visa MRVB, Visa MRVA (although it seems that a final hash is used in this case). (Images taken from Google and pixelated to protect privacy.)
Although it seems clear that this is a bad implementation of the ICAO specifications, I think it would be a good idea to add the possibility of disabling checks for certain fields in future mrz.checker versions. (India is not the only country that does not comply with the specifications.)
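A rough idea of what "disabling checks for certain fields" could look like, as a standalone sketch. Nothing here is part of mrz.checker: `run_checks` and its `skip` parameter are hypothetical names used only for illustration, and the lambdas stand in for the real field checks.

```python
from typing import Callable, Dict, Iterable


def run_checks(checks: Dict[str, Callable[[], bool]], skip: Iterable[str] = ()) -> bool:
    """Return True if every check that is not listed in `skip` passes."""
    skipped = set(skip)
    return all(check() for name, check in checks.items() if name not in skipped)


# Stand-ins for real field checks: the identifier check fails, the rest pass.
checks = {
    "identifier": lambda: False,
    "optional_data_hash": lambda: True,
    "birth_date_hash": lambda: True,
}

print(run_checks(checks))                       # False
print(run_checks(checks, skip=["identifier"]))  # True
```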
Meanwhile, the only solution I can think of for particular cases is for someone to build a class that inherits from TD1CodeChecker, TD2CodeChecker or TD3CodeChecker and overrides some properties. Recently someone asked me by email how to add a check for an additional hash, and it was easy to build a class called TD1DutchCodeChecker, inherited from TD1CodeChecker, which overrides the optional_data and optional_data_hash properties. Important note: the only requirement for a child class in MRZ is that its name must have this format: <DocumentType>*Code*. For example: TD1INDCodeChecker, TD1Type1CodeChecker, MRVAUKCodeGenerator, TD2BRACodeGenerator, TD3CodeCheckerBlahBlah or similar.
This is the class built to check Dutch ID cards:
from ..base.countries_ops import *
from ..base.functions import hash_is_ok
from .td1 import TD1CodeChecker
import mrz.base.string_checkers as check

__all__ = ["TD1DutchCodeChecker", "code_list", "countries_list", "countries_code_list", "code_country_list",
           "is_country", "is_code", "get_code", "get_country", "find_country"]


class TD1DutchCodeChecker(TD1CodeChecker):
    """
    Check the string code of the machine readable zone for dutch TD1

    __bool__() returns True if all fields are validated, False otherwise

    Params:
        mrz_string        (str):  MRZ string of td1s. Must be 90 uppercase characters long
        check_expiry      (bool): If it's set to True, it is verified and reported as warning that the
                                  document is not expired and that expiry_date is not greater than 10 years
        compute_warnings  (bool): If it's set True, warnings compute as False
    """
    def __init__(self, mrz_code: str, check_expiry=False, compute_warnings=False):
        TD1CodeChecker.__init__(self, mrz_code, check_expiry, compute_warnings)

    @property
    def optional_data(self) -> bool:
        """Return True if the format of the optional data field is validated, False otherwise."""
        s = self._optional_data
        return True if check.is_empty(s) else self._report("id number format", check.is_printable(s))

    @property
    def optional_data_hash(self):
        self._optional_data_hash = self.mrz_code.splitlines()[0][29]
        self._optional_data = self.mrz_code.splitlines()[0][15: 29]
        return self._report("id number hash", hash_is_ok(self._optional_data, self._optional_data_hash))

    def _all_hashes(self) -> bool:
        return (self.final_hash &
                self.document_number_hash &
                self.birth_date_hash &
                self.expiry_date_hash &
                self.optional_data_hash)
Usage:
from mrz.checker.td1_dutch import TD1DutchCodeChecker
mrz_code = ("I<NLDARRE84NB20123456789<<<<<7\n"  # '7' is an additional hash not included in TD1 ICAO specifications
            "9901236M3012235NLD<<<<<<<<<<<2\n"
            "SMITH<<JOHN<JOEY<<<<<<<<<<<<<<")
checker = TD1DutchCodeChecker(mrz_code)
print("Check: %s" % checker)
Output:
Check: True
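For reference, the "hash" on each field is the ICAO 9303 check digit: every character gets a value (digits keep their value, A to Z map to 10 to 35, '<' counts as 0), the values are multiplied by the repeating weights 7, 3, 1, and the sum is taken modulo 10. Below is a minimal standalone sketch of that calculation. `check_digit` is a name chosen here for illustration; mrz itself uses hash_is_ok for this, whose internals are not shown in this thread.

```python
def check_digit(field: str) -> str:
    """Compute the ICAO 9303 check digit of an MRZ field (weights 7, 3, 1)."""
    weights = (7, 3, 1)
    total = 0
    for position, char in enumerate(field):
        if "0" <= char <= "9":
            value = int(char)
        elif "A" <= char <= "Z":
            value = ord(char) - ord("A") + 10
        elif char == "<":
            value = 0
        else:
            raise ValueError("invalid MRZ character: %r" % char)
        total += value * weights[position % 3]
    return str(total % 10)


line1 = "I<NLDARRE84NB20123456789<<<<<7"
print(check_digit(line1[5:14]), line1[14])   # document number and its printed digit: 0 0
print(check_digit(line1[15:29]), line1[29])  # extra Dutch field and its printed digit: 7 7
```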
@tahajahangir this is a quick and simple solution for Indian passports:
import mrz.base.string_checkers as check
from mrz.checker.td3 import *
from mrz.checker._honorifics import titles
from mrz.base.functions import hash_is_ok


class TD3INDCodeChecker(TD3CodeChecker):
    @property
    def optional_data_hash(self) -> bool:
        """Return True if hash of optional data is True, False otherwise."""
        if check.is_empty(self._optional_data) and self._optional_data_hash == "<":
            ok = True
        else:
            ok = hash_is_ok(self._optional_data, self._optional_data_hash)
        return self._report("optional data hash", ok)

    @property
    def identifier(self) -> bool:
        """Return True is the identifier is validated overcoming the checks, False otherwise."""
        full_id = self._identifier.rstrip("<")
        padding = self._identifier[len(full_id):]
        id2iter = full_id.split("<<")
        id_len = len(id2iter)
        primary = secondary = None
        if not check.is_printable(self._identifier):
            ok = False
        elif check.is_empty(self._identifier):
            self._report("empty identifier", kind=2)
            ok = False
        elif check.uses_nums(full_id):
            self._report("identifier with numbers", kind=2)
            ok = False
        else:
            if full_id.startswith("<<"):
                id2iter = id2iter[1:]
                id_len = len(id2iter)
                if id_len == len([i for i in id2iter if i]):
                    if id_len == 2:
                        primary, secondary = id2iter
                        ok = True
                    elif id_len == 1:
                        primary, secondary = id2iter[0], ""
                        self._report("only one identifier", kind=1)
                        ok = not self._compute_warnings
                    else:
                        self._report("more than two identifiers", kind=2)
                        ok = False
                else:  # too many '<' in id
                    self._report("invalid identifier format", kind=2)
                    ok = False
            else:  # if the identifier MUST starts with "<<" it is reported as error and ok is set to False
                # IMPORTANT: I don't know real requirements
                self._report("identifier doesn't begin by '<<", kind=2)
                ok = False
        # print("Debug. id2iter ............:", id2iter)
        # print("Debug. (secondary, primary):", (secondary, primary))
        # print("Debug. padding ............:", padding)
        if ok:
            if primary.startswith("<") or secondary and secondary.startswith("<"):
                self._report("some identifier begin by '<'", kind=2)
                ok = False
            if not padding:
                self._report("possible truncating", kind=1)
                ok = False if self._compute_warnings else ok
            for i in range(id_len):
                for itm in id2iter[i].split("<"):
                    if itm:
                        for tit in titles:
                            if tit == itm:
                                if i:  # secondary id
                                    self._report("Possible unauthorized prefix or suffix in identifier", kind=1)
                                else:  # primary id
                                    self._report("Possible not recommended prefix or suffix in identifier", kind=1)
                                ok = False if self._compute_warnings else ok
        return self._report("identifier", ok)
Usage:
mrz_code = ("P<IND<<AHMADI<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<\n"
            "K2578285<7IND5601240F2202288<<<<<<<<<<<<<<<4")
passport_check = PassportINDCodeChecker(mrz_code)
print("CHECK...:%s" % passport_check)
print("WARNINGS:%s" % passport_check.report_warnings)
Output:
CHECK...:True
WARNINGS:['only one identifier']
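The relaxed rule in optional_data_hash can also be tried on its own. This standalone sketch uses only the standard library; `check_digit` and `optional_data_hash_ok` are names made up for the example (not mrz functions), and the check digit follows the usual ICAO 9303 weights (7, 3, 1).

```python
def check_digit(field: str) -> str:
    """ICAO 9303 check digit: 0-9 keep their value, A-Z are 10-35, '<' is 0."""
    values = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    total = sum((0 if c == "<" else values.index(c)) * (7, 3, 1)[i % 3]
                for i, c in enumerate(field))
    return str(total % 10)


def optional_data_hash_ok(optional_data: str, hash_char: str) -> bool:
    """Accept '<' as the check digit when the optional data is empty (all '<')."""
    if optional_data.strip("<") == "" and hash_char == "<":
        return True
    return check_digit(optional_data) == hash_char


line2 = "K2578285<7IND5601240F2202288<<<<<<<<<<<<<<<4"
print(optional_data_hash_ok(line2[28:42], line2[42]))  # True: empty data, '<' as digit
print(optional_data_hash_ok("<" * 14, "0"))            # True: what the spec expects
```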
Best regards
Hi, I was looking into something similar and faced the same issue with Indian passport MRZs. Could you let me know where PassportINDCodeChecker() is defined in the usage snippet above?
Hi @ShadabShariff
That snippet is just an example (it was written very quickly, so I'm sure it can be improved).
Just copy and paste the text into a file (e.g. td3_india.py), save it and use it.
Optionally, it can be installed and used with mrz. Just copy td3_india.py into the mrz/checker folder and run setup.py.
For example, on Linux it could be done like this:
git clone https://github.com/Arg0s1080/mrz.git
cp td3_india.py ~/mrz/mrz/checker/
cd mrz
sudo python3 setup.py
and then:
from mrz.checker.td3_india import TD3INDCodeChecker
mrz_code = ("P<IND<<AHMADI<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<\n"
            "K2578285<7IND5601240F2202288<<<<<<<<<<<<<<<4")
passport_check = TD3INDCodeChecker(mrz_code)
print("CHECK...:%s" % passport_check)
print("WARNINGS:%s" % passport_check.report_warnings)
Regards!
@ShadabShariff maybe this would be better:
import mrz.base.string_checkers as check
from mrz.checker.td3 import *
from mrz.checker._honorifics import titles
from mrz.base.functions import hash_is_ok
from string import ascii_uppercase


class PassportINDCodeChecker(TD3CodeChecker):
    @property
    def optional_data_hash(self) -> bool:
        """Return True if hash of optional data is True, False otherwise."""
        if check.is_empty(self._optional_data) and self._optional_data_hash == "<":
            ok = True
        else:
            ok = hash_is_ok(self._optional_data, self._optional_data_hash)
        return self._report("optional data hash", ok)

    @property
    def identifier(self) -> bool:
        """Return True is the identifier is validated overcoming the checks, False otherwise."""
        full_id = self._identifier.rstrip("<")
        padding = self._identifier[len(full_id):]
        id2iter = full_id.lstrip("<<").split("<<") if len(full_id) > 2 and full_id[2] in ascii_uppercase else full_id.split("<<")
        id_len = len(id2iter)
        primary = secondary = None
        if not check.is_printable(self._identifier):
            ok = False
        elif check.is_empty(self._identifier):
            self._report("empty identifier", kind=2)
            ok = False
        else:
            if id_len == len([i for i in id2iter if i]):
                if id_len == 2:
                    primary, secondary = id2iter
                    ok = True
                elif id_len == 1:
                    primary, secondary = id2iter[0], ""
                    self._report("only one identifier", kind=1)
                    ok = not self._compute_warnings
                else:
                    self._report("more than two identifiers", kind=2)
                    ok = False
            else:  # too many '<' in id
                self._report("invalid identifier format", kind=2)
                ok = False
        # print("Debug. id2iter ............:", id2iter)
        # print("Debug. (secondary, primary):", (secondary, primary))
        # print("Debug. padding ............:", padding)
        if ok:
            if not full_id.startswith("<<"):
                self._report("identifier doesn't starts with '<<'", kind=2)
                ok = False
                # If you want to report as warning instead of as error uncomment lines below
                # self._report("identifier doesn't starts with '<<'", kind=1)
                # ok = False if self._compute_warnings else ok
            if check.uses_nums(full_id):
                self._report("identifier with numbers", kind=2)
                ok = False
            if primary.startswith("<") or secondary and secondary.startswith("<"):
                self._report("some identifier begin by '<'", kind=2)
                ok = False
            if not padding:
                self._report("possible truncating", kind=1)
                ok = False if self._compute_warnings else ok
            for i in range(id_len):
                for itm in id2iter[i].split("<"):
                    if itm:
                        for tit in titles:
                            if tit == itm:
                                if i:  # secondary id
                                    self._report("Possible unauthorized prefix or suffix in identifier", kind=1)
                                else:  # primary id
                                    self._report("Possible not recommended prefix or suffix in identifier", kind=1)
                                ok = False if self._compute_warnings else ok
        return self._report("identifier", ok)
and then:
from mrz.checker.td3_india import PassportINDCodeChecker
mrz_code = ("P<IND<<AHMADI<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<\n"
            "K2578285<7IND5601240F2202288<<<<<<<<<<<<<<<4")
passport_check = PassportINDCodeChecker(mrz_code)
print("CHECK...:%s" % passport_check)
print("WARNINGS:%s" % passport_check.report_warnings)
P.S.: Read line 53
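The core of the identifier handling (ignore a leading "<<", then split primary and secondary identifiers on "<<") can be tried in isolation. This is a simplified standalone sketch, not the checker itself: `split_identifier` is a hypothetical helper, and it leaves out the warnings and the title checks.

```python
from typing import Tuple


def split_identifier(identifier: str) -> Tuple[str, str]:
    """Return (primary, secondary) from a TD3 name field, tolerating a leading '<<'."""
    full_id = identifier.rstrip("<")          # drop the right-hand padding
    if full_id.startswith("<<"):
        full_id = full_id[2:]                 # Indian passports may start with '<<'
    parts = full_id.split("<<")
    # Reject empty parts, more than two parts, or parts that begin with '<'
    if len(parts) > 2 or not all(parts) or any(p.startswith("<") for p in parts):
        raise ValueError("invalid identifier format: %r" % identifier)
    primary = parts[0]
    secondary = parts[1] if len(parts) == 2 else ""
    return primary, secondary


line1 = "P<IND<<AHMADI<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<"
print(split_identifier(line1[5:]))               # ('AHMADI', '')
print(split_identifier("SMITH<<JOHN<JOEY<<<<"))  # ('SMITH', 'JOHN<JOEY')
```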
Hi @Arg0s1080
Thanks for pointing out line 53. Indian passports can have the identifier starting with or without '<<'.
We went with uncommenting the code you pointed out. I just wanted to post your code with the lines we commented and uncommented, to check our understanding and to confirm that it is fine:
        if ok:
            if not full_id.startswith("<<"):
                # self._report("identifier doesn't starts with '<<'", kind=2)
                # ok = False
                # If you want to report as warning instead of as error uncomment lines below
                self._report("identifier doesn't starts with '<<'", kind=1)
                ok = False if self._compute_warnings else ok
            if check.uses_nums(full_id):
                self._report("identifier with numbers", kind=2)
                ok = False
            if primary.startswith("<") or secondary and secondary.startswith("<"):
                self._report("some identifier begin by '<'", kind=2)
                ok = False
            if not padding:
                self._report("possible truncating", kind=1)
                ok = False if self._compute_warnings else ok
            for i in range(id_len):
                for itm in id2iter[i].split("<"):
                    if itm:
                        for tit in titles:
                            if tit == itm:
                                if i:  # secondary id
                                    self._report("Possible unauthorized prefix or suffix in identifier", kind=1)
                                else:  # primary id
                                    self._report("Possible not recommended prefix or suffix in identifier", kind=1)
                                ok = False if self._compute_warnings else ok
        return self._report("identifier", ok)
Also, I just wanted to confirm: if the name is changed, is there in general no hash or checksum evaluation of the identifier for a given document?
Hi again @ShadabShariff
Yes, perfect. The two lines above must be commented out (I forgot to do it). If you don't want to report it either as an error (kind=2) or as a warning (kind=1), just delete the whole block:
            if not full_id.startswith("<<"):
                # self._report("identifier doesn't starts with '<<'", kind=2)
                # ok = False
                # If you want to report as warning instead of as error uncomment lines below
                self._report("identifier doesn't starts with '<<'", kind=1)
                ok = False if self._compute_warnings else ok
(it should work too)
No, the identifier is never included in any checksum (passports, visas, etc.). It's really curious. I think the same as you: it should have its own hash, or at least count toward the final hash.
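To make this concrete for TD3 (passports): per ICAO 9303 the final check digit is computed over parts of the second line only (characters 1-10, 14-20 and 22-43, that is, document number, birth date, expiry date and optional data, each with its own check digit), so the name line never enters into it. A standalone sketch follows; `check_digit` and `td3_final_digit` are illustrative names rather than mrz functions, and a '<' printed in a check digit position simply counts as 0 here.

```python
def check_digit(field: str) -> str:
    """ICAO 9303 check digit: 0-9 keep their value, A-Z are 10-35, '<' is 0."""
    values = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    total = sum((0 if c == "<" else values.index(c)) * (7, 3, 1)[i % 3]
                for i, c in enumerate(field))
    return str(total % 10)


def td3_final_digit(line2: str) -> str:
    """Composite check digit of a TD3 second line (characters 1-10, 14-20, 22-43)."""
    composite = line2[0:10] + line2[13:20] + line2[21:43]
    return check_digit(composite)


line2 = "K2578285<7IND5601240F2202288<<<<<<<<<<<<<<<4"
print(td3_final_digit(line2), line2[43])  # computed and printed final digit: 4 4
```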
Oops! In a previous comment I forgot the 'install' argument for the installation with setup.py:
python3 setup.py install
Also, the first example MRZ is an instance of another false positive, where optional data hash fails when optional-data is empty (all <) and optional-data-hash is < (instead of 0). I think it may be an error according to the specs, but it exists in the real world.
Originally posted by @tahajahangir in https://github.com/Arg0s1080/mrz/issues/1#issuecomment-439771574