DoubangoTelecom / ultimateMRZ-SDK

Machine-readable zone/travel document (MRZ / MRTD) detector and recognizer using deep learning
https://www.doubango.org/webapps/mrz/

Parsing of surnames that contain multiple words is incorrect #70

Closed: enricomiletto closed this issue 2 years ago

enricomiletto commented 2 years ago

Hello, I think there's an error in the way names are parsed. In the screenshot below you can see that the name is not parsed correctly. It should be something like:

surname_0: DE
surname_1: BRUIJN
given_name_0: WILLEKE
given_name_1: LISELOTTE

It seems like the parser doesn't allow for surnames that consist of more than one word. In the MRZ spec, both the "Primary Identifier" and the "Secondary Identifier" can contain multiple words, separated in both cases by a single "<", while the Primary and Secondary Identifiers are separated from one another by "<<".
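
To illustrate the separator rule described above, here is a minimal sketch in Python. It is not the SDK's parser: the function name `parse_mrz_name_field` and the sample name field are hypothetical, with the sample reconstructed from the expected output listed earlier rather than copied from the screenshot. It assumes the name field has already been isolated from the MRZ line and that trailing "<" characters are padding.

```python
def parse_mrz_name_field(name_field: str) -> dict:
    """Split an MRZ name field into surname words and given-name words.

    Hypothetical helper for illustration; not part of the SDK or its sample parser.
    """
    # Trailing '<' characters only pad the field to its fixed width.
    stripped = name_field.rstrip("<")
    # The first '<<' separates the primary identifier from the secondary one.
    primary, _, secondary = stripped.partition("<<")
    # Inside each identifier, a single '<' separates the individual words.
    surnames = [word for word in primary.split("<") if word]
    given_names = [word for word in secondary.split("<") if word]
    return {"surnames": surnames, "given_names": given_names}


if __name__ == "__main__":
    # Assumed sample input matching the expected result in this issue.
    field = "DE<BRUIJN<<WILLEKE<LISELOTTE<<<<<<<<<<<"
    result = parse_mrz_name_field(field)
    for i, word in enumerate(result["surnames"]):
        print(f"surname_{i}: {word}")
    for i, word in enumerate(result["given_names"]):
        print(f"given_name_{i}: {word}")
```

Splitting on the first "<<" before splitting on "<" keeps multi-word surnames such as "DE BRUIJN" together in the primary identifier. A field with no secondary identifier simply yields an empty list of given names.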

Thanks in advance.
Enrico

[Screenshot: Immagine 2022-03-04 181633]

DoubangoTelecom commented 2 years ago

The SDK doesn't contain a parser; the result shown on the demo page is informational only. The parser lives outside the SDK and is open source (https://github.com/DoubangoTelecom/ultimateMRZ-SDK/tree/master/samples/c%2B%2B/parser), so you can modify it.