redhuntlabs / Octopii

An AI-powered Personal Identifiable Information (PII) scanner.
https://redhuntlabs.com/blog/octopii-an-opensource-pii-scanner-for-images.html
Other
643 stars 54 forks

Added a Regex Definition for Saudi Arabian Passports #27

Closed othmanalikhan closed 1 year ago

othmanalikhan commented 1 year ago

Added a simple JSON definition for Saudi Arabian passports and validated against 5 live passports.

Interestingly, although it is not ideal, the keyword "SAU" must be included, because the alternative phrases "Kingdom of Saudi Arabia" and "Saudi Arabia" are not reliable enough. The phrase sits at the very top of the passport in an inverted/different font type and colour, so it isn't always picked up.

Hopefully, in the long run, specific Arabic words will be an easier keyword to match, but for the time being "SAU" is the temporary solution, ideally with few false positives.
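
To illustrate the keyword-plus-regex idea described above, here is a minimal Python sketch. It is not the definition added in this PR and does not use Octopii's actual code: the `DEFINITION` dict, its field names, the `matches_definition` helper, and especially the passport-number pattern (one letter followed by six digits) are hypothetical placeholders chosen only for illustration.

```python
import re

# Hypothetical definition for illustration only. The regex below is a
# placeholder, NOT the pattern submitted in this PR.
DEFINITION = {
    "regex": r"\b[A-Z]\d{6}\b",  # assumed shape: one letter + six digits
    "keywords": ["SAU"],         # short country code used as the anchor keyword
}


def matches_definition(text: str, definition: dict) -> bool:
    """Return True only if a keyword AND the number pattern both appear in text."""
    # Match keywords as whole, case-sensitive tokens so that "SAU" does not
    # fire on longer words such as "SAUDI" or "Saudi".
    has_keyword = any(
        re.search(rf"\b{re.escape(keyword)}\b", text)
        for keyword in definition["keywords"]
    )
    has_number = re.search(definition["regex"], text) is not None
    return has_keyword and has_number


if __name__ == "__main__":
    print(matches_definition("PASSPORT P SAU A123456", DEFINITION))
    print(matches_definition("Invoice A123456", DEFINITION))
```

Requiring both the keyword and the number pattern is what keeps a short keyword like "SAU" from producing many false positives on its own. The whole-token match is one possible design choice under these assumptions; a looser substring match would catch more text but would also fire on words that merely contain "SAU".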

0x4f53 commented 1 year ago

@othmanalikhan Excellent, this is a great sig! I will add some more for other documents like the Iqamah too!

0x4f53 commented 1 year ago

@othmanalikhan Check out #28 and v2.2!