Closed marcomatranga closed 5 years ago
Try any of the solutions proposed here.
Hi, in the end I was able to install PassportEye after running sudo apt install python3.6-tk (thanks), but it is not working well.
When I run
mrz --legacy passaporto-michelle-obama.jpg
I get this error:
ERROR: Failed loading language 'eng' Tesseract couldn't load any languages! Could not initialize tesseract.
When installing Tesseract OCR I installed two languages: Italian and English. In fact, running
tesseract --list-langs
I get
List of available languages (3): ita osd eng
In my .bashrc file I have defined the following variables
export PATH="$PATH:/usr/bin/tesseract"
export PATH="$PATH:/usr/share/tesseract-ocr/4.00/tessdata"
export TESSDATA_PREFIX="/usr/share/tesseract-ocr/4.00/tessdata"
What is wrong with my settings? Thanks.
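A side note on the .bashrc lines above: PATH is a list of directories, so an entry such as /usr/bin/tesseract (typically the executable itself) does not help the shell find anything, and the tessdata folder holds no executables either. These entries are probably harmless, but they are not what makes Tesseract find its language data. A minimal sketch in Python that flags PATH entries pointing at files; the function name non_directory_path_entries is made up for illustration.

```python
import os
from pathlib import Path
from typing import List


def non_directory_path_entries(path_value: str) -> List[str]:
    """Return PATH entries that exist on disk but are not directories."""
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    return [e for e in entries if Path(e).exists() and not Path(e).is_dir()]


if __name__ == "__main__":
    # Inspect the PATH of the current process.
    for entry in non_directory_path_entries(os.environ.get("PATH", "")):
        print(f"PATH entry is a file, not a directory: {entry}")
```

If /usr/bin is already on PATH (it normally is on Ubuntu), the tesseract binary is found without any extra export.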
What happens if you remove the --legacy flag?
It looks like the eng.traineddata file was corrupted. I downloaded it again and now it works.
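As this thread shows, a language can be listed by tesseract --list-langs and still fail to load. A rough first check, sketched in Python under the assumption that TESSDATA_PREFIX points at the folder containing the .traineddata files (as in the .bashrc above), is to confirm that the model file exists and is not empty. The helper name check_traineddata is hypothetical, and a non-empty file can of course still be corrupted, so re-downloading remains the real fix.

```python
import os
from pathlib import Path
from typing import Optional


def check_traineddata(lang: str, tessdata_dir: Optional[str] = None) -> str:
    """Describe the state of <lang>.traineddata in the tessdata directory."""
    # Assumption: TESSDATA_PREFIX is the directory that holds the model files.
    tessdata_dir = tessdata_dir or os.environ.get("TESSDATA_PREFIX", "")
    if not tessdata_dir:
        return "TESSDATA_PREFIX is not set"
    model = Path(tessdata_dir) / f"{lang}.traineddata"
    if not model.is_file():
        return f"missing: {model}"
    size = model.stat().st_size
    if size == 0:
        return f"empty (likely a failed download): {model}"
    return f"found: {model} ({size} bytes)"


if __name__ == "__main__":
    for language in ("eng", "ita"):
        print(check_traineddata(language))
```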
mrz passaporto-michelle-obama.jpg --legacy
mrz_type None valid False valid_score 0 walltime 0.681032657623291 filename passaporto-michelle-obama.jpg
I have also downloaded the Italian dataset (ita.traineddata). Which command (with --legacy) uses the Italian dataset? Thanks.
There is no option to specify the language because, in theory, a natural language model should not be very useful for parsing text of the form >>SMTH>SMTHELSE>>>112>>ETC>. If anything, it should probably make things worse (unless it was trained on the actual "MRZ language").
This intuition is indirectly confirmed by the fact that the --legacy engine, which relies less on the statistics of the natural language sentences it was trained on than the "newer" engine, happens to work better for MRZ parsing as well.
Does mrz_type None mean that the resolution of the picture is too low and the algorithm wasn't able to find the MRZ zone? And what do the other output parameters mean?
When I get output like this
mrz carta_identità_matranga.jpg --legacy
mrz_type None valid False valid_score 0 walltime 0.681032657623291 filename passaporto-michelle-obama.jpg
does it mean that the picture is low quality, or that the algorithm is not working properly?
mrz_type None means that the MRZ was not detected or not parsed successfully. This may be due to low resolution, image rotation, a complex background pattern, or simply "hard-to-read text".
valid=True would mean the checksum digits are correct. valid_score is an ad-hoc number between 0 and 100, denoting the algorithm's confidence in the parse.
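To make "the checksum digits are correct" concrete: MRZ fields such as the document number and the dates are each followed by a check digit. The sketch below follows the check-digit scheme described in ICAO Doc 9303 (weights 7, 3, 1 repeated; digits count as themselves, A to Z as 10 to 35, the filler as 0; sum modulo 10). It is an independent illustration, not PassportEye's own code, and the function names are made up for this example.

```python
def char_value(ch: str) -> int:
    """Map one MRZ character to its numeric value for the checksum."""
    if ch in "0123456789":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":  # filler character
        return 0
    raise ValueError(f"not an MRZ character: {ch!r}")


def check_digit(field: str) -> int:
    """Compute the check digit of an MRZ field using weights 7, 3, 1."""
    weights = (7, 3, 1)
    total = sum(char_value(c) * weights[i % 3] for i, c in enumerate(field))
    return total % 10


if __name__ == "__main__":
    # Document number and birth date from the ICAO specimen passport.
    print(check_digit("L898902C3"))  # 6
    print(check_digit("740812"))     # 2
```

A single misread character (for example '<' read as 'K') changes the sum, which is why a parse with OCR errors usually ends up with valid False.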
Try running the script on some of the images in the test/ folder (namely, try running evaluate_mrz) to verify that you are not always getting zeroes for some technical reason.
Hi, I've run the scripts in the test folder, but I didn't get any output file. Is it because PassportEye is wrongly installed?
The evaluate_mrz script is not meant to produce output files. It should simply print the test evaluation results.
I didn't get any test evaluation results.
In fact, the test data is not included in the compiled package so unless you cloned the git repository you won't see results, indeed.
Try running the following list of commands and see what comes out in your case:
sudo apt install tesseract-ocr
python3 -m venv venv
. venv/bin/activate
git clone https://github.com/konstantint/PassportEye
cd PassportEye
pip install .
evaluate_mrz -j 4 -dd passporteye/mrz/testdata
In my case the output of the last command right now looked as follows:
INFO:evaluate_mrz:Preparing computation for 34 files from passporteye/mrz/testdata/
INFO:evaluate_mrz:Running 4 workers
INFO:evaluate_mrz:Processed 0_id-esp.png in 0.34s (score 0) [=]
INFO:evaluate_mrz:Processed 100_id-che.jpg in 4.28s (score 100) [=]
INFO:evaluate_mrz:Processed 100_id-rou.jpg in 6.43s (score 100) [=]
INFO:evaluate_mrz:Processed 100_id-mac.jpg in 6.92s (score 100) [=]
...
INFO:evaluate_mrz:Processed 79_pass-hun.png in 8.19s (score 59) [<]
INFO:evaluate_mrz:Processed 98_pass-nld.jpg in 5.86s (score 24) [<]
INFO:evaluate_mrz:Completed
Walltime: 57.80s
Compute walltime: 223.05s
Processed files: 34
Perfect parses: 17
Invalid parses: 6
Improved parses: 2
Worsened parses: 9
Total score: 2176
Mean score: 64.00
Mean compute time: 6.56s
Methods used:
rescaled(3): 12
direct: 8
rescaled(1): 6
black_tophat: 1
black_tophat(rescaled(3)): 1
If you see something conceptually different, there must be a problem with the environment somewhere.
These are mine
INFO:evaluate_mrz:Preparing computation for 34 files from passporteye/mrz/testdata
INFO:evaluate_mrz:Running 4 workers
INFO:evaluate_mrz:Processed 0_id-esp.png in 0.10s (score 0) [=]
INFO:evaluate_mrz:Processed 100_id-rou.jpg in 254.08s (score 100) [=]
INFO:evaluate_mrz:Processed 0_pass-lva.jpg in 280.97s (score 0) [=]
INFO:evaluate_mrz:Processed 100_id-usa.jpg in 155.52s (score 2) [<]
INFO:evaluate_mrz:Processed 100_id-mac.jpg in 879.42s (score 75) [<]
INFO:evaluate_mrz:Processed 100_id-che.jpg in 898.03s (score 71) [<]
INFO:evaluate_mrz:Processed 100_id-si.jpg in 848.18s (score 97) [<]
INFO:evaluate_mrz:Processed 100_pass-bdr.jpg in 764.46s (score 61) [<]
INFO:evaluate_mrz:Processed 100_pass-cze2.jpg in 555.10s (score 98) [<]
INFO:evaluate_mrz:Processed 100_pass-chn.jpg in 816.69s (score 98) [<]
INFO:evaluate_mrz:Processed 100_pass-cze.jpg in 943.45s (score 98) [<]
INFO:evaluate_mrz:Processed 100_pass-ltu.jpg in 96.24s (score 100) [=]
INFO:evaluate_mrz:Processed 100_pass-fin.png in 748.11s (score 98) [<]
INFO:evaluate_mrz:Processed 100_pass-hrv.jpg in 456.96s (score 100) [=]
INFO:evaluate_mrz:Processed 100_pass-lux.jpg in 467.79s (score 59) [<]
INFO:evaluate_mrz:Processed 100_pass-isl.png in 911.10s (score 62) [<]
INFO:evaluate_mrz:Processed 100_visa-polx.jpg in 172.24s (score 0) [<]
INFO:evaluate_mrz:Processed 100_visa-usa.jpg in 191.31s (score 100) [=]
INFO:evaluate_mrz:Processed 100_pass-polx.jpg in 1095.09s (score 62) [<]
INFO:evaluate_mrz:Processed 24_pass-egy.jpg in 107.30s (score 0) [<]
INFO:evaluate_mrz:Processed 100_pass-uto.jpg in 1075.88s (score 98) [<]
INFO:evaluate_mrz:Processed 100_pass2-uto.jpg in 959.32s (score 57) [<]
INFO:evaluate_mrz:Processed 27_pass-gbr.jpg in 645.55s (score 59) [>]
INFO:evaluate_mrz:Processed 25_pass-uto.jpg in 804.19s (score 62) [>]
INFO:evaluate_mrz:Processed 33_id-usa.jpg in 822.94s (score 51) [>]
INFO:evaluate_mrz:Processed 62_pass-aus.png in 1.51s (score 0) [<]
INFO:evaluate_mrz:Processed 43_pass-fra.jpg in 825.05s (score 98) [>]
INFO:evaluate_mrz:Processed 43_pass-twn.jpg in 728.81s (score 61) [>]
INFO:evaluate_mrz:Processed 77_card-cmw.png in 437.67s (score 71) [<]
INFO:evaluate_mrz:Processed 62_pass-pol.png in 785.09s (score 22) [<]
INFO:evaluate_mrz:Processed 79_pass-can.jpg in 514.43s (score 44) [<]
INFO:evaluate_mrz:Processed 53_id-d.jpg in 1128.13s (score 73) [>]
INFO:evaluate_mrz:Processed 98_pass-nld.jpg in 258.37s (score 61) [<]
INFO:evaluate_mrz:Processed 79_pass-hun.png in 430.86s (score 62) [<]
INFO:evaluate_mrz:Completed
Walltime: 5058.44s
Compute walltime: 20059.94s
Processed files: 34
Perfect parses: 4
Invalid parses: 5
Improved parses: 6
Worsened parses: 22
Total score: 2100
Mean score: 61.76
Mean compute time: 590.00s
Methods used:
direct: 16
rescaled(3): 7
rescaled(1): 3
black_tophat: 2
This means the method works as intended, and it's your particular image that is hard to parse.
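For anyone comparing their own evaluate_mrz run against the reference output, a small Python sketch that pulls the two headline numbers out of the printed summary may help. It assumes only the "Key: value" layout visible in the outputs quoted in this thread; parse_summary is a made-up helper, not part of PassportEye.

```python
import re
from typing import Dict

SUMMARY_KEYS = ("Mean score", "Mean compute time")


def parse_summary(text: str) -> Dict[str, float]:
    """Extract selected 'Key: value' pairs from an evaluate_mrz summary."""
    values = {}
    for line in text.splitlines():
        match = re.match(r"^(.+?):\s*([0-9.]+)s?$", line.strip())
        if match and match.group(1) in SUMMARY_KEYS:
            values[match.group(1)] = float(match.group(2))
    return values


# Summary lines copied from the two runs quoted in this thread.
REFERENCE = "Mean score: 64.00\nMean compute time: 6.56s"
REPORTED = "Mean score: 61.76\nMean compute time: 590.00s"

ref, rep = parse_summary(REFERENCE), parse_summary(REPORTED)
print(f"score difference: {rep['Mean score'] - ref['Mean score']:.2f}")
print(f"slowdown factor: {rep['Mean compute time'] / ref['Mean compute time']:.1f}x")
```

On the numbers above, the mean scores are close (64.00 versus 61.76), while the mean compute time per file is roughly 90 times higher in the second run.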
Hi konstantint, thanks for your great work! The model seems to have trouble distinguishing '<' from 'K'. Is there anything you'd recommend I do? Is it possible to get a confidence level per letter or something similar?
I have tried to install PassportEye on my laptop (Ubuntu 18.04.2 LTS, Python 3.6.7). First of all, following the instructions on GitHub, I successfully installed Tesseract OCR. To update my $PATH I put the following line in my .bashrc file:
export PATH=$PATH:/usr/bin/tesseract
Then I tried to install PassportEye with:
pip3 install PassportEye
but it didn't work out. These are the errors:
Traceback (most recent call last):
File "/home/marco/.local/bin/mrz", line 6, in <module>
from passporteye.mrz.scripts import mrz
File "/home/marco/.local/lib/python3.6/site-packages/passporteye/__init__.py", line 10, in <module>
from passporteye.mrz.image import read_mrz
File "/home/marco/.local/lib/python3.6/site-packages/passporteye/mrz/image.py", line 14, in <module>
from ..util.geometry import RotatedBox
File "/home/marco/.local/lib/python3.6/site-packages/passporteye/util/geometry.py", line 9, in <module>
from matplotlib import pyplot as plt
File "/home/marco/.local/lib/python3.6/site-packages/matplotlib/pyplot.py", line 2372, in <module>
switch_backend(rcParams["backend"])
File "/home/marco/.local/lib/python3.6/site-packages/matplotlib/pyplot.py", line 207, in switch_backend
backend_mod = importlib.import_module(backend_name)
File "/usr/lib/python3.6/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/home/marco/.local/lib/python3.6/site-packages/matplotlib/backends/backend_tkagg.py", line 1, in <module>
from . import _backend_tk
File "/home/marco/.local/lib/python3.6/site-packages/matplotlib/backends/_backend_tk.py", line 5, in <module>
import tkinter as Tk
ModuleNotFoundError: No module named 'tkinter'
After this attempt, the following packages are installed:
scipy-1.2.1.dist-info matplotlib-3.0.3.dist-info numpy-1.16.2.dist-info scikit_learn-0.20.3.dist-info scikit_image-0.14.2.dist-info
They are all up to date.
Do you have any idea why the installation didn't work out?
Thanks a lot
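The traceback above ends in ModuleNotFoundError: No module named 'tkinter', which matplotlib hits when it tries to load its Tk backend. On Ubuntu, tkinter is packaged separately from the base Python install, and earlier in this thread installing python3.6-tk resolved it. A minimal Python sketch to check for the module before running mrz; the helper name has_module is made up for this example.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if has_module("tkinter"):
        print("tkinter is available")
    else:
        # Assumption: a Debian/Ubuntu system, where Tk support is a separate package.
        print("tkinter is missing; try installing the python3-tk package")
```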