Open MichaelChao02 opened 1 year ago
It happens due to some broken images in MJSynth. A workaround is to set the verify
flag to False to skip the verification process.
https://github.com/open-mmlab/mmocr/blob/d56155c82df3b0a4e859b692acc7fd9a26d760d3/mmocr/datasets/preparers/dumpers/lmdb_dumper.py#L46
But eventually we need to pre-check if the image is empty at the beginning of this method to prevent the fatal error: https://github.com/open-mmlab/mmocr/blob/d56155c82df3b0a4e859b692acc7fd9a26d760d3/mmocr/datasets/preparers/dumpers/lmdb_dumper.py#L55
Hello, I am having the same issue here. Is there a specific way to pre-check whether the image is missing or broken, as you said? For example, is it possible to check whether imageBuf is not a valid input for cv2.imdecode, so that the verification step does not break?
Prerequisite
Task
I'm using the official example scripts/configs for the officially supported tasks/models/datasets.
Branch
main branch https://github.com/open-mmlab/mmocr
I'm a little confused about the branch here. I followed the dev1.x installation guide but wasn't required to change the branch.
Environment
Reproduces the problem - code sample
There is no customized code.
Reproduces the problem - command or script
Reproduces the problem - error message
Additional information
I downloaded the MJSynth data via Academic Torrents because the HTTP connection is very slow. The only potential problem I can think of is that the website lists the file size as 10.68 GB, but the file I downloaded is only 9.95 GB. I downloaded it multiple times with the same result. (If the website uses decimal units, 1 GB = 1000 MB, while my system reports binary units, 1 GB = 1024 MB, then the two numbers agree.) When I tried to convert the data to LMDB format, the error appeared while writing 262000 / 8919273. I tried this on multiple devices, and the error occurs at exactly the same place each time. I cannot figure out what causes the problem.
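The size discrepancy above can be checked with a quick conversion. This sketch assumes the website reports decimal gigabytes (10^9 bytes) and the local system reports binary gibibytes (2^30 bytes); the helper name `gb_to_gib` is made up for illustration.

```python
def gb_to_gib(size_gb: float) -> float:
    """Convert decimal gigabytes (10**9 bytes) to binary gibibytes (2**30 bytes)."""
    return size_gb * 1000**3 / 1024**3


# Size listed on the website, converted to binary units and rounded to two decimals.
print(f"{gb_to_gib(10.68):.2f} GiB")  # prints "9.95 GiB"
```

Under that assumption the download is most likely complete, so the file size alone does not explain the error.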
If someone can run the code to completion, maybe they can provide me with:
so that I can further inspect the causes.
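For local inspection, one option is to scan the extracted dataset for zero-byte image files, since the maintainer's reply points at empty images as the trigger. This is only a sketch: the function name `find_empty_files`, the `*.jpg` pattern, and the directory layout are assumptions, and it will not catch files that are non-empty but corrupted.

```python
from pathlib import Path


def find_empty_files(root: str, pattern: str = "*.jpg") -> list[Path]:
    """Return all files under root matching pattern whose size is zero bytes."""
    return sorted(
        path
        for path in Path(root).rglob(pattern)
        if path.is_file() and path.stat().st_size == 0
    )
```

Running it on the dataset root and comparing the hits against the samples around index 262000 would show whether an empty file sits at the failing position.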