mindee / doctr

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
https://mindee.github.io/doctr/
Apache License 2.0

AttributeError: '_io.BufferedReader' object has no attribute 'length' #508

Closed jonathanMindee closed 2 years ago

jonathanMindee commented 2 years ago

🐛 Bug

Hi!

I retrained two models myself (one for text detection, one for text recognition). I'm trying to use load_pretrained_params from doctr.models.utils to load the checkpoints into an ocr_predictor, but I get an error.

To Reproduce

My code

from doctr.models import ocr_predictor
from doctr.models.utils import load_pretrained_params

DET_CKPT = "file:///home/.../det_ckpt.zip"
REC_CKPT = "file:///home/.../rec_ckpt.zip"

model = ocr_predictor(det_arch='db_resnet50', reco_arch='sar_resnet31', pretrained=True)
load_pretrained_params(model.det_predictor.model, DET_CKPT)
load_pretrained_params(model.reco_predictor.model, REC_CKPT)

The zip files contain:

rec_ckpt.zip
├── weights.index
├── checkpoint
└── weights.data-00000-of-00001

det_ckpt.zip
├── weights.index
├── checkpoint
└── weights.data-00000-of-00001

I get the error:

Downloading file:///home/.../det_ckpt.zip to /home/jonathan/.cache/doctr/models/det_ckpt.zip
Traceback (most recent call last):
  File "doctr_retrained_test.py", line 8, in <module>
    load_pretrained_params(model.det_predictor.model, DET_CKPT)
  File "/home/.../lib/python3.6/site-packages/doctr/models/utils/tensorflow.py", line 45, in load_pretrained_params
    archive_path = download_from_url(url, hash_prefix=hash_prefix, cache_subdir='models', **kwargs)
  File "/home/.../lib/python3.6/site-packages/doctr/models/data_utils.py", line 93, in download_from_url
    _urlretrieve(url, file_path)
  File "/home/.../lib/python3.6/site-packages/doctr/models/data_utils.py", line 31, in _urlretrieve
    with tqdm(total=response.length) as pbar:
  File "/usr/lib/python3.6/tempfile.py", line 619, in __getattr__
    a = getattr(file, name)
AttributeError: '_io.BufferedReader' object has no attribute 'length'
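
A likely reading of this traceback: for a file:// URL, urllib.request.urlopen returns a thin wrapper around the opened local file rather than an http.client.HTTPResponse, and that wrapper has no length attribute (it does still expose a Content-Length header). The sketch below is not docTR's code; content_length and fetch are hypothetical helper names, shown only to illustrate a size lookup that tolerates both kinds of response.

```python
import shutil
from urllib.request import urlopen


def content_length(response):
    """Best-effort payload size in bytes, or None when it is unknown.

    HTTP responses expose `.length`; the object returned for file:// URLs
    does not, so fall back to the Content-Length header.
    """
    length = getattr(response, "length", None)
    if length is None:
        header = response.headers.get("Content-Length")
        length = int(header) if header is not None else None
    return length


def fetch(url, dest_path, chunk_size=8192):
    """Copy the payload at `url` to `dest_path`; return the reported size."""
    with urlopen(url) as response, open(dest_path, "wb") as fh:
        # The value may be None; a progress bar such as tqdm accepts total=None.
        total = content_length(response)
        shutil.copyfileobj(response, fh, chunk_size)
    return total
```

With a helper like this, a local archive addressed through a file:// URL is copied the same way as a remote one, and the size is only used for progress reporting.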

Environment

DocTR version: 0.3.1
TensorFlow version: 2.6.0
PyTorch version: N/A (torchvision N/A)
OpenCV version: 4.5.3
OS: Ubuntu 18.04.5 LTS
Python version: 3.6
Is CUDA available (TensorFlow): Yes
Is CUDA available (PyTorch): N/A
CUDA runtime version: 11.4.100
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 2080 Ti
Nvidia driver version: 470.57.02
cuDNN version: Probably one of the following:
    /usr/lib/x86_64-linux-gnu/libcudnn.so.8.2.4
    /usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.2.4
    /usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.2.4
    /usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.2.4
    /usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.2.4
    /usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.2.4
    /usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.2.4

jonathanMindee commented 2 years ago

One more piece of information:

Once the zips are stored in the cache, I cannot open them manually (it looks like they are not valid zip files).

Also, when I re-run my script, I get a different error:

Traceback (most recent call last):
  File "doctr_retrained_test.py", line 8, in <module>
    load_pretrained_params(model.det_predictor.model, DET_CKPT)
  File "/home/.../lib/python3.6/site-packages/doctr/models/utils/tensorflow.py", line 50, in load_pretrained_params
    with ZipFile(archive_path, 'r') as f:
  File "/usr/lib/python3.6/zipfile.py", line 1131, in __init__
    self._RealGetContents()
  File "/usr/lib/python3.6/zipfile.py", line 1198, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
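
One plausible explanation for this second error is that the first, failed run left an incomplete file in the cache, and the next run reused it instead of fetching it again. A minimal sketch of a guard, assuming that reading is right: check the cached archive with zipfile.is_zipfile and delete it when it is not a valid zip. The name drop_if_corrupt is hypothetical and not part of docTR.

```python
import zipfile
from pathlib import Path


def drop_if_corrupt(archive_path):
    """Delete `archive_path` if it exists but is not a valid zip archive.

    Returns True when a file was removed, False otherwise.
    """
    path = Path(archive_path)
    if path.is_file() and not zipfile.is_zipfile(path):
        path.unlink()
        return True
    return False
```

Calling this on the cached path (here /home/jonathan/.cache/doctr/models/det_ckpt.zip, per the log above) before re-running the script would clear a partial file while leaving a valid archive untouched.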

fg-mindee commented 2 years ago

Hi @jonathanMindee :wave:

Thanks for reporting this! I'll need some time to investigate but here are my thoughts:

Best

fg-mindee commented 2 years ago

Any update @jonathanMindee? :pray:

HSH12956 commented 2 years ago

[Translated from Chinese] I ran into this problem too. Have you managed to solve it?

fg-mindee commented 2 years ago

Hi @fika321 :wave:

I'm sorry, my Chinese skills are rather basic and I cannot read your message. Would you mind translating it to English, please?

fg-mindee commented 2 years ago

Closing this as we're not able to reproduce it, @jonathanMindee. Also, this part of the code has been improved in #589, so perhaps this won't happen to you anymore.

Feel free to reopen with additional information if you still encounter the problem!
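
For readers who hit the same error with local checkpoints, one possible workaround is to skip the download step and unpack the archive yourself. This is a sketch under stated assumptions, not a fix confirmed in this thread: extract_checkpoint is a hypothetical helper, and it assumes the archive holds a TensorFlow checkpoint whose files share the prefix weights, as in the listing in the original report.

```python
import zipfile
from pathlib import Path


def extract_checkpoint(archive_path, target_dir, internal_name="weights"):
    """Unpack a checkpoint zip and return the checkpoint prefix as a string.

    `internal_name` is the assumed common prefix of the checkpoint files
    (weights.index, weights.data-00000-of-00001).
    """
    archive_path = Path(archive_path)
    target_dir = Path(target_dir)
    if not zipfile.is_zipfile(archive_path):
        raise ValueError(f"{archive_path} is not a valid zip archive")
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
    prefix = target_dir / internal_name
    index_file = prefix.with_name(prefix.name + ".index")
    if not index_file.is_file():
        raise FileNotFoundError(f"expected {index_file} inside the archive")
    return str(prefix)


# Assumed usage with a Keras model, untested against docTR 0.3.1:
# prefix = extract_checkpoint("det_ckpt.zip", "det_ckpt")
# model.det_predictor.model.load_weights(prefix)
```

The returned prefix is the form that Keras' Model.load_weights expects for TensorFlow-format checkpoints. Whether the retrained weights match the predictor's architecture still has to be checked separately.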