Closed: Ao-Lee closed this issue 2 years ago.
I just tried; everything seems fine, both on CPU and GPU inference.
```
(base) rowel@atienza-G190-G30:~/github/roatienza/deep-text-recognition-benchmark$ python3 infer.py --image demo_image/demo_2.jpg --model https://github.com/roatienza/deep-text-recognition-benchmark/releases/download/v0.1.0/vitstr_small_patch16_jit.pt
100%|████████████████████████████████████████| 82.1M/82.1M [00:01<00:00, 84.0MB/s]
/home/rowel/anaconda3/lib/python3.7/site-packages/torch/serialization.py:709: UserWarning: 'torch.load' received a zip file that looks like a TorchScript archive dispatching to 'torch.jit.load' (call 'torch.jit.load' directly to silence this warning)
  " silence this warning)", UserWarning)
demo_image/demo_2.jpg : SHAKESHACK
(base) rowel@atienza-G190-G30:~/github/roatienza/deep-text-recognition-benchmark$ python3 infer.py --image demo_image/demo_2.jpg --model vitstr_small_patch16_jit.pt
/home/rowel/anaconda3/lib/python3.7/site-packages/torch/serialization.py:709: UserWarning: 'torch.load' received a zip file that looks like a TorchScript archive dispatching to 'torch.jit.load' (call 'torch.jit.load' directly to silence this warning)
  " silence this warning)", UserWarning)
demo_image/demo_2.jpg : SHAKESHACK
(base) rowel@atienza-G190-G30:~/github/roatienza/deep-text-recognition-benchmark$ python3
Python 3.7.3 (default, Mar 27 2019, 22:11:17)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.__version__)
1.11.0+cu113
>>>
(base) rowel@atienza-G190-G30:~/github/roatienza/deep-text-recognition-benchmark$ python3 infer.py --image demo_image/demo_2.jpg --gpu --model vitstr_small_patch16_jit.pt
/home/rowel/anaconda3/lib/python3.7/site-packages/torch/serialization.py:709: UserWarning: 'torch.load' received a zip file that looks like a TorchScript archive dispatching to 'torch.jit.load' (call 'torch.jit.load' directly to silence this warning)
  " silence this warning)", UserWarning)
demo_image/demo_2.jpg : SHAKESHACK
```
Thanks. I can update torch from 1.8 to 1.11.0 and try again.

Could you kindly update requirements.txt to match your current environment settings?

All problems are solved with PyTorch 1.11.0. Many thanks; please close this issue.
@Ao-Lee
It seems that `model = torch.load(checkpoint)` in `infer.py` returns an `OrderedDict` instead of the model object. One way to solve the problem is:

```python
ordered_dict = torch.load(checkpoint)
model.load_state_dict(ordered_dict)
```
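A minimal, self-contained sketch of that dispatch (the helper name `resolve_checkpoint` is hypothetical, not part of the repository): `torch.load()` can return either a full model object (or a TorchScript module) or a bare `OrderedDict` of weights, and only in the latter case do you need an already-constructed model to load into. Note that `nn.Module` has no `.load()` method; the call is `load_state_dict()`.

```python
from collections import OrderedDict

# Hypothetical helper (not in the repo): `loaded` is whatever
# torch.load(checkpoint) returned. An OrderedDict means the file held only
# weights, so they must be loaded into an existing model instance; anything
# else is assumed to already be a usable model object.
def resolve_checkpoint(loaded, model=None):
    if isinstance(loaded, OrderedDict):
        if model is None:
            raise ValueError("state-dict checkpoint needs a model instance to load into")
        model.load_state_dict(loaded)  # nn.Module has no .load() method
        return model
    return loaded
```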
How can you create `model` in `infer.py`? Can you show me how to do that? I want to run inference from my checkpoint, but I was stuck there. Thank you so much.
In `test.py`, it is triggered by the `--infer-model` option. The PyTorch doc for JIT is also referenced there.
Thank you so much! My problem is solved.

You should probably update requirements.txt to torch==1.11.0.
@raisinbl, hi! Please, could you explain to me how you solved this problem? I've already tried to initialize the architecture of the tiny version in the following way:
```python
vitstr_tiny = ViTSTR(patch_size=16, embed_dim=192, depth=12, num_heads=3, mlp_ratio=4, qkv_bias=True, in_chans=1)
```
See the original code here:
https://github.com/roatienza/deep-text-recognition-benchmark/blob/fb06d18bde4e62e728208ba3274390b8a615418a/modules/vitstr.py#L156-L159
I've also organized the whole preparation process before the evaluation as it's described here: https://github.com/roatienza/deep-text-recognition-benchmark/blob/fb06d18bde4e62e728208ba3274390b8a615418a/test.py#L238-L242
In other words, my setup before the `model.eval()` call in the `infer.py` script looks as follows at this point (`model` is substituted for `vitstr_tiny`; and `model = torch.load("vitstr_tiny_patch16_224.pth")`):
```python
vitstr_tiny = ViTSTR(patch_size=16, embed_dim=192, depth=12, num_heads=3, mlp_ratio=4, qkv_bias=True, in_chans=1)
new_state_dict = get_state_dict(model)
vitstr_tiny.load_state_dict(new_state_dict)
vitstr_tiny.eval()
```
However, at first I faced a range of errors related to the key names in the `new_state_dict` dictionary. Later, I fixed them by changing `name = k[7:]` to `name = k[14:]` in the function below:
https://github.com/roatienza/deep-text-recognition-benchmark/blob/fb06d18bde4e62e728208ba3274390b8a615418a/test.py#L228-L234
The aim of this change was to modify the key names of the dictionary so that they match what the ViTSTR model expects.
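For reference, `k[7:]` removes the 7-character `module.` prefix that `DataParallel` wrapping adds, while `k[14:]` removes the 14-character `module.vitstr.` prefix. A hedged sketch of that renaming (`strip_prefix` is a hypothetical name, and the exact prefix in any given checkpoint is an assumption; print a few keys of the loaded dict to confirm it):

```python
from collections import OrderedDict

# Hypothetical stand-in for get_state_dict() in test.py: rename checkpoint
# keys by removing a wrapper prefix so they match what ViTSTR expects.
# len("module.") == 7 and len("module.vitstr.") == 14, which is where the
# hard-coded slices k[7:] and k[14:] come from.
def strip_prefix(state_dict, prefix="module.vitstr."):
    new_state_dict = OrderedDict()
    for k, v in state_dict.items():
        name = k[len(prefix):] if k.startswith(prefix) else k
        new_state_dict[name] = v
    return new_state_dict
```

Using `len(prefix)` instead of a hard-coded slice makes the same function work for both checkpoint layouts.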
Anyway, I still see errors, but now they are related to size mismatches. My guess is that there's something wrong either with the `.pth` file for the ViTSTR-Tiny model (`vitstr_tiny_patch16_224.pth`) or with my setup of hyperparameters in the `vitstr_tiny` variable. See an example of the error:

```
RuntimeError: Error(s) in loading state_dict for ViTSTR:
	size mismatch for head.weight: copying a param with shape torch.Size([96, 192]) from checkpoint, the shape in current model is torch.Size([1000, 192]).
	size mismatch for head.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([1000]).
```
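The 96-vs-1000 mismatch points at the classification head: a freshly constructed ViT inherits timm's default 1000-class ImageNet head, while this checkpoint was trained over a 96-symbol character set. Rather than guessing constructor arguments, the head size can be read out of the checkpoint itself. This is a sketch under assumptions: `infer_num_classes` is a hypothetical helper, and whether `ViTSTR` accepts a `num_classes` argument (versus needing `reset_classifier()`) should be checked against `modules/vitstr.py`.

```python
# Hypothetical helper: derive the number of output classes from the checkpoint
# instead of guessing. Per the error above, head.weight has shape [96, 192],
# so the trained head maps a 192-dim embedding to 96 character classes.
def infer_num_classes(state_dict):
    return state_dict["head.weight"].shape[0]

# Assumed usage (unverified): either pass num_classes through to timm's
# VisionTransformer, or call reset_classifier() on the built model:
#   vitstr_tiny = ViTSTR(patch_size=16, embed_dim=192, depth=12, num_heads=3,
#                        mlp_ratio=4, qkv_bias=True, in_chans=1,
#                        num_classes=infer_num_classes(new_state_dict))
```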
Interestingly, I've noticed that I didn't manage to run inference with any of the pretrained models that have the `.pth` extension (in our case, it turns out that they are instances of the `collections.OrderedDict` class). But the few remaining models with the `.pt` extension run successfully without any modification to `infer.py`. Needless to say, I've set up my environment according to the `requirements.txt` file.

@roatienza, I hope you're doing great! I would really appreciate it if you could look into the aforementioned issues and elaborate on them a little bit. Another surprising thing I've seen is that there are no implementations of ViTSTR-Tiny in OCR-related frameworks such as docTR, for example. I bet the reason might be how difficult this model is to reproduce.
Okay, I just looked at the `README.md` file and a few other closed issues more thoroughly, and, if I got it correctly, pretrained models with the `.pth` extension are not suitable for the `infer.py` script. Put differently, one should use `test.py` to run models with that extension; the `infer.py` script is only for `.pt` models. I'm going to check it out a bit later and get back with some feedback right away.
Yeah, everything works correctly using the prompt from here. So, the conclusion is: if you want to run inference with a pretrained model that has the `.pth` file extension, you should use the `test.py` script to do that.
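Summing up the two paths as commands: the `infer.py` invocation is copied from the transcript earlier in the thread, while the `test.py` flags are assumptions drawn from the benchmark's usual evaluation interface and should be verified against the repository README.

```shell
# TorchScript (.pt) release weights: infer.py loads them directly.
python3 infer.py --image demo_image/demo_2.jpg --model vitstr_small_patch16_jit.pt

# State-dict (.pth) checkpoints: go through test.py instead. The flag names
# below are assumptions -- check the repository README before running.
python3 test.py --saved_model vitstr_tiny_patch16_224.pth \
    --Transformer --TransformerModel=vitstr_tiny_patch16_224 \
    --sensitive --data_filtering_off
```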
@artyommatveev Have you written the predict code for the .pth model yet? Can you give me a reference to it? Thank you.
Hello, I used a single-GPU environment with python==3.8, torch==1.8.1, and torchvision==0.9.1. I followed the GitHub hint with the following command:

It returned an error with:

> it seems that the function `model = torch.load(checkpoint)` in `infer.py` returns an ordered dict instead of the model object. One way to solve the problem is:

But I do not know the hyperparameters of `vitstr_small_patch16_224.pth` from when it was trained, so it is very hard for me to initialize the model object with the correct hyperparameters. I would like to ask: would it be possible to make the hyperparameters of the pretrained models public?

I also tried the `.pt` models; they give the following error:

Is there any way to load the model correctly, please? Many thanks.