clovaai / CRAFT-pytorch

Official implementation of Character Region Awareness for Text Detection (CRAFT)
MIT License

Text detection takes a long time (around 6-8 seconds) to return results, even in GPU mode #45

Open kalai2033 opened 4 years ago

kalai2033 commented 4 years ago

Hi, I am using your pre-trained model craft_mlt_25k.pth for text detection. I have modified test.py for my use case so that it processes only one image per call. In CPU mode it takes on average 12 to 14 seconds to process a single image (480*640), and on GPU (Google Colab) it takes around 7 seconds.

In particular, the forward pass through the CRAFT network (y, _ = net(x)) takes a long time to return the probability tensor.

Is there any way I can speed it up? Thanks in advance.
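For reference, this is roughly what my modified single-image path looks like (a minimal sketch: the helper names follow the repo's test.py and imgproc.py as far as I can tell, 'sample.jpg' is a placeholder, and the torch.no_grad() wrapper is my own addition rather than something the original script does):

```python
import cv2
import torch
from collections import OrderedDict

from craft import CRAFT   # model definition from this repo
import imgproc            # resizing / normalization helpers from this repo

def copy_state_dict(state_dict):
    # strip the "module." prefix that DataParallel adds, as test.py does
    return OrderedDict((k.replace('module.', '', 1), v) for k, v in state_dict.items())

net = CRAFT()
net.load_state_dict(copy_state_dict(torch.load('craft_mlt_25k.pth', map_location='cpu')))
net.eval()

image = imgproc.loadImage('sample.jpg')   # placeholder input path
img_resized, target_ratio, _ = imgproc.resize_aspect_ratio(
    image, 1280, interpolation=cv2.INTER_LINEAR, mag_ratio=1.5)
x = imgproc.normalizeMeanVariance(img_resized)
x = torch.from_numpy(x).permute(2, 0, 1).unsqueeze(0)   # [1, 3, H, W]

with torch.no_grad():   # my addition: skip autograd bookkeeping during inference
    y, _ = net(x)

score_text = y[0, :, :, 0].cpu().numpy()
score_link = y[0, :, :, 1].cpu().numpy()
```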

piernikowyludek commented 4 years ago

I encounter the same issue - the inference pass through the network takes on average 6 seconds (on a CPU in my case).

More alarmingly, with each image the system memory usage grows higher and higher, as if tensors are being added to the graph and never released between calls. Has anyone else encountered this issue?
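To make the growth measurable, something like the following check is what I have in mind (a rough sketch, not code from this repo: psutil is assumed to be installed, random weights and a dummy input are used, and the second loop only tests whether a torch.no_grad() wrapper changes the behaviour):

```python
import os

import psutil   # assumed to be installed; only used to read this process's RSS
import torch

from craft import CRAFT   # model definition from this repo

process = psutil.Process(os.getpid())

def rss_mb():
    return process.memory_info().rss / 1e6

net = CRAFT().eval()             # random weights are fine for a memory check
x = torch.randn(1, 3, 640, 480)  # dummy input, roughly the size mentioned above

for i in range(5):
    y, _ = net(x)                # plain forward pass, as in test.py
    print(f"plain   call {i}: {rss_mb():.1f} MB")

for i in range(5):
    with torch.no_grad():        # drop the autograd graph between calls
        y, _ = net(x)
    print(f"no_grad call {i}: {rss_mb():.1f} MB")
```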

kalai2033 commented 4 years ago

It seems the model used at the link below is faster, so I think the model used there is different. It would be great if the provided pre-trained model craft_mlt_25k.pth were as fast as that one.

https://demo.ocr.clova.ai

p9anand commented 4 years ago

I have faced the same issue: memory consumption increases after each image. Any solution yet?

YoungminBaek commented 4 years ago

In CPU mode it takes a long time, but I have no idea why the inference speed is slow in GPU mode. There are many possible factors, such as the GPU hardware, driver version, installed modules, and other environmental factors. Also, I was not aware of any memory leakage in our model. I'll look into the memory leakage in both CPU and GPU modes. If anyone can find the problem, please let me know.

p9anand commented 4 years ago

Did you get a chance to look at why memory consumption keeps increasing during inference when multiple images are passed one by one?

piernikowyludek commented 4 years ago

@p9anand @YoungminBaek I have some insight into the memory leakage. I initially installed torch version 1.2 (the newest) and encountered the issue while running inference on CPU. The problem disappeared for me after downgrading to PyTorch 0.4.1 (as listed in this repo's requirements)! It might be a PyTorch problem or incompatibility. If you have this problem, try other versions of the library :p

p9anand commented 4 years ago

@piernikowyludek : Thanks. It worked. Now there is no memory leakage on inference.

kalai2033 commented 4 years ago

There is still no change in the time taken for the forward pass. It still takes a long time even after downgrading to PyTorch 0.4.1.

kalai2033 commented 4 years ago

> In CPU mode, it takes a long time, but I have no idea why the inference speed is slow in GPU mode. I think there are so many factors such as GPU hardware, driver version, installed modules, and other environmental factors. Also, I was not aware of memory leakage of our model. I'll look into the memory leakage in both CPU and GPU modes. If anyone can find the problem, please let me know.

@YoungminBaek Hi, were you able to look into why text detection takes a long time on both CPU and GPU?

zcswdt commented 4 years ago

I tested it with an 800*800 image; each image takes about 0.8 s. I traced the time to the line y[0,:,:,1].cpu().data.numpy() in test.py. Is there any way to reduce the long test time? Thank you very much.
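One possible explanation, which is an assumption on my side rather than something confirmed here: on a GPU the kernels run asynchronously, so .cpu() is the first call that has to wait for the network to finish, and a wall-clock timer then blames that line. Synchronizing before reading the clock separates the two costs (a sketch with random weights and a dummy 800x800 input):

```python
import time

import torch
from craft import CRAFT   # model definition from this repo

net = CRAFT().eval()             # random weights; timing only
x = torch.randn(1, 3, 800, 800)  # dummy input matching the 800x800 test above
if torch.cuda.is_available():
    net, x = net.cuda(), x.cuda()

if torch.cuda.is_available():
    torch.cuda.synchronize()     # finish any pending GPU work first
t0 = time.time()
with torch.no_grad():
    y, _ = net(x)
if torch.cuda.is_available():
    torch.cuda.synchronize()     # wait for the forward pass itself
t1 = time.time()

score_link = y[0, :, :, 1].cpu().data.numpy()   # the line that *looked* slow
t2 = time.time()

print(f"forward pass: {t1 - t0:.3f} s, copy to CPU: {t2 - t1:.3f} s")
```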

zcswdt commented 4 years ago

@kalai2033 Can you send me the model used by the online demo you linked? Looking forward to your reply, thank you.

hermanius commented 4 years ago

@YoungminBaek I downgraded PyTorch to 0.4.1, but I still have the memory leak (in CPU mode). When I process the same image multiple times memory consumption seems stable, but it increases rapidly when I start running inference on different images. Also, inference takes around 8 seconds... Any progress on solving the memory leak?

HanelYuki commented 4 years ago

I solved the problem by loading the VGG pretrained model locally. If it is not loaded, the parameters are fetched over HTTP, which is too slow for a web request.

hermanius commented 4 years ago

@HanelYuki The problem of slow inference or the memory leak problem? Which pytorch version?

HanelYuki commented 4 years ago

@hermanius slow inference

hermanius commented 4 years ago

@HanelYuki I've downloaded the None-VGG-BiLSTM-CTC.pth model and used it to replace craft_mlt_25k.pth as --trained_model when running test.py, but I get errors. Obviously I'm missing something. Can you help me out?

Loading weights from checkpoint (None-VGG-BiLSTM-CTC.pth)
Traceback (most recent call last):
  File "mmi_craft.py", line 145, in <module>
    net.load_state_dict(copyStateDict(torch.load(args.trained_model, map_location='cpu')))
  File "/root/miniconda/envs/craft/lib/python3.6/site-packages/torch/nn/modules/module.py", line 719, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for CRAFT:
  Missing key(s) in state_dict: "basenet.slice1.0.weight", "basenet.slice1.0.bias", "basenet.slice1.1.weight", "basenet.slice1.1.bias", "basenet.slice1.1.running_mean", "basenet.slice1.1.running_var", "basenet.slice1.3"..........
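Comparing the checkpoint's keys with the model's keys makes the mismatch visible before load_state_dict throws (a quick diagnostic sketch of my own, assuming the .pth file is a plain state_dict):

```python
import torch
from craft import CRAFT   # the detection model defined in this repo

ckpt = torch.load('None-VGG-BiLSTM-CTC.pth', map_location='cpu')   # recognition checkpoint
model_keys = set(CRAFT().state_dict().keys())
ckpt_keys = set(ckpt.keys())

# CRAFT expects keys such as "basenet.slice1.0.weight"; a checkpoint trained for a
# different architecture contains entirely different names, hence the Missing key(s) error.
print("sample keys only in the model:     ", sorted(model_keys - ckpt_keys)[:5])
print("sample keys only in the checkpoint:", sorted(ckpt_keys - model_keys)[:5])
```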

HanelYuki commented 4 years ago

@hermanius You can run craft.py with 'predicted = true' at line 8, and then you will find some helpful errors.

ShroukMansour commented 4 years ago

@piernikowyludek @p9anand I've downgraded as you stated and the memory-leak problem disappeared, but now the testing time is very long, over 30 seconds! I can't figure out why. I'm on Python 3.7 with the same environment requirements as in the requirements.txt file.

ShroukMansour commented 4 years ago

@zcswdt What are your library versions and your Python version? CPU or GPU? It takes over 30 s for me!

SabraHashemi commented 3 years ago

@HanelYuki I tried your answer; it does not work for me:

(sabra) D:\CRAFT-pytorch-master>py craft.py
Traceback (most recent call last):
  File "craft.py", line 83, in <module>
    model = CRAFT(predicted=True).cuda()
TypeError: __init__() got an unexpected keyword argument 'predicted'

waflessnet commented 3 years ago

@hermanius
Sorry, but craft_mlt_25k.pth is for detecting words in a scene, while None-VGG-BiLSTM-CTC.pth is for word recognition; they serve different functions.

https://github.com/clovaai/deep-text-recognition-benchmark

hermanius commented 3 years ago

@waflessnet Thanks for the info. I use craft for text detection now, and tesseract for OCR. When I have the time I will try vgg again for OCR.

mohammedayub44 commented 3 years ago

@hermanius Did you happen to solve the problem? If so, please share any thoughts. I just started using this repo and am running into similar problems as stated above. Thanks!

manhcuogntin4 commented 3 years ago

I have the same timing issue, and memory consumption keeps increasing. Does anyone have a solution for this problem?

SaeedArisha commented 3 years ago

Was anyone able to find a concrete answer to this problem? I have the exact same issue (a single GPU, and the CRAFT detection model takes around 6-8 s or sometimes more). Based on some tests, if I set the width to 1280 before passing the image to the model, detection is faster: inference drops to about 3 s. I'm happy with that, but unsure about the reliability of this approach. Anyone up for discussion?
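To be concrete, by "set the width to 1280" I mean a pre-resize like the one below before the image ever reaches CRAFT (a sketch with OpenCV; 1280 is simply the value that worked for me, and whether passing a smaller --canvas_size to test.py achieves the same effect is an assumption I have not verified):

```python
import cv2

def resize_to_width(image, target_width=1280):
    """Downscale so the width becomes target_width, keeping the aspect ratio.
    Images that are already narrower are left untouched."""
    h, w = image.shape[:2]
    if w <= target_width:
        return image
    scale = target_width / w
    return cv2.resize(image, (target_width, int(h * scale)),
                      interpolation=cv2.INTER_LINEAR)

image = cv2.imread('sample.jpg')   # placeholder input path
image = resize_to_width(image)     # shrink before handing the image to CRAFT
```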

SabraHashemi commented 3 years ago

Are you using a laptop?

SaeedArisha commented 3 years ago

No, an Amazon EC2 instance (g4dn.xlarge).

SabraHashemi commented 3 years ago

Anyway, 6 to 8 s, or even 3 s, per image is too much; try another (local) system.

SaeedArisha commented 3 years ago

I don't have a local machine with a GPU, and it takes 6 s on Colab as well. I really need help. Can you guide me on which machine I should use?

SabraHashemi commented 3 years ago

I ran it on these GPUs: NVIDIA GTX 1050, 980 Ti, 780 Ti, and MX130; execution time increases in that order. I suggest using a 980 Ti; the expected execution time is about 40-100 ms.

SabraHashemi commented 3 years ago

I ran CRAFT on GPU, but after 3 hours I get a memory leak on the CPU side. I'm using PyTorch 1.6. Did you all have the same problem: running on GPU but leaking memory on the CPU? What should I do? Please kindly help :) @waflessnet @mohammedayub44 @p9anand @YoungminBaek @hermanius

jso4342 commented 1 year ago

Any update on this? I'm also using a GPU and it takes around 6 seconds to process one image. The forward-pass call (y, _ = net(x)) also takes the longest.

Couldn't find the solution...