Open luke4u opened 4 years ago
+1 for this. Ensuring a GPU is available in a production environment in the cloud can be a real nuisance. Also since MMdetection 2.0 there is support for CPU-only mode. So if someone is able to reproduce or convert the model to mmdetection 2.0-compatible format, then this model can be used for inference in a CPU-only environment. The nice part is that training can still be done with GPU, but the resulting checkpoints will be able to load and run in a CPU-only environment too.
See also this page on cpu-only mode and this page on upgrading from 1.x to 2.0. Unfortunately I wasn't able to successfully convert the model myself using the provided conversion tool. Hopefully the creator can help out and provide trained models compatible with mmdetection 2.0.
Since the creator of issue #77 mentioned he was able to convert the model (but unfortunately did not share his config or conversion steps), I decided to give it another shot myself, successfully this time.
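As a minimal sketch of the CPU fallback described above, the device string could be chosen at runtime. The helper name `pick_device` is hypothetical and not part of mmdetection; it only assumes that PyTorch, when installed, exposes `torch.cuda.is_available()`.

```python
def pick_device() -> str:
    """Return 'cuda:0' when PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch  # imported lazily so the helper also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda:0" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

The resulting string can then be passed as the `device` argument of `init_detector`, so the same script runs on a GPU machine and on a CPU-only one.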
I would like to refer you all to my branch at iiLaurens/CascadeTabNet:mmdet2x. It includes a demo notebook on how to run using mmdetection v2.3.0 in a cpu only colab environment. You can find that notebook here. All checkpoint files can be found under the releases on this page. Happy inferencing!
Hi @iiLaurens , thank you for sharing the workflow!
Noticed you are using mmcv-full==1.0.5
There seems to be no distribution available for the Windows platform at the link below, and mmcv-full relies on CUDA? (Correct me if I am wrong.)
https://openmmlab.oss-accelerate.aliyuncs.com/mmcv/dist/index.html
I had to install mmcv==1.0.5, but ran into an error: ModuleNotFoundError: No module named 'mmcv._ext'
Btw, did you manage to run the model on a Windows platform with only CPU?
As far as I know there is no Windows version of mmcv-full. And as you noticed, plain mmcv simply doesn't work at all. I run it in a Linux environment.
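A quick way to tell the two installs apart, sketched under the assumption that the compiled ops live in the `mmcv._ext` module named in the error above. The helper `has_module` is a hypothetical name; it uses only the standard library.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be located on the current Python path."""
    try:
        # find_spec imports parent packages but not the target module itself.
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. `mmcv`) is not installed at all.
        return False


print("mmcv installed:", has_module("mmcv"))
print("compiled ops (mmcv._ext) present:", has_module("mmcv._ext"))
```

If the first line prints True and the second False, the environment most likely has the lightweight mmcv package rather than mmcv-full.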
Hi @iiLaurens, did you only convert the models, or did you also train for some epochs after converting? I am able to convert the model, but its output is not as good as your model's.
I did not do any further training, just converting. If my memory serves me correctly, I had to convert both the model and the config file. Did you convert both?
No, I have only converted the model, and I am using a config file from mmdetection version 2 that is compatible with the model.
How do I convert the model and the config file from mmdetection version 1 to version 2?
I have done something like this:
import torch
checkpoint = torch.load("/content/epoch_36.pth")
## Remove the path that causes an error during conversion
checkpoint['meta']['config'] = checkpoint['meta']['config'].replace("/content/drive/My Drive/chunk cascade_mask_rcnn_hrnetv2p_w32_20e.py\n","")
torch.save(checkpoint, "/content/epoch_35.pth")
##convert
!python mmdetection/tools/upgrade_model_version.py /content/epoch_35.pth /content/epoch_37.pth --num-classes 81
##detection
from mmdet.apis import init_detector, inference_detector, show_result_pyplot
import mmcv
# Load model
config_file = '/content/mmdetection/configs/hrnet/cascade_mask_rcnn_hrnetv2p_w32_20e_coco.py'
checkpoint_file = '/content/epoch_37.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')
# Test a single image
img = "/content/5.29.2020 COI - Corvias Construction Partners, LLC_0001.jpg"
# Run Inference
result = inference_detector(model, img)
# Visualization results
show_result_pyplot(model, img, result, score_thr=0.85)
@iiLaurens thanks for this effort. Does that also mean I can use the CascadeTabNet architecture with my already installed mmdetection v2.3, even though the network was trained on mmdetection v1.2?
@iiLaurens Thank you so much for your work. The only thing I changed to make it work on my CPU was to run !pip install mmcv-full==1.0.5 -f https://download.openmmlab.com/mmcv/dist/cpu/torch1.5.0/index.html
instead of !pip install mmcv-full==1.0.5+torch1.5.0+cpu -f https://openmmlab.oss-accelerate.aliyuncs.com/mmcv/dist/index.html
I fine-tuned a model and was able to upgrade it using mmdetection/tools/upgrade_model_version.py, then used @iiLaurens's config to run both init_detector and inference_detector with the following package setup:
mmcv-full==1.0.5
mmdet==2.3.0
numpy==1.21.3
opencv-python==4.5.4.58
pycocotools==2.0.2
torch==1.5.1+cpu
torchvision==0.6.1+cpu
However, when I run CPU inference from my own checkpoint, I get back empty arrays for all 81 classes. The only difference is that I started from the General Model table detection checkpoint and trained with the original config.
If anyone has ideas about what to try or change, I would greatly appreciate it.
UPDATE: In case it helps anyone who is also fine-tuning their model: I can't take a model I fine-tuned in mmdet 1.2, upgrade it, and then train with mmdet > 2 or infer on CPU from it. I was able to upgrade their checkpoint and then train and infer on CPU (I used General Model table detection epoch_24.pth). If it is possible, please let me know.
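One thing that may be worth checking when an upgraded checkpoint returns only empty arrays is whether its parameter names match the ones the model built from the config expects. This is an unconfirmed guess at a cause, not a diagnosis from this thread: as far as I know, checkpoint loading in mmcv is non-strict by default, so mismatched names are skipped with a warning and those layers keep their initial weights. The helper `diff_keys` below is hypothetical and works on plain lists of names.

```python
from typing import Dict, Iterable, List


def diff_keys(checkpoint_keys: Iterable[str], model_keys: Iterable[str]) -> Dict[str, List[str]]:
    """Compare parameter names from a checkpoint against those the model expects."""
    ckpt, model = set(checkpoint_keys), set(model_keys)
    return {
        "missing_in_checkpoint": sorted(model - ckpt),
        "unexpected_in_checkpoint": sorted(ckpt - model),
    }


# Toy names for illustration only; real names come from the state dicts.
report = diff_keys(
    ["backbone.conv1.weight", "bbox_head.0.fc_cls.weight"],
    ["backbone.conv1.weight", "roi_head.bbox_head.0.fc_cls.weight"],
)
print(report)
```

With real objects, the two inputs could be the keys of `torch.load(path, map_location="cpu")["state_dict"]` and of `model.state_dict()`. Two empty lists would rule this cause out.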
@iiLaurens, thank you for your work. Is it possible to run your notebook or reproduce your result in a local Windows environment? I tried and failed to install the requirements, and the failure was similar to what @luke4u saw. If it is not possible to reproduce on Windows, could you share the Linux environment details, or suggest the necessary packages to build a Dockerfile for it?
Thank you for your time, also thanks in advance if anyone could help out with some ideas.
I was able to get it to run from a Docker container (for use in AWS Lambda). This is the Dockerfile:
FROM public.ecr.aws/lambda/python:3.8
RUN yum -y install gcc mesa-libGL
RUN pip install \
torch==1.6.0+cpu \
torchvision==0.7.0+cpu \
-f https://download.pytorch.org/whl/torch_stable.html \
&& rm -rf /root/.cache/pip
RUN pip install \
mmdet==2.3.0 \
pycocotools==2.0.2 \
requests
RUN pip install mmcv-full==1.0.5 -f https://download.openmmlab.com/mmcv/dist/cpu/torch1.6.0/index.html
And you need the converted checkpoint and config files that you can find in my repo.
Then some code like this should make it work:
from mmdet.apis import inference_detector, init_detector
config = '/pdfextract/cascadeTabNet/cascade_mask_rcnn_hrnetv2p_w32_20e.py'
checkpoint = '/pdfextract/cascadeTabNet/General.Model.table.detection.v2.pth'
model = init_detector(config, checkpoint, device='cpu')
results = inference_detector(model, img)
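In the snippet above, `img` is presumably an image path or an already loaded array. As a rough sketch of what to do with the output, the helper below keeps only boxes above a score threshold. It assumes the mmdetection 2.x result layout: one list of `[x1, y1, x2, y2, score]` rows per class, wrapped in a `(bbox_results, segm_results)` tuple for mask models. The name `boxes_above` and the toy data are made up for illustration; the 0.85 default mirrors the `score_thr` used earlier in this thread.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float, float]  # x1, y1, x2, y2, score


def boxes_above(result, score_thr: float = 0.85) -> List[List[Box]]:
    """Keep, per class, the boxes whose score is at least `score_thr`."""
    # Mask models return (bbox_results, segm_results); box-only models return the list directly.
    bbox_results = result[0] if isinstance(result, tuple) else result
    return [
        [tuple(float(v) for v in row) for row in class_boxes if float(row[4]) >= score_thr]
        for class_boxes in bbox_results
    ]


# Toy stand-in for a detector result with two classes (values are made up).
fake_result = ([[[10, 20, 200, 120, 0.97], [5, 5, 50, 50, 0.30]], []], None)
print(boxes_above(fake_result))
```

Because the helper only iterates over rows, it should accept the NumPy arrays returned by `inference_detector` as well as the plain lists used here.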
Thanks so much for your suggestions. I will try to build a similar Ubuntu container to run it on my local Windows machine.
@iiLaurens ,
Thanks a lot again. I just want to let you know that I was able to build a running ubuntu container on windows thanks to your suggestion.
Now, I could get the inference results without any problems on windows with just the CPU. Awesome work!
Can you please elaborate on your steps?
Hi folks, I admire the work of @iiLaurens and appreciate the team. However, I'm getting the error below while using a Colab notebook with CPU. I would highly appreciate it if any of you could help resolve this issue.
ERROR: Could not find a version that satisfies the requirement torch==1.5.1+cpu (from versions: 1.11.0, 1.11.0+cpu, 1.11.0+cu102, 1.11.0+cu113, 1.11.0+cu115, 1.11.0+rocm4.3.1, 1.11.0+rocm4.5.2, 1.12.0, 1.12.0+cpu, 1.12.0+cu102, 1.12.0+cu113, 1.12.0+cu116, 1.12.0+rocm5.0, 1.12.0+rocm5.1.1, 1.12.1, 1.12.1+cpu, 1.12.1+cu102, 1.12.1+cu113, 1.12.1+cu116, 1.12.1+rocm5.0, 1.12.1+rocm5.1.1, 1.13.0, 1.13.0+cpu, 1.13.0+cu116, 1.13.0+cu117, 1.13.0+cu117.with.pypi.cudnn, 1.13.0+rocm5.1.1, 1.13.0+rocm5.2, 1.13.1, 1.13.1+cpu, 1.13.1+cu116, 1.13.1+cu117, 1.13.1+cu117.with.pypi.cudnn, 1.13.1+rocm5.1.1, 1.13.1+rocm5.2, 2.0.0, 2.0.0+cpu, 2.0.0+cpu.cxx11.abi, 2.0.0+cu117, 2.0.0+cu117.with.pypi.cudnn, 2.0.0+cu118, 2.0.0+rocm5.3, 2.0.0+rocm5.4.2, 2.0.1, 2.0.1+cpu, 2.0.1+cpu.cxx11.abi, 2.0.1+cu117, 2.0.1+cu117.with.pypi.cudnn, 2.0.1+cu118, 2.0.1+rocm5.3, 2.0.1+rocm5.4.2)
ERROR: No matching distribution found for torch==1.5.1+cpu
Hey Abhishek, this is more related to your dependencies.
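To make the dependency point concrete: the list pip prints starts at 1.11.0, so the package index offers no 1.5.x build for this interpreter at all. A likely cause, which is an assumption and not confirmed in this thread, is that the Colab runtime's Python version is newer than the ones the torch 1.5.1 wheels were built for. The sketch below only parses the error text; `available_versions` is a hypothetical helper name.

```python
import re
from typing import List


def available_versions(pip_error: str) -> List[str]:
    """Extract the version strings pip lists after 'from versions:'."""
    match = re.search(r"\(from versions: ([^)]*)\)", pip_error)
    if not match:
        return []
    return [v.strip() for v in match.group(1).split(",") if v.strip()]


# Shortened copy of the error above; the real message lists many more builds.
error = (
    "ERROR: Could not find a version that satisfies the requirement "
    "torch==1.5.1+cpu (from versions: 1.11.0, 1.11.0+cpu, 2.0.1, 2.0.1+cpu)"
)
versions = available_versions(error)
print("requested build offered:", "1.5.1+cpu" in versions)
print("first version listed:", versions[0])
```

If the requested build is absent, the options are an environment with an older Python (such as the Docker image shown earlier in this thread) or a newer set of pinned versions, which would need its own compatibility check.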
Could you please elaborate my friend
Please elaborate more
Hi Guys,
First of all, thank you so much for sharing this amazing work. I have run the demo colab and got a good result.
To confirm: is a CUDA-enabled GPU a must to run inference?
As #34 mentioned, would you consider easing the dependency on GPU? This could make the model more scalable.
Thanks again. Luke