duanzhiihao / RAPiD

RAPiD: Rotation-Aware People Detection in Overhead Fisheye Images (CVPR 2020 Workshops)
http://vip.bu.edu/rapid/

assert torch.cuda.is_available(): will it work without a GPU? #20

Open RedOne88 opened 3 years ago

RedOne88 commented 3 years ago

Hi, and thanks for your code. I have a question: can your code work without a GPU, i.e., on CPU only? I can't seem to use the current version of the code; I always get this error:

    assert torch.cuda.is_available()
AssertionError

Also, regarding the input images: can your code run on grayscale fisheye images, or only on color ones? Thank you in advance for your reply!

duanzhiihao commented 3 years ago

Hi, thank you for your interest. Actually, CUDA is not required to run RAPiD. I have updated the api.py file; could you check whether you can run on CPU by passing the use_cuda=False argument to the Detector class?
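
For reference, a minimal CPU-inference sketch, assuming the Detector interface in api.py; the model name and weights path below are illustrative:

from api import Detector

# Run RAPiD on CPU by disabling CUDA; the weights path is illustrative
detector = Detector(model_name='rapid',
                    weights_path='./weights/rapid_fisheye.ckpt',
                    use_cuda=False)
detector.detect_one(img_path='./images/image1-002.jpg')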

For training, I do not intend to add CPU support, since training on CPU would be extremely slow and I doubt anyone would try it.

Also, regarding the input images: can your code run on grayscale fisheye images, or only on color ones?

Our code runs on color images; grayscale input is not directly supported. Could you share the error message if you are facing an error?

RedOne88 commented 3 years ago

Thank you for your reply. I was able to reproduce your results on your test images, thank you so much. However, when I tried it on grayscale images, it didn't work. Here is the error:

File "example.py", line 8, in <module>
    detector.detect_one(img_path='./images/image1-002.jpg',
  File "/home/redmou/Téléchargements/rapid/api.py", line 69, in detect_one
    detections = self._predict_pil(img, **kwargs)
  File "/home/redmou/Téléchargements/rapid/api.py", line 136, in _predict_pil
    dts = self.model(input_).cpu()
  File "/home/redmou/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/redmou/Téléchargements/rapid/models/rapid.py", line 71, in forward
    small, medium, large = self.backbone(x)
  File "/home/redmou/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/redmou/Téléchargements/rapid/models/backbones.py", line 80, in forward
    x = self.netlist[i](x)
  File "/home/redmou/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/redmou/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/home/redmou/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/redmou/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 399, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/redmou/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 395, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [32, 3, 3, 3], expected input[1, 1, 1024, 1024] to have 3 channels, but got 1 channels instead

Do you have any idea about this type of error? Thank you.

duanzhiihao commented 3 years ago

Hello. Our method is not designed for grayscale images, but here is a workaround: expand (repeat) the gray image into an RGB image before feeding it into the CNN:

# im is a torch tensor of shape (1, 1, h, w)
im = im.expand(-1, 3, -1, -1)  # repeat the gray channel to get (1, 3, h, w)
pred = model(im)
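
Equivalently, if you load images with PIL, you can force three channels at load time, before the image ever reaches the model (a sketch; the file path is illustrative):

from PIL import Image

# Grayscale files behave like RGB after this conversion
img = Image.open('./images/image1-002.jpg').convert('RGB')
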
RedOne88 commented 3 years ago

Hello, and thank you for all your answers. Could you help me, please? I managed to install a graphics card with 1 GB of GPU memory. I started training, but given the size of my GPU it wouldn't start, because part of the GPU memory is already taken up by PyTorch itself.

Could you help me train either on CPU (even though it would take a lot of time) or with my current GPU? Thank you very much.

duanzhiihao commented 3 years ago

Hi,

Given that your GPU has only 1 GB of memory, it would be challenging to fit the model. I recommend trying a Google Colab notebook, which gives you a free 4 GB GPU; please check the tutorials for Google Colab notebooks.

Alternatively, you can use Kaggle notebooks, which sometimes provide a free P100 GPU, which is powerful enough to train RAPiD.

Training on the CPU could take more than 20 days for the COCO and fisheye datasets. If you want to do it anyway, please let me know and I can provide a CPU training script within a few days.
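
In the meantime, the core change such a script would make is the usual device-agnostic PyTorch pattern (a sketch with a stand-in layer, not the actual train.py):

import torch
import torch.nn as nn

# Fall back to CPU automatically when CUDA is unavailable
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = nn.Conv2d(3, 32, 3).to(device)       # stand-in for the RAPiD model
imgs = torch.randn(1, 3, 64, 64).to(device)  # stand-in for a training batch
out = model(imgs)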

RedOne88 commented 3 years ago

Thank you so much for your response; I will test what you suggested. Considering the incredible work you have done, I have asked so many questions, and I am sorry, but I have one last question. To train on grayscale images, is it enough to just convert them to grayscale, or does the code have to change? My idea is to train on grayscale images and then optimize the final model as much as possible after training, because the provided one is very large (246 MB); I think training on grayscale images might reduce the model size. Could you tell me which piece of code to modify, if it is not complicated? Thank you so much.

duanzhiihao commented 3 years ago

No problem at all.

If you want to train on grayscale images, you need to modify the code here https://github.com/duanzhiihao/RAPiD/blob/e56ac87b0422d98f2942dbfbc64b745a3a3149ae/models/backbones.py#L47 to ConvBnLeaky(1, 32, k=3, s=1).

However, using grayscale instead of RGB barely reduces the model size, because it only affects the very first layer, which is tiny compared to the whole model.

An effective way to reduce model size (at the cost of some accuracy) is to use half precision (i.e., float16) instead of float32. To do this, please try:

model = model.half()  # convert all weights to float16
x = x.half()          # inputs must match the weight dtype
y = model(x)
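
Saving the converted weights is what actually shrinks the file on disk. A runnable sketch with a stand-in layer (the file names are illustrative; with RAPiD, model.half() works the same way):

import torch
import torch.nn as nn

model = nn.Conv2d(3, 32, 3)                          # stand-in model
torch.save(model.state_dict(), 'fp32.ckpt')          # float32 checkpoint
torch.save(model.half().state_dict(), 'fp16.ckpt')   # roughly half the size
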
RedOne88 commented 3 years ago

Thank you! :)

RedOne88 commented 3 years ago

I looked at the two options (Google Colab and Kaggle); they are very interesting. For testing your code with the provided GPUs, execution is very fast. For training, however, there is the problem of the disk space provided by the two platforms: the COCO train2017 images take 18 GB, and I don't see how I can load them in order to start training.


RedOne88 commented 3 years ago

Hello, I was able to solve this problem; the COCO image dataset is already available online.


RedOne88 commented 3 years ago

On the other hand, when we ran the program on Kaggle with the GPU activated, it worked at the beginning but then crashed with this error:

effective batch size = 8 * 16
initialing dataloader...
Only train on person images and objects
Loading annotations /kaggle/input/cocods/annotations_trainval2017/annotations/instances_train2017.json into memory...
Training on perspective images; adding angle to BBs
Using backbone Darknet-53. Loading ImageNet weights....
Warning: no ImageNet-pretrained weights found. Please check https://github.com/duanzhiihao/RAPiD for it.
Number of parameters in backbone: 40584928
2021-06-11 12:13:48.188692: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
/opt/conda/conda-bld/pytorch_1603729047590/work/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [0,0,0], thread: [0,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
Traceback (most recent call last):
  File "/kaggle/input/rapid-training/train.py", line 257, in <module>
    loss = model(imgs, targets, labels_cats=cats)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/kaggle/input/rapid-training/rapid/rapid/models/rapid.py", line 80, in forward
    boxes_M, loss_M = self.pred_M(detect_M, self.img_size, labels)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/kaggle/input/rapid-training/rapid/rapid/models/rapid.py", line 282, in forward
    target[b,best_n,truth_j,truth_i,0] = tx_all[b,:n][valid_mask] - tx_all[b,:n][valid_mask].floor()
RuntimeError: CUDA error: device-side assert triggered

I believe some box size or index overflows somewhere, but I don't know where it comes from.

duanzhiihao commented 3 years ago

It seems to be related to the COCO dataset format. Please check #11 to see if that solves your problem.
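
As a general CUDA debugging aid (not specific to RAPiD), forcing synchronous kernel launches makes the traceback point at the line that actually triggered the device-side assert:

import os

# Must be set before any CUDA work happens (e.g., at the top of train.py)
os.environ['CUDA_LAUNCH_BLOCKING'] = '1'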

RedOne88 commented 3 years ago

It effectively worked, but after two hours of training it crashed. Here is the error message:

Total time: 1:52:05.342283, iter: 0:00:13.397096, epoch: 3:26:20.508855
[Iteration 500] [learning rate 0.001] [Total loss 209.47] [img size 512]
level_16 total 8 objects: xy/gt 1.385, wh/gt 0.143, angle/gt 0.627, conf 44.588
level_32 total 1 objects: xy/gt 1.340, wh/gt 0.016, angle/gt 0.638, conf 13.681
level_64 total 12 objects: xy/gt 1.384, wh/gt 0.215, angle/gt 0.748, conf 105.670
Max GPU memory usage: 6.040322303771973 GigaBytes

Traceback (most recent call last):
  File "/kaggle/input/rapid-training/train.py", line 303, in <module>
    dts = api.detect_once(model, eval_img, conf_thres=0.1, input_size=target_size)
  File "/kaggle/input/rapid-training/rapid/rapid/api.py", line 175, in detect_once
    dts = model(input_img[None]).cpu().squeeze()
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/kaggle/input/rapid-training/rapid/rapid/models/rapid.py", line 71, in forward
    small, medium, large = self.backbone(x)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/kaggle/input/rapid-training/rapid/rapid/models/backbones.py", line 80, in forward
    x = self.netlist[i](x)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 423, in forward
    return self._conv_forward(input, self.weight)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 420, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [32, 3, 3, 3], expected input[1, 1, 608, 608] to have 3 channels, but got 1 channels instead

Do you have any idea about the source of this error? Thank you very much.

duanzhiihao commented 3 years ago

The error says that the input is a grayscale image, but the network expects an RGB image. Did you make any changes to the datasets.py script? https://github.com/duanzhiihao/RAPiD/blob/e56ac87b0422d98f2942dbfbc64b745a3a3149ae/datasets.py#L136

RedOne88 commented 3 years ago

Regarding the training with COCO, I did not manage to finish it; on Kaggle, the site crashes every time after 25 hours of execution. My idea is to build my own model with your algorithm. Starting from the existing one, I downloaded HABBOF, CEPDOF, and MW-R, and I want to build an image dataset that combines all of them by taking, for example, 1000 images from each. Do you think we could get good results training the model on a dataset that is not very large?

duanzhiihao commented 3 years ago

Do you think we could get good results training the model on a dataset that is not very large?

Yes, as long as you start from the pre-trained model and use a small learning rate.
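
A hedged sketch of that setup; the model class name, checkpoint path, and the 'model' key inside the checkpoint are assumptions based on common PyTorch conventions, not verified against this repo:

import torch
from models.rapid import RAPiD  # assumed class name in models/rapid.py

# Start from the pre-trained weights so a small dataset is enough
model = RAPiD()
state = torch.load('./weights/pretrained_rapid.ckpt', map_location='cpu')
model.load_state_dict(state.get('model', state))  # handles wrapped or plain state dicts

# Small learning rate so fine-tuning does not wreck the pre-trained weights
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)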

RedOne88 commented 3 years ago

Hello, I still can't finish training. On Kaggle, I am allowed 9 hours of continuous use. I chose the MW-R dataset and reduced the size of the images and annotations in the hope of reducing the training time. Otherwise, I have a 32-core machine with an old graphics card (Quadro K4200) that is not supported by PyTorch, so I plan to run the training on CPU. Could you provide a CPU-compatible version of your code, please?

RedOne88 commented 3 years ago

Please keep me informed if you have any ideas.

RedOne88 commented 3 years ago

Hello, could you provide me with a CPU training script? Thank you.