[higher version] Errors when I run the train.py with pytorch 1.7.0

microsoft / singleshotpose

This research project implements a real-time object detection and pose estimation method as described in the paper, Tekin et al. "Real-Time Seamless Single Shot 6D Object Pose Prediction", CVPR 2018. (https://arxiv.org/abs/1711.08848).

MIT License

[higher version] Errors when I run the train.py with pytorch 1.7.0 #163

Open yueqin-li opened 3 years ago

yueqin-li commented 3 years ago

Hi, I ran into a problem when I tried the code with a newer version of PyTorch (1.7.0). I wonder whether it is caused by the version change. The error is as follows:

2021-08-05 17:33:32 epoch 0, processed 0 samples, lr 0.000100
8: nGT 8, recall 0, proposals 1349, loss: x 43.780529, y 60.316505, conf 203.617203, total 104.097031
Traceback (most recent call last):
  File "train.py", line 395, in <module>
    niter = train(epoch)
  File "train.py", line 105, in train
    loss.backward()
AttributeError: 'numpy.float32' object has no attribute 'backward'

nickhward commented 3 years ago

If you didn't mess with the base code, then I would think it was because of the increase in torch version. It might save you some time just to downgrade to torch 0.4.1 and make sure you have a cuda version that is compatible with that torch version (CUDA 9.0 for example).

nickhward commented 3 years ago

If you do figure out how to fix this issue with 1.7.0, let me know how you solved it, as I will soon be attempting to use PyTorch 1.7.0 as well. If I run into this problem myself, I'll make sure to post a fix if you haven't already. Thank you.

yueqin-li commented 3 years ago

Thanks for your reply. Your suggestion is the direct and safe way. However, since my CUDA version is too new for torch 0.4.1, I would rather change some code than reconfigure the environment. I changed several lines in the "region_loss.py" file and fortunately it worked. I think the problem is caused by data types behaving differently between torch versions.
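The exact lines changed in "region_loss.py" are not reproduced in this thread, so the following is only a minimal sketch of the general failure mode, not the actual fix. It assumes the loss value passed through NumPy (or `.item()`) at some point, which detaches it from autograd; the tensors `pred` and `target` and the squared-error loss are hypothetical stand-ins.

```python
import numpy as np
import torch

# Hypothetical stand-ins for the network output and the ground truth.
pred = torch.tensor([0.5, 1.5], requires_grad=True)
target = torch.tensor([1.0, 1.0])

# Routing the loss through NumPy turns it into a plain scalar with no graph.
loss_np = np.float32(((pred - target) ** 2).sum().item())
try:
    loss_np.backward()
except AttributeError as err:
    message = str(err)  # same kind of error as in the traceback above

# Keeping every step as a torch operation preserves the autograd graph.
loss = ((pred - target) ** 2).sum()
loss.backward()
grad = pred.grad  # gradient of sum((pred - target)^2) is 2 * (pred - target)
```

If the loss must also be logged as a plain number, calling `loss.item()` only for printing and keeping the tensor for `backward()` avoids the problem.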

nickhward commented 3 years ago

Which lines in "region_loss.py" did you end up changing, if you don't mind me asking?

yueqin-li commented 3 years ago

I attached my changes here for your reference. I would be happy to know whether this also works for you, so that I can confirm it is the right solution. debug_of_region_loss

nickhward commented 3 years ago

I'm trying to use torch 1.7.0 now. I haven't reached that error yet. But at the moment I'm getting this error:

  File "train.py", line 403, in <module>
    niter = train(epoch)
  File "train.py", line 104, in train
    loss = region_loss(output, target, epoch)
  File "/home/nicholasward2/anaconda3/envs/my_yolo/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/nicholasward2/YOLOR-POSE/region_loss.py", line 99, in forward
    nB = output.data.size(0)
AttributeError: 'list' object has no attribute 'data'

I was curious if you ran into this error too?

yueqin-li commented 3 years ago

Sorry, but I didn't.

JY-JOKE commented 1 year ago

In train.py there is an error related to the PyTorch and NumPy versions in these lines:

corners2D_gt = np.array(np.reshape(box_gt[:num_keypoints*2], [num_keypoints, 2]), dtype='float32')
corners2D_pr = np.array(np.reshape(box_pr[:num_keypoints*2], [num_keypoints, 2]), dtype='float32')

Do you know how to fix this code? @yueqin-li
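The comment above does not include the error message, so the cause is unknown. One possible cause, assumed here, is that `box_gt` and `box_pr` hold zero-dimensional tensors rather than plain numbers; tensors on the GPU, for instance, cannot be converted to NumPy directly. The sketch below converts each element to a Python float first. The helper `to_corners` is hypothetical, and it assumes the slice is meant to cover two coordinates per keypoint. NumPy scalars stand in for tensors here because both expose `.item()`.

```python
import numpy as np

def to_corners(box, num_keypoints):
    """Return the first num_keypoints * 2 values of box as a
    (num_keypoints, 2) float32 array (hypothetical helper)."""
    # .item() turns a 0-dim tensor or NumPy scalar into a Python number;
    # plain numbers are passed through float().
    flat = [v.item() if hasattr(v, "item") else float(v)
            for v in box[:num_keypoints * 2]]
    return np.array(flat, dtype="float32").reshape(num_keypoints, 2)

num_keypoints = 9
# Stand-in data: 18 coordinates followed by one extra value that is ignored.
box_gt = [np.float32(i) for i in range(num_keypoints * 2)] + [np.float32(1.0)]
corners2D_gt = to_corners(box_gt, num_keypoints)
```

If the actual error turns out to be something else, posting the full traceback would make it easier to pin down.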