shahabty opened this issue 6 years ago
@shahabty Hi, I'm having the same problem. After 100 epochs, the network has learned almost nothing. Have you found a way to solve it? Thanks.
Hello,
Sorry, I don't have access to my computer now. I will upload the weights on the 19th :)
@ChiWeiHsiao Thanks for your reply! Looking forward to it.
Btw, could you kindly share some tricks for the training? I also want to reproduce the results, but my network has learned almost nothing. I'm quite confused about how long the training sequence should be, and whether we need to first train the network on short sequences and then increase the length and fine-tune the weights. I doubt an LSTM trained on sequences of limited length can perform well on longer ones.
Thanks a lot!
@Yusufma03 I still have the problem with performance. I think the pretrained weights that @ChiWeiHsiao will upload will be helpful. I would also appreciate it if @ChiWeiHsiao could explain how the network was trained (especially data pre-processing, batch size, trajectory length, learning rate policy, and which pretrained FlowNet weights should be loaded).
@shahabty Hi, I just managed to train the network to make at least some "reasonable" predictions by adjusting the ratio between the angle loss and the translation loss: I increased the weight of the angle loss as the epochs went on. The other things are to train for more epochs and to use a larger image size. However, overfitting is still a problem; the performance of the trained network on the validation set is still not good enough. It would be nice if @ChiWeiHsiao could give me some suggestions on how to handle this. Thanks a lot!
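The weighting idea above could be sketched roughly like this (NumPy for simplicity; in the repo this would operate on torch tensors). The 6-DoF layout (3 Euler angles first, then 3 translation components) and the `k_start`/`k_growth` schedule are my assumptions for illustration, not the poster's exact code:

```python
import numpy as np

# Hypothetical sketch: weight the angle loss more heavily than the
# translation loss, and grow that weight with the epoch number.
# pred/target: arrays of shape (batch, seq, 6) = [angles(3), translation(3)].
def pose_loss(pred, target, epoch, k_start=50.0, k_growth=1.0):
    angle_mse = np.mean((pred[..., :3] - target[..., :3]) ** 2)
    trans_mse = np.mean((pred[..., 3:] - target[..., 3:]) ** 2)
    k = k_start + k_growth * epoch  # angle weight increases over epochs
    return k * angle_mse + trans_mse
```

With this schedule the rotation term dominates more and more as training progresses, which matches the poster's description of increasing the angle weight "according to the epochs".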
Hi @shahabty @Yusufma03, The trained weights can be downloaded here. This model was trained for about 400 epochs, but it also suffers from overfitting. :( The authors of the DeepVO paper seem to get good results, but the values of some hyperparameters are not stated in the paper. BTW, one difference between my setting and the authors' is the input image size: I shrank the images to 1/4 of the original size due to memory limitations; I'm not sure whether this has a large impact on performance.
Hello, I think the main problem is the pre-processing, which must be done correctly to prevent overfitting. The number of images in KITTI is not enough for training, so the authors might have applied some transformations to the inputs to generate more training data. Another possible issue is the trajectory length, which might not be 10.
Hi~ o( ̄▽ ̄)ブ @ChiWeiHsiao Sorry to bother you! Recently I trained your net for about 250 epochs. The loss value dropped to about 0.67, but the results I got were terrible. So I'd like to load your trained weights (trained for about 400 epochs), but I ran into a problem when loading them: the error says it is caused by a different PyTorch version, yet I also trained with torch 0.4.0. This confuses me a lot! 😭 Were your results obtained entirely with the code in this repo, without any modification? If so, I won't retrain. 😂 The first picture below is one of my results; the second shows the error when loading your trained weights. Can you give me some suggestions? QAQ
@yp233 I met the same problem, and I have solved it. You just need to modify that line as follows:

```python
M_deepvo.load_state_dict(torch.load(par.load_model_path), strict=False)
```
Hi, I can't download the KITTI dataset when running downloader.sh; it always stops. Can you tell me some other way to download it?
Hi @yuanjinsheng, The files are large; maybe there is not enough free space on your computer? Could you post your error message?
Hi, Please go to the KITTI website and register for the dataset. They will send the download link to you. There are different KITTI benchmarks; you probably only need the odometry and raw KITTI benchmarks.
Hi @ChiWeiHsiao, the space is enough; the download just stops after a little while. Do you have another link?
@shahabty I can't find the same dataset as in downloader.sh. How did you do it?
@ChiWeiHsiao thank you for your help.
Hello, guys. There's a parameter 'seq_len' in param.py. I want to know the meaning of 'seq_len'. Why is it (5, 7)? Shouldn't it be a single number, if it's the sequence length? @Yusufma03 @shahabty Thank you very much~
@sunnyHelen Hi, that's the sequence length used in training the LSTM, which is a hyperparameter you should tune. The author of DeepVO proposed to randomly segment the videos into subsequences of different lengths and said this is useful for avoiding overfitting. (5, 7) means the minimum length of a subsequence is 5 and the maximum is 7. However, the author didn't mention what length he used in the paper.
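The random-segmentation idea can be sketched like this (purely illustrative; the repo's data_helper.py does this with more bookkeeping). `segment_video` and its arguments are hypothetical names, not the repo's API:

```python
import random

# Cut a video of n_frames frames into consecutive chunks whose lengths
# are drawn uniformly from [min_len, max_len], e.g. seq_len = (5, 7).
def segment_video(n_frames, min_len=5, max_len=7, seed=None):
    rng = random.Random(seed)
    segments, start = [], 0
    while start + min_len <= n_frames:
        length = rng.randint(min_len, max_len)
        end = min(start + length, n_frames)
        segments.append((start, end))  # frames [start, end)
        start = end
    return segments
```

Each epoch can re-segment with a fresh seed, so the LSTM sees subsequences with different lengths and boundaries, which is the regularization effect described above.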
Ok. Thank you very much.
Sorry to bother you~ I found that the files "data_helper.py" and "helper.py" were updated and changed. In my case, they no longer work after the change, though they worked before:

```
File "/project/RDS-FEI-HMZ-RW/Original_DeepVO-pytorch-master_lstm/data_helper.py", line 204, in __getitem__
    groundtruth_rotation = raw_groundtruth[1][0].reshape((3, 3)).T  # opposite rotation of the first frame
ValueError: cannot reshape array of size 0 into shape (3,3)
```

I want to ask what the problem is. Have you ever encountered this? @ChiWeiHsiao @Yusufma03 @shahabty
@sunnyHelen try to rerun preprocess.py to update the files in pose_GT folder. The changes you mentioned affected the structure of the files.
Oh, right. Thank you for your help~
Sorry to bother you again~ I ran train.py using the pretrained FlowNet weights "pretrained/flownets_bn_EPE2.459.pth.tar" and got the training loss shown here. The training and validation losses are very small. I want to ask if this is normal?
@sunnyHelen It is normal.
> sorry to bother you~ I found that the file "data_helper.py" and "helper.py" were updated and changed. And in my case the file after changed can't work now and they could work before.
> File "/project/RDS-FEI-HMZ-RW/Original_DeepVO-pytorch-master_lstm/data_helper.py", line 204, in `__getitem__`: groundtruth_rotation = raw_groundtruth[1][0].reshape((3, 3)).T — ValueError: cannot reshape array of size 0 into shape (3,3)

I get the same error when I run main.py.
> @sunnyHelen try to rerun preprocess.py to update the files in pose_GT folder. The changes you mentioned affected the structure of the files.

I tried rerunning preprocess.py, but it doesn't work. Can you help me?
@kourong try to clean up 'datainfo' folder and run main.py again.
> @kourong try to clean up 'datainfo' folder and run main.py again.

Thank you, I'll try it.
Hi, sorry to bother you~ Is the way you calculate mse_rotate and mse_translate the same as in the DeepVO paper? Why do I get much worse results than those in the paper?
@sunnyHelen It would be incorrect to directly compare mse_rotate and mse_translate with the metrics in the paper, because they are calculated differently. The authors report RMSE of translation as a percentage of the distance travelled, and RMSE of rotation in degrees per 100 m.
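As a rough illustration of those metrics, here is a simplified sketch (not the official KITTI devkit, which averages over subsequence lengths from 100 m to 800 m; here a single segment length is used). Poses are assumed to be 4x4 homogeneous matrices, and `rel_errors` is a hypothetical helper name:

```python
import numpy as np

def trajectory_distances(poses):
    # cumulative path length (metres) at each frame
    steps = [np.linalg.norm(poses[i + 1][:3, 3] - poses[i][:3, 3])
             for i in range(len(poses) - 1)]
    return np.concatenate([[0.0], np.cumsum(steps)])

def rel_errors(gt, pred, seg_len=100.0):
    dist = trajectory_distances(gt)
    t_errs, r_errs = [], []
    for i in range(len(gt)):
        # first frame at least seg_len metres further along the GT path
        j = int(np.searchsorted(dist, dist[i] + seg_len))
        if j >= len(gt):
            break
        dgt = np.linalg.inv(gt[i]) @ gt[j]      # relative GT motion
        dpr = np.linalg.inv(pred[i]) @ pred[j]  # relative predicted motion
        err = np.linalg.inv(dgt) @ dpr          # residual transform
        t_errs.append(np.linalg.norm(err[:3, 3]) / seg_len)
        angle = np.degrees(np.arccos(
            np.clip((np.trace(err[:3, :3]) - 1.0) / 2.0, -1.0, 1.0)))
        r_errs.append(angle / seg_len * 100.0)  # degrees per 100 m
    return np.mean(t_errs) * 100.0, np.mean(r_errs)  # (%, deg/100m)
```

This is why a raw per-frame MSE on pose components is not comparable to the numbers in the paper: the paper's errors are relative, distance-normalized quantities.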
Oh. I got it. Thanks a lot~
> Recently I trained your net about 250 epochs. The loss value reduced to about 0.67. However the results I got were terrible. [...] The following pic was one of my results.

How did you make that picture?
Hello, can anyone tell me where we can get the path lengths and the speeds of the various sequences in the KITTI color dataset used here? The authors presented their analysis based on them, but I could not find the lengths and speeds of the trajectories. I would like to verify their claims on my results as well.
@akshay-iyer The dataset contains groundtruth data, so you can calculate the length by summing up the distances between each pair of consecutive points.
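That suggestion could look like the sketch below. The function names are hypothetical, and the 10 Hz frame rate is an assumption based on how KITTI is typically captured; `positions` is an (N, 3) array of x, y, z coordinates taken from the groundtruth pose files:

```python
import numpy as np

# Path length: sum of distances between consecutive groundtruth positions.
def path_length(positions):
    diffs = np.diff(positions, axis=0)
    return float(np.sum(np.linalg.norm(diffs, axis=1)))

# Average speed in km/h, assuming frames are recorded at `fps` Hz.
def average_speed(positions, fps=10.0):
    duration = (len(positions) - 1) / fps            # seconds
    return path_length(positions) / duration * 3.6   # m/s -> km/h
```

Running this per sequence gives the trajectory lengths and average speeds needed to compare against the per-sequence analysis in the paper.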
Many thanks @alexart13. I have a few more questions; apologies if they are naive, I'm just trying to learn and grow in deep learning.
Thanks, Akshay
@akshay-iyer
Oh. I got it. Thanks a lot~
Hello, I don't know how to calculate the RMSE loss. Can you tell me explicitly? Thanks
> Hello, Can anyone tell me where can we get the path lengths and the speeds of the various sequences in the KITTI color dataset used? [...]

Hello, have you found out how to calculate mse_rotate and mse_translate the same way as the results in the DeepVO paper?
> Sorry to bother you again~ I ran train.py using the pretrained FlowNet weights "pretrained/flownets_bn_EPE2.459.pth.tar" and got the training loss like this. The training and validation losses are very small. I want to ask if this is normal?
Hi, sorry to bother you. I just want to know how you got such a low loss. I used the model weights provided by alexart13, but only got 0.04 train loss and 0.05 validation loss. Maybe it's because I could not load the optimizer weights due to the PyTorch version, but I see you got a better result using only the pretrained FlowNet weights, so I wonder if you could share some details about your training (for example, what measures or what changes)?
@sunnyHelen Hi, I wonder how you were able to train the model in main.py. When I ran the code, I noticed it took over 30 minutes to run a single epoch, even though CUDA is available (I have an NVIDIA GPU). I'm not sure what's causing the slow runtime.
Hello, can you upload the pretrained weights? I trained for 72 epochs, but the test loss is too high and the results are not visually convincing at all. Thanks