Closed sunhucheng closed 1 year ago
Training model named: lite-mono
Models and tensorboard events files are saved to: ./tmp
Training is using: cuda
Using split: eigen_zhou
There are 133 training items and 30 validation items
Training
epoch 0 | lr 0.000100 |lr_p 0.000100 | batch 0 | examples/s: 2.0 | loss: 0.15695 | time elapsed: 00h00m04s | time left: 00h00m00s
epoch 0 | lr 0.000100 |lr_p 0.000100 | batch 5 | examples/s: 25.6 | loss: 0.14992 | time elapsed: 00h00m08s | time left: 01h38m10s
epoch 0 | lr 0.000100 |lr_p 0.000100 | batch 10 | examples/s: 21.3 | loss: 0.15792 | time elapsed: 00h00m12s | time left: 01h11m14s
Traceback (most recent call last):
File "C:\Users\Mr-Sun\Desktop\lite_mono\Lite-Mono\train.py", line 12, in
After the error above is reported, the program keeps running for a while, but a similar error is reported again later.
Hi,
Why do you only have 133 training items and 30 validation items? If you use a subset of KITTI, please make sure that your training lists contain only the files in that subset. Besides, I notice you are using .png files to train. The default training uses .jpg files, so you need to convert .png to .jpg.
Hi, I know what you mean. I do use the KITTI dataset, but my local hard drive is not large enough, so I want to run with a small dataset first and then run on the full dataset on the server later. I only kept 2011_09_26/2011_09_26_drive_0001_sync and 2011_09_26/2011_09_26_drive_0002_sync, and I modified eigen_zhou so that it only contains entries from those two drives. The data directory format is the same as the one you sent me in issue #41, which I mentioned earlier, and the data I kept in those two drives is exactly the same as in the complete dataset.

The counts add up: 107 + 76 = 183 = 133 + 30 + 20, that is, 133 for training in eigen_zhou, 30 for validation in eigen_zhou, and 20 for testing in eigen. In other words, in the complete dataset there are only 107 pictures in 2011_09_26/2011_09_26_drive_0001_sync and only 76 pictures in 2011_09_26/2011_09_26_drive_0002_sync. So I don't think it's a problem with my dataset. Is there something wrong with the code?

Regarding the picture format problem you mentioned, I have set png as the default picture format in trainer.py, so this is not the problem. I don't know if I made it clear. This problem is indeed a tricky one, and it's hard to solve without seeing the actual situation. I will continue to try. I will also modify the conda environment as you mentioned in issue #41.
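The trimming described above (keeping only the split entries that belong to the two retained drives) could be sketched roughly as follows. This assumes the split file is plain text with one `folder frame_index side` entry per line, which the thread does not confirm. `filter_split` is a hypothetical helper, not part of Lite-Mono.

```python
def filter_split(split_lines, kept_folders):
    """Keep only split entries whose folder is one of the retained drives.

    Assumes each entry looks like "<folder> <frame_index> <side>".
    """
    kept = set(kept_folders)
    result = []
    for line in split_lines:
        parts = line.split()
        # Skip blank lines and entries that point at drives not kept locally.
        if parts and parts[0] in kept:
            result.append(line.strip())
    return result
```

Filtering the list this way only guarantees that every listed folder exists. It does not guarantee that every frame the data loader ends up requesting is present on disk.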
The error you have encountered is due to missing data. It is not related to the code.
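One way to test the missing-data diagnosis on a trimmed dataset is to check, before training, that every frame the loader might request is actually on disk. The sketch below rests on several assumptions that the thread does not confirm: split entries have the form `folder frame_index side`, images live under `<root>/<folder>/image_02|image_03/data/<10-digit index><ext>` (the layout visible in the reported error paths), the side tokens `l` and `r` map to cameras 2 and 3, and the loader also reads the neighbouring frames at offsets -1 and +1. `expected_path` and `find_missing` are hypothetical helpers, not part of Lite-Mono.

```python
import os

# Assumed mapping from the side token in a split entry to the camera folder.
SIDE_TO_CAMERA = {"l": 2, "r": 3, "2": 2, "3": 3}


def expected_path(data_root, folder, frame_index, side, ext=".png"):
    """Build the image path under the assumed KITTI raw layout."""
    camera_dir = "image_0{}".format(SIDE_TO_CAMERA[side])
    file_name = "{:010d}{}".format(frame_index, ext)
    return os.path.join(data_root, folder, camera_dir, "data", file_name)


def find_missing(data_root, split_lines, frame_offsets=(0, -1, 1), ext=".png"):
    """Return (entry, offset, path) for every expected frame not on disk."""
    missing = []
    for line in split_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed entries
        folder, index, side = parts[0], int(parts[1]), parts[2]
        for offset in frame_offsets:
            path = expected_path(data_root, folder, index + offset, side, ext)
            if not os.path.isfile(path):
                missing.append((line.strip(), offset, path))
    return missing
```

If the returned list is not empty, each tuple names the split entry, the offset, and the path that would trigger a FileNotFoundError. An entry that points at the last frame of a drive would show up here with offset 1.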
OK, I will run the code on the full dataset on the server. Thank you for your reply.
Hi noahzn, I'm bothering you again! I ran train.py in PyCharm and the program reported an error: FileNotFoundError: [Errno 2] No such file or directory: 'C:\Users\Mr-Sun\Desktop\lite_mono\Lite-Mono\kitti_data\2011_09_26/2011_09_26_drive_0002_sync\image_03/data\0000000077.png'
But kitti_data\2011_09_26/2011_09_26_drive_0002_sync\image_03/data contains just 76 images.
Sometimes the program instead reports: FileNotFoundError: [Errno 2] No such file or directory: 'C:\Users\Mr-Sun\Desktop\lite_mono\Lite-Mono\kitti_data\2011_09_26/2011_09_26_drive_0001_sync\image_02/data\0000000108.png'
But kitti_data\2011_09_26/2011_09_26_drive_0001_sync\image_02/data contains just 107 images.
The errors are all of the same kind, so I think this is a common problem. Have you ever run into this situation? Do you know the reason for the error?