Closed wasup07 closed 3 months ago
this is my dataset structure :
Hi, you can take a look at the KITTI data loader (https://github.com/noahzn/Lite-Mono/blob/main/datasets/kitti_dataset.py), and you need to write your own NuScenes data loader. You can choose a camera, and this argument decides which frames are loaded as the previous, current, and next frame. If the sequential frames in your dataset are very close (i.e., the car moved very slowly), then you can try something like [0, -5, 5], which means you choose the -5th frame, the current frame, and the 5th frame for training. Two or three splits are enough.
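A tiny sketch of what those offsets mean (the function and variable names here are illustrative, not the project's actual code):

```python
# Sketch: how frame offsets such as [0, -5, 5] select training frames.
# `frame_index` is the current frame; each offset is added to it.

def frames_for_sample(frame_index, frame_ids):
    """Return the absolute frame indices loaded for one training sample."""
    return [frame_index + offset for offset in frame_ids]

# With frame_ids = [0, -5, 5], the current, previous, and next frames are:
print(frames_for_sample(100, [0, -5, 5]))  # [100, 95, 105]
```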
Thank you for your answer. Does this mean that I can only choose one camera angle, and that I cannot train on all six of them simultaneously? And do I have to divide my dataset into train, val, and test?
Yes, you can only choose one camera, as this is monocular self-supervised training. A train set and a test set should be enough. If you have more data, you can split off an additional val set.
Thank you very much for your time. Could I ask you to put this Issue on hold for 3 weeks or one month, as I may come back to you with further questions?
Ok, no problem.
Hello Noah,
I'd like to ask you a question. After re-reading your scripts (options.py, kitti_dataset.py, trainer.py) and checking the KITTI raw dataset, I discovered that you trained your model on grayscale images, as opposed to the NuScenes dataset (which I plan to use). So I'd like to check which files and folders are used: for the moment I know that the velodyne files are used, but for the rest I'm not sure.
I really appreciate your quick response.
I've already seen the steps for splitting the data, but I'd like to know which files other than the images are used, so that I can determine, by comparing with the NuScenes dataset, what I should delete or add in the trainer.py file. Also, is the purpose of the MonoDataset class just to convert images and augment them?
When you get the splits, the .txt files will show you all the images used in training and val. Therefore, you can just check the files manually according to their paths. The mono_dataset.py class is a base class used in the project, and any specific dataset class should inherit from this base class, which makes your life easier. For example, for your NuScenes dataset, you can simply create a new file nuscenes_dataset.py and define the class as class NUSCENESDataset(MonoDataset). Then you can write any dataset-specific loading code in this class, and you don't need to change the base class.
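As a rough illustration (the MonoDataset stub below is a stand-in for the real base class in mono_dataset.py, and the NuScenes folder and filename layout are assumptions), the subclass only needs to override the path and loading helpers:

```python
import os

class MonoDataset:  # stand-in for Lite-Mono's mono_dataset.MonoDataset
    def __init__(self, data_path, img_ext=".jpg"):
        self.data_path = data_path
        self.img_ext = img_ext

class NUSCENESDataset(MonoDataset):
    """Illustrative subclass; real image loading goes in overrides like this."""

    def get_image_path(self, folder, frame_index, side):
        # Zero-padded filenames are an assumption for this sketch.
        f_str = "{:010d}{}".format(frame_index, self.img_ext)
        return os.path.join(self.data_path, folder, f_str)

print(NUSCENESDataset("/data/nuscenes").get_image_path("CAM_BACK", 473, None))
```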
Thanks for your quick reply,
I wrote my own Nuscenes_dataset file but before that, I checked the functionality of the model if I train on the kitti dataset: I used eigen_zhou to separate the val and train datasets and I didn't change anything else but I still have the following error that I couldn't solve with the debugger mode on vscode.
I think there's an error in this section, but I'm not sure:
Instead of `features = self.models["encoder"](inputs["color_aug", 0, 0])` I think it should be `features = self.models["encoder"](inputs[("color_aug", 0, 0)])`.
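(Side note on that line: in Python the two spellings are actually equivalent, since the comma already forms a tuple inside the subscript, so adding the parentheses alone wouldn't change behavior:)

```python
# Both subscripts build the same tuple key; d[a, b] is sugar for d[(a, b)].
inputs = {("color_aug", 0, 0): "tensor"}
assert inputs["color_aug", 0, 0] is inputs[("color_aug", 0, 0)]
print("equivalent lookups")
```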
I think the error comes from the incompatibility of your PyTorch and CUDA version.
Yes, it seems that when I tried to update it, it didn't work, but when I installed a specific version, it did. I have a couple of questions:
In the training phase, is it possible to shuffle images (those in left_back or right_back in the case of the NuScenes dataset) regardless of chronological order? I created txt files like those in the eigen_zhou folder for the NuScenes dataset.
In my case, the camera intrinsics change depending on the camera. I suppose I need to change mono_dataset, since my camera intrinsics matrix is 3×3, unlike that of the KITTI loader, and add conditional handling.
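To be concrete, here is the kind of conversion I have in mind, assuming mono_dataset.py expects a normalized 4×4 intrinsics matrix as in the KITTI loader (the calibration values below are made up, not my real NuScenes calibration):

```python
import numpy as np

# Sketch: convert a per-camera 3x3 pixel-unit intrinsics matrix (as stored
# in NuScenes' calibrated_sensor.json) into a normalized 4x4 matrix
# (fx, cx divided by image width; fy, cy by image height).
def to_normalized_4x4(K3, width, height):
    K = np.eye(4, dtype=np.float32)
    K[:3, :3] = K3
    K[0, :] /= width   # fx, skew, cx
    K[1, :] /= height  # fy, cy
    return K

K3 = np.array([[1266.4, 0.0, 816.3],
               [0.0, 1266.4, 491.5],
               [0.0, 0.0, 1.0]], dtype=np.float32)
K = to_normalized_4x4(K3, width=1600, height=900)
```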
What's more, my initial images are 900×1600: 1600 is a multiple of 32, but that's not the case for 900. I was thinking of adding resizing to the NuScenes dataset in the get_color function. Does this seem correct?
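If I understand mono_dataset.py correctly, the images are already resized to the --height/--width options, so passing multiples of 32 there may be enough. Either way, a small helper (hypothetical, not in the repo) to snap each dimension to a multiple of 32 would look like:

```python
# Snap a dimension to the nearest multiple of 32, since the encoder
# downsamples by powers of two. 900x1600 would become 896x1600.
def snap_to_multiple(x, base=32):
    return base * round(x / base)

print(snap_to_multiple(900), snap_to_multiple(1600))  # 896 1600
```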
Another question: since I'm not using a stereo camera, can I deactivate the --use_stereo option in options.py?
Then it's ok. Thank you for your quick reply.
There are a few points I don't understand.
Here is an example of file names in train_files.txt
Yes, the first image pairs are 473r, 472r, 474r (0, -1, 1). Please make sure that you define the frames correctly.
> Yes, the first image pairs are 473r, 472r, 474r (0, -1, 1). Please make sure that you define the frames correctly.

Sorry, perhaps I wasn't paying close enough attention, but I can't see how the images are matched (I can't find images 473r, 472r, 474r in train_files.txt for the KITTI dataset).
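The split files don't list image pairs explicitly; each line gives one center frame ("<folder> <frame_index> <side>"), and the frame offsets generate its neighbors at load time. A sketch (the folder name is just an example):

```python
# How one split-file line plus frame offsets yields the image triplet.
# "473 r" means frame 473 of the right camera; with frame_ids (0, -1, 1)
# the loader reads frames 473, 472, and 474 of that sequence.
def triplet_from_line(line, frame_ids=(0, -1, 1)):
    folder, index, side = line.split()
    return [(folder, int(index) + off, side) for off in frame_ids]

print(triplet_from_line("2011_09_26/2011_09_26_drive_0022_sync 473 r"))
```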
Hello, thank you again for your quick reply!
I've managed to start training on the NuScenes dataset and no longer have any problems with it, but I do have a small problem and would like a suggestion from you. I haven't been able to use tensorboard because of an incompatibility with my PyTorch version, which I can't change, otherwise I won't be able to train Lite-Mono. Do you have an idea for following the training without using tensorboard?
Hi, you can install tensorboardX.
I've tried, but it requires updating the PyTorch version:
You can install using pip install tensorboardX
I've tried it, and it seems to be a common tensorboard problem (I've had the same feedback from a colleague who uses tensorboard on similar projects).
Or you can try wandb.
In this case, do I need to modify the trainer.py file?
Yes, you need to add some lines of code.
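If you just need something lightweight, a minimal fallback (a sketch, not part of the repo) is to append scalars to a CSV file you can plot later; you would call log() from trainer.py wherever it currently writes to the tensorboard writer:

```python
import csv
import os
import time

# Dependency-free scalar logger as a tensorboard stand-in.
class CSVLogger:
    def __init__(self, path):
        self.path = path
        is_new = not os.path.exists(path)
        self.f = open(path, "a", newline="")
        self.w = csv.writer(self.f)
        if is_new:
            self.w.writerow(["time", "step", "name", "value"])

    def log(self, step, name, value):
        self.w.writerow([time.time(), step, name, float(value)])
        self.f.flush()  # survive crashes mid-training

logger = CSVLogger("train_log.csv")
logger.log(0, "loss", 0.142)
```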
Hello, thank you for all your help over the last couple of weeks. I wanted to ask how I can generate the depth estimation images at the end, and how to evaluate them.
https://github.com/noahzn/Lite-Mono/blob/main/evaluate_depth.py This file generates depth predictions and compares with the ground-truth.
Thank you again for your quick reply!
I would like to ask you where I can retrieve the following error metrics:
I didn't find them in the trainer.py file, if I'm not mistaken.
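(For clarity, these are the metrics I mean, sketched here as they are usually defined in the standard monodepth evaluation protocol; the function name is mine, not necessarily the repo's:)

```python
import numpy as np

# Standard depth-error metrics: abs_rel, sq_rel, RMSE, RMSE(log), and the
# delta accuracies a1/a2/a3 (fraction of pixels within 1.25^k of the truth).
def compute_errors(gt, pred):
    thresh = np.maximum(gt / pred, pred / gt)
    a1 = (thresh < 1.25).mean()
    a2 = (thresh < 1.25 ** 2).mean()
    a3 = (thresh < 1.25 ** 3).mean()
    rmse = np.sqrt(((gt - pred) ** 2).mean())
    rmse_log = np.sqrt(((np.log(gt) - np.log(pred)) ** 2).mean())
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean(((gt - pred) ** 2) / gt)
    return abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3
```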
Hi Noah,
I'd like to get your feedback on training Lite-Mono on the NuScenes dataset. I trained the model several times on a subset of NuScenes just to check the model's effectiveness. The subset was about 1250 images from the central rear camera (no stereo camera): 1000 images for the training set and 250 for the validation set. The problem is that the training loss always oscillates between 0.06 and 0.08, so it's not stable and doesn't decrease, unlike what I observed on the KITTI dataset. I tried tuning the --lr option, but nothing changed.

Attached are the mono_dataset file, the nuscenes_dataset file, and the options file. Could you give me your opinion on what the problem could be, or what I should change in these files? I have not modified the trainer file; I've added the JSON files for the camera information. calibrated_sensor.json sensor.json options_or.txt mono_dataset_nuscenes.txt nuscenes_dataset.txt

Thanks in advance
Hi, using 1000 images for training is not enough for this self-supervised method. Did you check the visualizations in the tensorboard?
I am now closing this issue as there is no further update.
Hello, I read your article and appreciate your work on Lite-Mono. I would like to train your model (using the pre-trained backbone) on the NuScenes dataset. I have a slight problem: I don't understand from the article how the dataset should be split and what structure it should follow. Is it enough to have three folders (train, val, and test), and is it not necessary to divide the images according to the angle at which they were taken? I plan to train the model in a self-supervised way.