SysCV / r3d3


What is the purpose of generating training data, and why is it so slow? #11

Open zhanghm1995 opened 9 months ago

zhanghm1995 commented 9 months ago

Hi, thanks for opening this great work!

I wonder why the command for generating training data in step 1 is almost the same as the one used for evaluation. Also, why is the training-data generation step so slow, and why does it evaluate metrics like the ones below for the nuScenes dataset? (screenshot of evaluation metrics attached)

Are there any ideas to accelerate the training data generation step?

AronDiSc commented 9 months ago

Hi @zhanghm1995, thank you for your interest in our work.

Generating the data takes so long because we use a larger covisibility graph than the one used during inference, which yields slightly better pose estimates. This is specified through the --r3d3_graph_type=droid_slam parameter. Furthermore, the pre-computed cost volumes of the larger graph do not fit into (our) GPU memory, so we use the "memory efficient" approach (--r3d3_corr_impl=lowmem) proposed by the authors of RAFT, which is a bit slower. To speed up the process, you can remove these two parameters and fall back to the same covisibility graph used during inference.
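
In other words, the difference comes down to whether those two flags are passed. A minimal sketch is below; the entry-point script and the remaining arguments are placeholders, and the actual command is the one given in the README:

```bash
# Slower variant used for training-data generation: larger covisibility graph
# (better poses) plus the memory-efficient correlation implementation.
# <data_generation_entrypoint> and <other-args> are placeholders, not real names.
python <data_generation_entrypoint> <other-args> \
    --r3d3_graph_type=droid_slam \
    --r3d3_corr_impl=lowmem

# Faster variant: drop both flags to fall back to the inference-time
# covisibility graph and the standard pre-computed correlation volumes.
python <data_generation_entrypoint> <other-args>
```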

The metrics are generated because Lidar information is available for nuScenes; we do not make use of them during dataset generation. You can easily turn them off in the config files, which also lets you run the pipeline on other datasets where Lidar may not be available.

zhanghm1995 commented 9 months ago

Hi @AronDiSc, thanks for your patient reply. It helps me a lot.

Since I'm not familiar with this framework, I'd like to ask for a few more details about the repo.

  1. If I remove the --r3d3_corr_impl=lowmem parameter and have enough GPU memory to generate the data, could it cause any performance degradation?

  2. If I don't want to train on the full nuScenes training set and instead just overfit R3D3 on a single scene, would that be a reasonable way to reproduce your results? I have already run this experiment on scene-1094 and got the loss and metrics below; I'm not sure whether they look okay. (screenshot of training loss and metrics attached)

  3. How can I visualize the inference results like the ones shown in your paper? (example figure from the paper attached)

  4. How can I resume training from a checkpoint?

Thanks in advance.