sherwinbahmani / cc3d

CC3D: Layout-Conditioned Generation of Compositional 3D Scenes
https://sherwinbahmani.github.io/cc3d

Question about KITTI-360 dataset #9

Closed Mu-Yanchen closed 7 months ago

Mu-Yanchen commented 7 months ago

Hello, thanks for your great work!

When I tried to download the KITTI-360 dataset from https://www.cvlibs.net/datasets/kitti-360/download.php, I found there are many choices, including 2D data (e.g. Fisheye Images (355G), Fisheye Calibration Images (11G), and so on) and 3D data. Which ones did you use in your paper? Also, after downloading the dataset, is it recommended to use this tool https://github.com/QhelDIV/kitti360_renderer to preprocess the data?

Looking forward to your early reply!

sherwinbahmani commented 7 months ago

Hi,

Yes, please follow the README in https://github.com/QhelDIV/kitti360_renderer to create the KITTI dataset.

It mentions using data_2d_raw, data_poses, and 3d_bboxes_full. Download these and then process them with the scripts there. There is also a download script in that repository:

https://github.com/QhelDIV/kitti360_renderer/blob/main/download.py

Mu-Yanchen commented 7 months ago

Thank you for your quick response and instructive suggestions! I will first try to download and preprocess the kitti dataset according to the methods in this repository. Thank you again for your kind reply!

Mu-Yanchen commented 7 months ago

Hello, sorry for interrupting you again. When I used the tool https://github.com/QhelDIV/kitti360_renderer, I found that the xgutils package it depends on is no longer available. I would appreciate it if you could provide an alternative or a fix, so that people interested in this wonderful work can process the dataset and replicate the results later.

sherwinbahmani commented 7 months ago

I guess the issue is being solved here: https://github.com/QhelDIV/kitti360_renderer/issues/1

Mu-Yanchen commented 7 months ago

Thank you for your quick response; I can solve the above problems now! By the way, could you share the cfg needed in train.sh for the KITTI dataset? It seems to be missing from the code.

sherwinbahmani commented 7 months ago

It is not missing; just uncomment the following line instead of the 3dfront one: https://github.com/sherwinbahmani/cc3d/blob/62120dd131395362f0b5d955552ab80c245f0fee/train.sh#L2

If your dataset path points to the correctly processed KITTI dataset, it should work right away.
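For clarity, a sketch of the intended train.sh edit (the exact line contents here are an assumption pieced together from this thread, not copied from the repository):

```shell
# Sketch of the dataset switch near the top of train.sh:
# comment out the 3dfront line and uncomment the kitti one.
# dataset_name=3dfront
dataset_name=kitti
```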

Mu-Yanchen commented 7 months ago

Thank you for your reply! I noticed the commented-out # dataset_name=kitti. May I ask whether the cfg parameter needs to be modified for KITTI, or whether --cfg=3dfront_2d_volume also applies to KITTI?

sherwinbahmani commented 7 months ago

You don't have to change any other parameters.

Mu-Yanchen commented 7 months ago

Ok, thank you so much for your prompt reply! I will close this issue, thanks again!

Mu-Yanchen commented 7 months ago

Sorry to bother you again:

  1. I found that when I preprocess the KITTI dataset with python kitti360_processor.py from https://github.com/QhelDIV/kitti360_renderer using its default settings, it generates a dataset with inconsistent dimensions. I get a data structure like:

        output/kitti360_v1_512/
            images/
                2013_05_28_drive_0000_sync_00000000/
                    0000.png
                2013_05_28_drive_0000_sync_00000001/
                    0000.png
                ...
            labels/
                2013_05_28_drive_0000_sync_00000000/
                    boxes.npz
                2013_05_28_drive_0000_sync_00000001/
                    boxes.npz
                ...


    However, the image 0000.png has a resolution of 256×256 while the layout resolution in boxes.npz is 512×512, because kitti360_processor.py passes semantic_resolution as shown here: https://github.com/QhelDIV/kitti360_renderer/blob/main/kitti360_processor.py#L714. Because of this mismatch, there is a bug in the training process: https://github.com/sherwinbahmani/cc3d/blob/master/training/training_loop.py#L309. I don't know whether I am doing something wrong.

  2. I noticed that you discussed in #8 that "we discard scenes where the car is turning either left or right". I don't know how to discard these images, because I also generate 70k+ images, as in #8.

sherwinbahmani commented 7 months ago

What is the exact error message you get? The semantic resolution and image resolution do not have to match; the semantic resolution gets downsampled to 128 by default anyway.
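That downsampling step can be illustrated with a small sketch (a hypothetical nearest-neighbor version written for illustration only, not the actual code in this repository; the 128 default comes from the comment above):

```python
import numpy as np

def downsample_semantics(sem: np.ndarray, target: int = 128) -> np.ndarray:
    """Nearest-neighbor downsample of an (H, W) semantic label map to (target, target)."""
    h, w = sem.shape
    rows = np.arange(target) * h // target  # source row index for each output row
    cols = np.arange(target) * w // target  # source column index for each output column
    return sem[np.ix_(rows, cols)]

sem_512 = np.zeros((512, 512), dtype=np.int64)  # a 512-resolution layout, as in boxes.npz
sem_small = downsample_semantics(sem_512)
print(sem_small.shape)  # (128, 128)
```

Nearest-neighbor indexing is used here because semantic labels are categorical and must not be interpolated.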

Mu-Yanchen commented 7 months ago

Thanks for your reply! Because of the error below, I mistakenly thought these two dimensions needed to be aligned to be concatenated:

    Exception has occurred: ValueError
    all the input array dimensions except for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 512 and the array at index 1 has size 256
      File "/scratch/cc3d/training/training_loop.py", line 309, in 
        eval_real_imgs = np.concatenate([np.concatenate(img_seed, axis=2) for img_seed in eval_real_imgs], axis=1)
      File "/scratch/cc3d/training/training_loop.py", line 309, in training_loop
        eval_real_imgs = np.concatenate([np.concatenate(img_seed, axis=2) for img_seed in eval_real_imgs], axis=1)
      File "/scratch/cc3d/train_modified.py", line 53, in subprocess_fn
        training_loop.training_loop(rank=rank, **c)
      File "/scratch/cc3d/train_modified.py", line 102, in launch_training
        subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
      File "/scratch/cc3d/train_modified.py", line 637, in main
        launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
      File "/scratch/cc3d/train_modified.py", line 642, in 
        main() # pylint: disable=no-value-for-parameter
    ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 512 and the array at index 1 has size 256

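For context, the error is the generic NumPy rule that all non-concatenation axes must match exactly. A minimal, self-contained illustration (the shapes mirror the 512 vs. 256 sizes above; the subsampling "fix" is only a sketch, not the repository's code):

```python
import numpy as np

a = np.zeros((3, 512, 256))  # stand-in for a 512-resolution semantic map
b = np.zeros((3, 256, 256))  # stand-in for a 256-resolution image

try:
    # Fails: along axis 1 the sizes are 512 vs. 256.
    np.concatenate([a, b], axis=2)
except ValueError as err:
    print("mismatch:", err)

# Rendering both modalities at the same resolution avoids the error;
# here a naive 2x subsampling along axis 1 stands in for that.
a_256 = a[:, ::2, :]
out = np.concatenate([a_256, b], axis=2)
print(out.shape)  # (3, 256, 512)
```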
sherwinbahmani commented 7 months ago

Can you open an issue or make a pull request in the KITTI rendering repository for that? This visualization requires both to have the same resolution, so it is best to render both the dataset and the semantics at 256. Maybe you can set the preprocessing to 256, or adjust the visualization. Sorry for that.

Mu-Yanchen commented 7 months ago

Thank you. If I understand the error correctly, I need to change https://github.com/QhelDIV/kitti360_renderer/blob/main/kitti360_processor.py#L714 to pass a semantic_resolution of 256 instead of 512, or adjust the visualization to concatenate correctly. I will open an issue or make a pull request in the KITTI rendering repository once I have fully solved the bugs. Besides, I would appreciate it if you could tell me how to solve the second problem mentioned above, i.e. how "we discard scenes where the car is turning either left or right". Thanks again for your prompt reply!

sherwinbahmani commented 7 months ago

It seems like the code to apply the filters properly is missing; I can't find it either. Can you please open an issue there? https://github.com/QhelDIV/kitti360_renderer/blob/main/kitti360_dataset.py There is some filtering here, but I don't see where the filtering of the dataset actually happens.

Mu-Yanchen commented 7 months ago

Thank you for your prompt reply!

I have just reviewed that file and found it is exactly as you said: I can't see where the filtering of the dataset happens. Because of my limited knowledge of the subject, I'm not quite sure what issue to raise with the kitti360_renderer author. I think it may have something to do with an update of the tool. Could you provide the kitti360_renderer version you used when processing the dataset? If I can use that version, I should be able to produce the expected dataset just like you did.

Best wishes

sherwinbahmani commented 7 months ago

I let Xingguang (the creator of https://github.com/QhelDIV/kitti360_renderer) know about this issue and he will look into it. Hopefully this can be fixed soon! Sorry for this issue.

Mu-Yanchen commented 7 months ago

Thanks for your generous help and timely reply. I will follow up on the improvements to kitti360_renderer 🫡

QhelDIV commented 7 months ago

Hi @Mu-Yanchen, thanks for your interest in this project. I have updated the kitti360_renderer repo by adding an example script, main.py, to show the filtering process. Basically, it examines the camera pose (and other properties) of each frame in each sequence and flags it with a boolean indicating whether to keep it.

After that, it copies the kept data items into a new directory to form the filtered dataset.

Please post here or in the kitti360_renderer repo if you have additional questions after trying main.py.
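As I understand the description (per-frame camera poses examined, boolean keep-flags, kept frames copied out), the turn filtering could be sketched roughly like this. The yaw-change threshold and the pose representation are my assumptions for illustration, not the actual logic in main.py:

```python
import numpy as np

def keep_flags(yaws: np.ndarray, max_turn_deg: float = 5.0) -> np.ndarray:
    """Flag each frame True if the heading change to the next frame is small,
    i.e. the car is driving roughly straight (not turning left or right)."""
    d = np.abs(np.diff(yaws, append=yaws[-1]))  # per-frame heading change (radians)
    d = np.minimum(d, 2 * np.pi - d)            # handle wrap-around on the circle
    return np.degrees(d) <= max_turn_deg

# A straight stretch followed by a sharp turn:
yaws = np.radians([0, 1, 2, 30, 60, 61])
flags = keep_flags(yaws)
print(flags)  # straight frames flagged True, turning frames False
```

Frames flagged False would then simply be skipped when copying into the filtered dataset directory.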

Mu-Yanchen commented 7 months ago

Hi @QhelDIV, thanks for your prompt reply. I will try the updated version right away and report any further questions. Thank you again for your work!