SamsungLabs / imvoxelnet

[WACV2022] ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection
MIT License

How to run single view detection on ScanNet? #72

Open gyhandy opened 1 year ago

gyhandy commented 1 year ago

Hi, thank you for your excellent work! If I want to run single-view detection on ScanNet, are there any suggestions on code modifications? Thanks!

filaPro commented 1 year ago

Hi @gyhandy , You can just set n_images to 1 here.

gyhandy commented 1 year ago

Thank you for your reply! Can I train ImVoxelNet using a single view on ScanNet? Thanks!

filaPro commented 1 year ago

Setting n_images to 1 in the training pipeline should also be ok.
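
For example, a rough sketch of what single-view ScanNet sampling could look like (the MultiViewPipeline transform and n_images argument follow this repo's multi-view configs; the other values are illustrative, so check them against configs/imvoxelnet/ before use):

```python
# Illustrative sketch, not a complete config: only the view-sampling part is shown.
# n_images controls how many views are drawn per scene.
train_pipeline = [
    dict(
        type='MultiViewPipeline',
        n_images=1,  # multi-view ImVoxelNet samples 50 training views per scene
        transforms=[
            dict(type='LoadImageFromFile'),
            dict(type='Resize', img_scale=(640, 480), keep_ratio=True),
        ]),
    # ... remaining annotation/formatting transforms unchanged
]
test_pipeline = [
    dict(
        type='MultiViewPipeline',
        n_images=1,  # multi-view ImVoxelNet samples 100 test views per scene
        transforms=[
            dict(type='LoadImageFromFile'),
            dict(type='Resize', img_scale=(640, 480), keep_ratio=True),
        ]),
    # ... remaining formatting transforms unchanged
]
```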

gyhandy commented 1 year ago

Many thanks for your quick response! If I train on ScanNet with a single view, do I need step 3 here during the ScanNet dataset preprocessing?

"3. In this directory, extract RGB image with poses by running python extract_posed_images.py. This step is optional. Skip it if you don't plan to use multi-view RGB images. Add --max-images-per-scene -1 to disable limiting number of images per scene. ScanNet scenes contain up to 5000+ frames per each. After extraction, all the .jpg images require 2 Tb disk space. The recommended 300 images per scene require less then 100 Gb. For example multi-view 3d detector ImVoxelNet samples 50 and 100 images per training and test scene."

filaPro commented 1 year ago

Yes, you need it.

gyhandy commented 1 year ago

Thanks again! Then which single view among the 300 images of the same scene will the model use for training and testing?

filaPro commented 1 year ago

Just a random one for each train or test iteration.

gyhandy commented 1 year ago

For reproducibility, is it possible to fix the frame for each scene? Thanks!

filaPro commented 1 year ago

You can add some workaround here.
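
For instance, a minimal sketch of such a workaround (a hypothetical function, not code from this repo) that swaps the random sampling for a deterministic choice:

```python
import numpy as np

# Hypothetical sketch: pick view indices deterministically instead of randomly,
# so the same frame(s) are used for a scene on every run.
def select_view_ids(num_frames, n_images, deterministic=True):
    if deterministic:
        # e.g. always the first frame for single-view experiments
        return np.arange(min(n_images, num_frames))
    # original behaviour: a random subset of the scene's frames
    return np.random.choice(num_frames, n_images, replace=n_images > num_frames)

print(select_view_ids(num_frames=300, n_images=1))  # -> [0] on every run
```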

gyhandy commented 1 year ago

I appreciate your help! I will try and update you here. Thanks!

gyhandy commented 1 year ago

Previously I used the MMDetection3D repo to run ImVoxelNet experiments, but it does not support ScanNet, so I switched to this repo. However, I found that the same conda environment I built for MMDetection3D could not be used directly to run the code here. For instance, why do you need to constrain the mmcv versions to mmcv_minimum_version = '1.1.5' and mmcv_maximum_version = '1.3.0'?

Another question: how should the MMDetection3D implementation of ImVoxelNet (which currently supports SUN RGB-D) be changed to run on the ScanNet dataset? Thanks!

filaPro commented 1 year ago

This repo is ~3 years old, and mmdetection3d has had several major releases since then, having almost nothing in common with version 0.8.0 that we are using here. The only way to run this repo is to follow the versions of all packages in our Dockerfile.

gyhandy commented 1 year ago

Thanks! I ran the Dockerfile (sudo docker build -t imvoxelnet .) and got the following error:

ERROR [ 5/16] RUN pip install mmdet==2.10.0 11.6s


[ 5/16] RUN pip install mmdet==2.10.0:
0.852 Collecting mmdet==2.10.0
0.958 Downloading mmdet-2.10.0-py3-none-any.whl (547 kB)
1.096 Requirement already satisfied: six in /opt/conda/lib/python3.7/site-packages (from mmdet==2.10.0) (1.14.0)
1.137 Collecting terminaltables
1.149 Downloading terminaltables-3.1.10-py2.py3-none-any.whl (15 kB)
1.266 Collecting mmpycocotools
1.282 Downloading mmpycocotools-12.0.3.tar.gz (23 kB)
1.565 Requirement already satisfied: numpy in /opt/conda/lib/python3.7/site-packages (from mmdet==2.10.0) (1.18.1)
2.022 Collecting matplotlib
2.036 Downloading matplotlib-3.5.3-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl (11.2 MB)
2.413 Requirement already satisfied: setuptools>=18.0 in /opt/conda/lib/python3.7/site-packages (from mmpycocotools->mmdet==2.10.0) (46.4.0.post20200518)
3.211 Collecting cython>=0.27.3
3.226 Downloading Cython-3.0.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.5 MB)
3.565 Collecting kiwisolver>=1.0.1
3.578 Downloading kiwisolver-1.4.4-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.1 MB)
3.731 Collecting pyparsing>=2.2.1
3.742 Downloading pyparsing-3.1.1-py3-none-any.whl (103 kB)
3.826 Collecting packaging>=20.0
3.837 Downloading packaging-23.1-py3-none-any.whl (48 kB)
3.903 Collecting cycler>=0.10
3.916 Downloading cycler-0.11.0-py3-none-any.whl (6.4 kB)
3.995 Collecting python-dateutil>=2.7
4.010 Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
4.036 Requirement already satisfied: pillow>=6.2.0 in /opt/conda/lib/python3.7/site-packages (from matplotlib->mmdet==2.10.0) (7.1.2)
4.214 Collecting fonttools>=4.22.0
4.228 Downloading fonttools-4.38.0-py3-none-any.whl (965 kB)
4.347 Requirement already satisfied: typing-extensions; python_version < "3.8" in /opt/conda/lib/python3.7/site-packages (from kiwisolver>=1.0.1->matplotlib->mmdet==2.10.0) (4.7.1)
4.348 Building wheels for collected packages: mmpycocotools
4.349 Building wheel for mmpycocotools (setup.py): started
4.830 Building wheel for mmpycocotools (setup.py): finished with status 'error'
4.830 ERROR: Command errored out with exit status 1:
......
......
Dockerfile:17
15 | # Install MMCV
16 | RUN pip install mmcv-full==1.2.7 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.6.0/index.html
17 | >>> RUN pip install mmdet==2.10.0
18 |
19 | # Install MMDetection

ERROR: failed to solve: process "/bin/sh -c pip install mmdet==2.10.0" did not complete successfully: exit code: 1


Could you please help to provide a potential solution? Thanks!

filaPro commented 1 year ago

Can you try with RUN conda install cython before RUN pip install mmdet==2.10.0 and tell me if it helps?

gyhandy commented 1 year ago

Thanks for your reply! I will try and update here. Another question: if we do monocular detection on ScanNet (you also show single-view test results), but we only have ground-truth object labels for the whole scene, how do we know which objects in the scene are visible in a given single view? Thanks!

gyhandy commented 1 year ago

After adding RUN conda install cython, the Docker image builds, thanks! But I face a new error when I run bash tools/dist_train.sh configs/imvoxelnet/imvoxelnet_sunrgbd_fast.py 1. It looks like a CUDA or PyTorch error; here are the details. Do you have a recommended GPU card? Thanks!

Traceback (most recent call last):
  File "tools/train.py", line 166, in <module>
    main()
  File "tools/train.py", line 162, in main
    meta=meta)
  File "/opt/conda/lib/python3.7/site-packages/mmdet/apis/train.py", line 82, in train_detector
    find_unused_parameters=find_unused_parameters)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 333, in __init__
    self.broadcast_bucket_size)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 549, in _distributed_broadcast_coalesced
    dist._broadcast_coalesced(self.process_group, tensors, buffer_size)
RuntimeError: CUDA error: no kernel image is available for execution on the device
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py", line 261, in <module>
    main()
  File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py", line 257, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/opt/conda/bin/python3', '-u', 'tools/train.py', '--local_rank=0', 'configs/imvoxelnet/imvoxelnet_sunrgbd_fast.py', '--launcher', 'pytorch']' returned non-zero exit status 1.

filaPro commented 1 year ago

how to know which objects in the scene are visible in the given single view?

Yes, that is not trivial and probably requires extracting not only the RGB but also a depth image to check for occlusions. That's why we recommend training single-view detection only on SUN RGB-D.
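
For reference, such a check would look roughly like this (a generic sketch with hypothetical names, not code from this repo): project each ground-truth box center into the view with the camera intrinsics/extrinsics and compare against the depth image to reject occluded or out-of-frame objects.

```python
import numpy as np

# Hypothetical sketch: decide whether a 3D box (given by its center in world
# coordinates) is visible in a single view, using that view's depth image.
def box_center_visible(center_world, extrinsic, intrinsic, depth, depth_tol=0.1):
    """center_world: (3,), extrinsic: (4, 4) world->camera, intrinsic: (3, 3),
    depth: (H, W) depth image in meters."""
    # World -> camera coordinates.
    p = extrinsic @ np.append(center_world, 1.0)
    x, y, z = p[:3]
    if z <= 0:  # behind the camera
        return False
    # Camera -> pixel coordinates.
    u, v, _ = intrinsic @ np.array([x, y, z]) / z
    u, v = int(round(u)), int(round(v))
    h, w = depth.shape
    if not (0 <= u < w and 0 <= v < h):  # outside the image
        return False
    # Occluded if the measured depth is clearly in front of the box center.
    d = depth[v, u]
    return d <= 0 or z <= d + depth_tol
```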

RuntimeError: CUDA error: no kernel image is available for execution on the device

That is probably not a bug in our code. Did you check that PyTorch can, for example, create a single tensor on the GPU in this Docker image on your hardware?
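
For example, inside the running container (plain PyTorch calls, nothing specific to this repo):

```python
import torch

# Minimal GPU sanity check: if this already raises
# "no kernel image is available for execution on the device",
# the problem is the PyTorch/CUDA build vs. the GPU, not this repo's code.
print(torch.__version__, torch.version.cuda, torch.cuda.is_available())
x = torch.ones(3, device='cuda')
print((x + x).sum().item())  # expected: 6.0
```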

gyhandy commented 1 year ago

Thank you for your reply! Do you have recommendations for other datasets (besides SUN RGB-D) that can be used for indoor single-view detection? Also, I am writing code to run single-view / multi-view detection on ScanNet in the MMDetection3D repo (currently, MMDetection3D only supports single view on SUN RGB-D and does not support multi-view on ScanNet); could you please help refine it? Or do you have a recommended strategy for modifying the current code to run in the official MMDetection3D codebase? Thanks!

gyhandy commented 1 year ago

Yes, PyTorch can create the tensor.

But when we use the GPU, it still shows different errors. One thing to note: when building the Docker image, we changed "RUN pip install mmcv-full==1.2.7+torch1.6.0+cu101 -f https://openmmlab.oss-accelerate.aliyuncs.com/mmcv/dist/index.html" to "RUN pip install mmcv-full==1.2.7",

because otherwise there is an error saying no matching version can be found for "mmcv-full==1.2.7+torch1.6.0+cu101 -f https://openmmlab.oss-accelerate.aliyuncs.com/mmcv/dist/index.html".

Do you think this is the reason for the errors? We tried both a 3090 and a 1080 Ti; neither works.

Here is the error on the 1080 Ti:

RuntimeError: unable to write to file
/mmdetection3d/mmdet3d/models/dense_heads/imvoxel_head_v2.py:172: UserWarning: This overload of nonzero is deprecated:
    nonzero(Tensor input, *, Tensor out)
Consider using one of the following signatures instead:
    nonzero(Tensor input, *, bool as_tuple) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
  flatten_valids
Exception ignored in: <function _MultiProcessingDataLoaderIter.__del__ at 0x7fb5d6811170>
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1101, in __del__
    self._shutdown_workers()
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1075, in _shutdown_workers
    w.join(timeout=_utils.MP_STATUS_CHECK_INTERVAL)
  File "/opt/conda/lib/python3.7/multiprocessing/process.py", line 140, in join
    res = self._popen.wait(timeout)
  File "/opt/conda/lib/python3.7/multiprocessing/popen_fork.py", line 45, in wait
    if not wait([self.sentinel], timeout):
  File "/opt/conda/lib/python3.7/multiprocessing/connection.py", line 920, in wait
    ready = selector.select(timeout)
  File "/opt/conda/lib/python3.7/selectors.py", line 415, in select
    fd_event_list = self._selector.poll(timeout)
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 727) is killed by signal: Bus error. It is possible that dataloader's workers are out of shared memory. Please try to raise your shared memory limit.
Traceback (most recent call last):
  File "tools/train.py", line 166, in <module>
    main()
  File "tools/train.py", line 162, in main
    meta=meta)
  File "/opt/conda/lib/python3.7/site-packages/mmdet/apis/train.py", line 170, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 125, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
    **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/parallel/distributed.py", line 46, in train_step
    output = self.module.train_step(*inputs[0], **kwargs[0])
  File "/opt/conda/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 247, in train_step
    losses = self(**data)

Here is the error on the 3090:

Traceback (most recent call last):
  File "tools/train.py", line 166, in <module>
    main()
  File "tools/train.py", line 162, in main
    meta=meta)
  File "/opt/conda/lib/python3.7/site-packages/mmdet/apis/train.py", line 82, in train_detector
    find_unused_parameters=find_unused_parameters)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 333, in __init__
    self.broadcast_bucket_size)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 549, in _distributed_broadcast_coalesced
    dist._broadcast_coalesced(self.process_group, tensors, buffer_size)
RuntimeError: CUDA error: no kernel image is available for execution on the device
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py", line 261, in <module>
    main()
  File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py", line 257, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/opt/conda/bin/python3', '-u', 'tools/train.py', '--local_rank=0', 'configs/imvoxelnet/imvoxelnet_sunrgbd_fast.py', '--launcher', 'pytorch']' returned non-zero exit status 1.

filaPro commented 1 year ago

A 3090 is not expected to work with CUDA 10. For the 1080 Ti, just increase the shared memory of your Docker container, e.g. to 16 GB.
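
To see the mismatch quickly (standard PyTorch calls): a 3090 has compute capability 8.6 (sm_86), which the CUDA 10.1 builds used in this Dockerfile cannot target, while a 1080 Ti (6.1) is covered.

```python
import torch

# Print the CUDA version PyTorch was built with and the GPU's compute capability.
# cu101 wheels stop at older architectures, so sm_86 (RTX 3090) is unsupported,
# which is what produces the "no kernel image is available" error above.
print('torch CUDA build:', torch.version.cuda)                     # e.g. '10.1'
print('device capability:', torch.cuda.get_device_capability(0))   # (8, 6) on a 3090
```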

gyhandy commented 1 year ago

Thanks for your reply! After increasing the shared memory on the 1080 Ti, there is a new error:

Traceback (most recent call last):
  File "tools/train.py", line 166, in <module>
    main()
  File "tools/train.py", line 162, in main
    meta=meta)
  File "/opt/conda/lib/python3.7/site-packages/mmdet/apis/train.py", line 170, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 125, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
    **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/parallel/distributed.py", line 46, in train_step
    output = self.module.train_step(*inputs[0], **kwargs[0])
  File "/opt/conda/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 247, in train_step
    losses = self(**data)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 84, in new_func
    return old_func(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 181, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/mmdetection3d/mmdet3d/models/detectors/imvoxelnet.py", line 84, in forward_train
    losses = self.bbox_head.forward_train(x, valids.float(), img_metas, gt_bboxes_3d, gt_labels_3d)
  File "/mmdetection3d/mmdet3d/models/dense_heads/imvoxel_head_v2.py", line 62, in forward_train
    losses = self.loss(*loss_inputs)
  File "/mmdetection3d/mmdet3d/models/dense_heads/imvoxel_head_v2.py", line 104, in loss
    gt_labels=gt_labels[i]
  File "/mmdetection3d/mmdet3d/models/dense_heads/imvoxel_head_v2.py", line 180, in _loss_single
    avg_factor=n_pos
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/mmdet/models/losses/focal_loss.py", line 177, in forward
    avg_factor=avg_factor)
  File "/opt/conda/lib/python3.7/site-packages/mmdet/models/losses/focal_loss.py", line 86, in sigmoid_focal_loss
    'none')
  File "/opt/conda/lib/python3.7/site-packages/mmcv/ops/focal_loss.py", line 55, in forward
    input, target, weight, output, gamma=ctx.gamma, alpha=ctx.alpha)
RuntimeError: SigmoidFocalLoss is not compiled with GPU support
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py", line 261, in <module>
    main()
  File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py", line 257, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/opt/conda/bin/python3', '-u', 'tools/train.py', '--local_rank=0', 'configs/imvoxelnet/imvoxelnet_sunrgbd_fast.py', '--launcher', 'pytorch']' returned non-zero exit status 1.

How can I solve this error: RuntimeError: SigmoidFocalLoss is not compiled with GPU support?

Should I reinstall mmcv?
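
(As a sanity check, the standard mmcv-full verification snippet from its installation docs should tell whether the GPU ops were compiled in; if the import fails or no CUDA version is reported, mmcv-full needs to be reinstalled against the container's CUDA/PyTorch versions:)

```python
# If mmcv-full was built without CUDA, GPU ops such as SigmoidFocalLoss are missing.
from mmcv.ops import get_compiling_cuda_version, get_compiler_version

print('mmcv compiled with CUDA:', get_compiling_cuda_version())
print('compiler:', get_compiler_version())
```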

Thanks!

gyhandy commented 1 year ago

If I convert the ScanNet data into the SUN RGB-D format to conduct monocular object detection, I find that the camera position in SUN RGB-D is always [0, 0, 0], i.e. the origin of the point cloud, while the ScanNet camera position may not be [0, 0, 0]. Should I transform the ScanNet point cloud so that the camera is always at [0, 0, 0]?

Theoretically, what is the format of the ImVoxelNet prediction? Given an input image, will it predict positions in camera coordinates and then transform the predicted 3D bboxes back to world coordinates using the extrinsics, so the loss can be computed against the ground truth? Thanks!
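
(For context, the transform in question is just the usual change of frame with the 4x4 camera pose; a generic sketch, not this repo's actual prediction code:)

```python
import numpy as np

# Generic sketch: moving a predicted 3D box center between the camera frame and
# the world frame with a 4x4 camera-to-world pose matrix.
def camera_to_world(p_cam, pose_cam2world):
    return (pose_cam2world @ np.append(p_cam, 1.0))[:3]

def world_to_camera(p_world, pose_cam2world):
    return (np.linalg.inv(pose_cam2world) @ np.append(p_world, 1.0))[:3]
```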

filaPro commented 1 year ago

Can you please follow #55 for camera position info on SUN RGB-D?