wenbowen123 / BundleTrack

[IROS 2021] BundleTrack: 6D Pose Tracking for Novel Objects without Instance or Category-Level 3D Models
Other
610 stars 66 forks

Can this project be used in Windows? #31

Closed Champonn closed 1 year ago

Champonn commented 2 years ago

BTW, the DATA links cannot be accessed now.

wenbowen123 commented 2 years ago

Hi, it has only been tested on Linux. But on Windows, using Docker might still be feasible to reproduce the environment.

wenbowen123 commented 2 years ago

Thanks for bringing it up! The server is down and should be back within the next 2-3 days.

Champonn commented 2 years ago

Thanks for your reply! After pulling the images and updating the paths, I get an error after running "bash docker/run_container.sh". I couldn't find a solution; could you please help me figure it out?

docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: mount error: file creation failed: /var/lib/docker/overlay2/50a36ab3939a7d79c445b226409678d9c7451d1975ea6c9a6624e5eb9a346fbc/merged/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: file exists: unknown.
ERRO[0000] error waiting for container: context canceled

wenbowen123 commented 2 years ago

Are you running on a Windows machine? And do you have an NVIDIA GPU and its driver installed?

wenbowen123 commented 2 years ago

@Champonn the data should be back. Let me know if you still cannot access.

Champonn commented 2 years ago

Thanks!

Champonn commented 2 years ago

Yes, CUDA, the NVIDIA GPU, and its driver are all installed. I am now running it in WSL (Windows Subsystem for Linux), but I get the same error (see the attached screenshot).

wenbowen123 commented 2 years ago

From the picture it looks like you are already in the docker container. What if you ignore the error and continue?

Champonn commented 2 years ago

Hi Bowen, I switched from WSL to Ubuntu 18.04, but I get this error after running python run_server.py:

(base) root@siat-Precision-5820-Tower-X-Series:/home/siat/BundleTrack# cd lf-net-release && python run_server.py
/opt/conda/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6
  return f(*args, **kwds)
Loading from /home/siat/BundleTrack/lf-net-release/release/models/indoor/config.pkl
---------------------- OPTIONS ----------------------

             activ_fn  leaky_relu
       aug_max_degree  180
        aug_max_scale  1.4142135623730951
           batch_size  6
           clear_logs  False
         com_strength  3.0
           conv_ksize  5
          crop_radius  16
        data_raw_size  362
            data_size  256
              dataset  scannet
         depth_thresh  1.0
        desc_activ_fn  relu
      desc_conv_ksize  3
             desc_dim  256
          desc_inputs  photos
     desc_leaky_alpha  0.2
            desc_loss  triplet
          desc_margin  1.0
     desc_net_channel  64
       desc_net_depth  3
            desc_norm  l2norm
      desc_perform_bn  True
     desc_train_delay  0
           descriptor  simple_desc
             det_loss  l2loss
             detector  mso_resnet_detector
 do_softmax_kp_refine  True
     hard_geom_thresh  False
             hm_ksize  15
             hm_sigma  0.5
         hpatches_dir  /scratch/trulls/yuki.ono/datasets/hpatches
        init_num_mine  64
      input_inst_norm  True
      kp_com_strength  1.0
          kp_loc_size  9
          leaky_alpha  0.2
              log_dir  /scratch/trulls/yuki.ono/results/deep_det/desc/180705-scannet-ori-sv-3d/adam-lr-1e-3-False/mso_resnet_detector/scannet-15/aug-rTrue-180-sTrue/ori-True-5-scl-5-0.7-1.4/desc-photos-D256-topk-512/mine-rand_hard_sch-64-5-0.9/innorm-True/wdet-0.01/ori-l2loss-w-0.1/scl-0.1/nms3d-True/try-1
                   lr  0.001
             lr_decay  False
  match_reproj_thresh  5
              max_itr  50000
       max_seq_length  2000
       min_num_pickup  5
          mining_type  rand_hard_sch
            net_block  3
          net_channel  16
        net_max_scale  1.4142135623730951
        net_min_scale  0.7071067811865475
       net_num_scales  5
            nms_ksize  5
           nms_thresh  0.0
          num_threads  16
         optim_method  adam
            ori_ksize  5
             ori_loss  l2loss
           ori_weight  0.1
           patch_size  32
           perform_bn  True
         pickup_delay  0.9
         pretrain_dir  
        random_offset  False
              rot_aug  True
            scale_aug  True
   scale_com_strength  100.0
         scale_weight  0.1
          scannet_dir  /scratch/trulls/yuki.ono/datasets/scannet/dataset
         scenenet_dir  /scratch/trulls/yuki.ono/datasets/scenenet
   score_com_strength  100.0
          sfm_dpt_dir  /scratch/trulls/yuki.ono/datasets/colmap/dataset2/vis-0.4
          sfm_img_dir  /scratch/trulls/yuki.ono/datasets/colmap/colmap
             sfm_mode  nips
              sfm_seq  sacre_coeur
        sfm_train_seq  train.txt
        sfm_valid_seq  valid.txt
       show_histogram  False
             sm_ksize  15
            soft_kpts  True
           soft_scale  True
                top_k  512
       train_main_seq  0
       train_num_traj  100
            train_ori  True
    train_pair_offset  15
      train_same_time  True
            use_nms3d  True
       valid_num_traj  10
    valid_pair_offset  15
           webcam_dir  /scratch/trulls/yuki.ono/datasets/WebcamRelease
      weight_det_loss  0.01

Traceback (most recent call last):
  File "run_server.py", line 113, in <module>
    ops = build_networks(config, photo_ph, is_training)
  File "run_server.py", line 30, in build_networks
    DET = importlib.import_module(config.detector)
  File "/opt/conda/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'mso_resnet_detector'

Could you help me find out what's going on?
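A ModuleNotFoundError raised from importlib.import_module(config.detector) means that, at the time of the call, no directory on sys.path contained a mso_resnet_detector module. One way to narrow this down is to find where the module file actually lives and compare that location with sys.path. The following is only a diagnostic sketch, not part of BundleTrack or LF-Net: the helper names find_module_file and importable_from, and the choice of scanning from the current directory, are assumptions made for illustration.

```python
import os
import sys


def find_module_file(root, module_name):
    """Return every path under `root` that holds `<module_name>.py`.

    Hypothetical debugging helper; it only inspects the filesystem.
    """
    target = module_name + ".py"
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if target in filenames:
            hits.append(os.path.join(dirpath, target))
    return sorted(hits)


def importable_from(sys_path, module_file):
    """True if the directory holding `module_file` is listed in `sys_path`."""
    module_dir = os.path.abspath(os.path.dirname(module_file))
    # An empty sys.path entry stands for the current working directory.
    return any(os.path.abspath(p or ".") == module_dir for p in sys_path)


if __name__ == "__main__":
    # Assumption: this is run from the lf-net-release directory.
    for path in find_module_file(".", "mso_resnet_detector"):
        print(path, "importable:", importable_from(sys.path, path))
```

If the file is found but reported as not importable, the folder that holds it has probably been moved away from where the script expects it.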

wenbowen123 commented 2 years ago

I just checked and didn't run into this problem. Can you confirm that 1) you have completed the [Download weights of feature detection network] step, and 2) you are running this inside lfnet's docker container (not bundletrack's), started with bash lf-net-release/docker/run_container.sh?

The same issue was reported before, and they somehow solved it: https://github.com/wenbowen123/BundleTrack/issues/10

Champonn commented 2 years ago

Yeah, I downloaded the file. But in my folder there is lf-net-release/models, not lf-net-release/release/models (not sure if this is the problem). So I created release and moved the models folder under it. Then I extracted the indoor file and put it under models, so the path is BundleTrack/lf-net-release/release/models/indoor.

And yeah, I'm running this inside lf-net's docker container (in a new terminal, running lf-net's run_container.sh as you said).

wenbowen123 commented 2 years ago

Did you run python run_server.py under the directory of lf-net-release? (not bundletrack)

Champonn commented 2 years ago

Nope… (base) root@siat-Precision-5820-Tower-X-Series:/home/siat/BundleTrack# cd lf-net-release && python run_server.py

Champonn commented 2 years ago

Could it be related to the GPU driver and CUDA version, as in #22? I'm using a 3090 with CUDA 11.4.

Champonn commented 2 years ago

I uninstalled CUDA 11.4 and switched to CUDA 11.2 with driver 470, but I still get the same problem (screenshots attached).

zhuhu00 commented 2 years ago

I also ran into this issue. I guess the directory layout causes the error. Change line 74 in run_server.py to model_arg.add_argument('--model', type=str, default=f'{code_dir}/models/indoor/', help='model file or directory') (that is, remove release from the path), and copy the indoor folder to lf-net-release/models/. Then it works.
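The change described above can be sketched in isolation as follows. This is a sketch under assumptions, not the actual run_server.py: only the add_argument call is taken from the comment, while the definition of code_dir (taken here as the directory containing the script), the "Model" group name, and the has_config helper are hypothetical. The config.pkl file name comes from the "Loading from" line in the log earlier in this thread.

```python
import argparse
import os


def build_parser(code_dir):
    """Build a parser whose --model default points at <code_dir>/models/indoor/."""
    parser = argparse.ArgumentParser()
    model_arg = parser.add_argument_group("Model")  # group name is an assumption
    model_arg.add_argument('--model', type=str,
                           default=f'{code_dir}/models/indoor/',
                           help='model file or directory')
    return parser


def has_config(model_dir):
    """Hypothetical sanity check: the model directory should hold config.pkl."""
    return os.path.isfile(os.path.join(model_dir, "config.pkl"))


if __name__ == "__main__":
    # Assumption: code_dir is the directory that contains this script.
    code_dir = os.path.dirname(os.path.realpath(__file__))
    args = build_parser(code_dir).parse_args()
    print(args.model, "config.pkl found:", has_config(args.model))
```

Checking for config.pkl before building the network gives a clearer failure than a later import or load error when the weights are in the wrong folder.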

Champonn commented 2 years ago

Oh thanks for your advice. Will try it.

Champonn commented 2 years ago

Thanks, it works! But now I get Segmentation fault (core dumped). I checked issue #6; should I have a gts folder? My NOCS folder is shown in the attached screenshots.

wenbowen123 commented 2 years ago

@zhuhu00 Thanks for bringing it up. I'll update the readme.

@Champonn I'd suggest you download and see if the problem goes away.

Champonn commented 2 years ago

@wenbowen123

Champonn commented 2 years ago

@wenbowen123 Hi, according to the readme, the NOCS folder layout is:

NOCS
├── NOCS-REAL275-additional
├── real_test
└── obj_models

But the result shows the path NOCS/gts/real_test_text/scene_1/model_can_arizona_tea_norm/ (see the attached screenshot). So I just want to know whether I downloaded the wrong files.

wenbowen123 commented 2 years ago

Hi, did you download the gt files ("download the converted text pose files from here") mentioned in https://github.com/wenbowen123/BundleTrack#run-predictions-on-nocs?
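A quick way to check the layout discussed above is to list which expected folders are missing. This is a minimal sketch: the three folder names come from the readme tree quoted in this thread and gts from the path in the reported result, while the helper name missing_nocs_dirs and the choice to treat all four as required are assumptions.

```python
import os
import sys

# Folder names taken from this thread; whether each one is strictly
# required by BundleTrack is an assumption.
EXPECTED_DIRS = ("NOCS-REAL275-additional", "real_test", "obj_models", "gts")


def missing_nocs_dirs(nocs_root, expected=EXPECTED_DIRS):
    """Return the expected subfolders that do not exist under `nocs_root`."""
    return [name for name in expected
            if not os.path.isdir(os.path.join(nocs_root, name))]


if __name__ == "__main__":
    # Assumption: the NOCS root is passed as the first argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "NOCS"
    missing = missing_nocs_dirs(root)
    print("missing:", missing if missing else "none")
```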

Champonn commented 2 years ago

Thanks, I downloaded it and it runs again, but I still can't get the result. Following #15 (https://github.com/wenbowen123/BundleTrack/issues/15#issuecomment-1007207312), I added the line _fm->vizKeyPoints(frame) at line 118 of BundleTrack/src/Bundler.cpp, and then something goes wrong, like this:

OpenCV Error: Gpu API call (no kernel image is available for execution on the device) in call, file /opencv/modules/cudev/include/opencv2/cudev/grid/detail/transform.hpp, line 270
terminate called after throwing an instance of 'cv::Exception'
what():  /opencv/modules/cudev/include/opencv2/cudev/grid/detail/transform.hpp:270: error: (-217) no kernel image is available for execution on the device in function call

Aborted (core dumped)

So could you tell me what's wrong here? (The .mtl file should not affect running, right?)

wenbowen123 commented 2 years ago

This is a duplicate problem; see the solution here: https://github.com/wenbowen123/BundleTrack/issues/22#issuecomment-1050114926 Are you using a 3090 or a similar GPU?

Champonn commented 2 years ago

Yes, I'm using a 3090.

Champonn commented 2 years ago

So should I maybe change the CUDA version? It is currently CUDA 11.2 with driver 470.

wenbowen123 commented 2 years ago

I just updated the codebase. Can you git pull, then rebuild, and see if it works?

Champonn commented 2 years ago

No. When I run "rm -rf build && mkdir build && cd build && cmake .. && make", it shows errors (see the attached screenshot).

wenbowen123 commented 2 years ago

The NVIDIA version inside the current docker image only supports GPUs older than the 3090, so I'd suggest trying a 2080, 1080, etc.
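The "no kernel image is available for execution on the device" error generally means the CUDA binaries were not built for the GPU's compute capability. The following is a rough sketch of the compatibility rule, assuming standard CUDA behavior: precompiled code for sm_XY runs on devices with the same major version and a minor version of at least Y, and embedded PTX for compute_XY can be JIT-compiled on any device of capability XY or newer. The build-target list is hypothetical, since the thread does not say which architectures the docker image was built for. The capability values for the GTX 1080 (6.1), RTX 2080 (7.5), and RTX 3090 (8.6) are the published ones.

```python
def can_run(device_cc, cubin_archs, ptx_archs=()):
    """Return True if a binary built for the given targets can run on the device.

    device_cc   -- (major, minor) compute capability of the GPU
    cubin_archs -- (major, minor) targets with precompiled machine code (sm_XY)
    ptx_archs   -- (major, minor) targets with embedded PTX (compute_XY)
    """
    major, minor = device_cc
    # Precompiled code is only compatible within the same major version,
    # on devices whose minor version is at least the build target's.
    if any(m == major and n <= minor for m, n in cubin_archs):
        return True
    # PTX can be JIT-compiled by the driver for any equal or newer device.
    return any((m, n) <= (major, minor) for m, n in ptx_archs)


if __name__ == "__main__":
    # Hypothetical build targets: the thread does not state the real list.
    built_for = [(6, 1), (7, 5)]
    gpus = [("GTX 1080", (6, 1)), ("RTX 2080", (7, 5)), ("RTX 3090", (8, 6))]
    for name, cc in gpus:
        print(name, "runnable:", can_run(cc, built_for))
```

Under this rule, changing the host's CUDA version does not help by itself; the libraries inside the image have to be rebuilt for the newer architecture, or an image built for it has to be used.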

igodrr commented 2 years ago

Hello! I ran into the same problem as you. Did you solve it?

Champonn commented 2 years ago

OK, I will try it on another computer. Thanks very much.

Champonn commented 2 years ago

Hello! According to https://github.com/wenbowen123/BundleTrack/issues/15, the .mtl file doesn't affect the result. But if you hit the "OpenCV Error: Gpu API call" error, sorry, I haven't solved it either.

raghavauppuluri13 commented 2 years ago

What changes would I need to make to update the docker nvidia version such that it would support the 3080 @wenbowen123?

bibekyess commented 2 years ago

@raghavauppuluri13 @Champonn @igodrr Has anybody figured out which files to change to support newer GPUs? Thank you!

wenbowen123 commented 1 year ago

We have a different docker image version that supports newer GPUs, such as the 3090: https://hub.docker.com/layers/wenbowen123/bundletrack/3090/images/sha256-55b844caf88344bce28ad21638977aeb5c621bd5d3b7479cc082cc625ebd389a?context=repo