Closed Champonn closed 1 year ago
Hi, it has only been tested on Linux. But on Windows, using Docker might still be feasible to reproduce the environment.
BTW, the DATA links cannot be accessed now~
Thanks for bringing it up! The server is down and should be back within the next 2-3 days.
Thanks for your reply! After pulling the images and updating the paths, I get an error at the step "Run bash docker/run_container.sh". I couldn't find a solution. Could you please help me figure it out?
docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: mount error: file creation failed: /var/lib/docker/overlay2/50a36ab3939a7d79c445b226409678d9c7451d1975ea6c9a6624e5eb9a346fbc/merged/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: file exists: unknown.
ERRO[0000] error waiting for container: context canceled
Are you running on a Windows machine? And do you have an NVIDIA GPU and its driver installed?
@Champonn the data should be back. Let me know if you still cannot access it.
Thanks!
Sure, CUDA, the NVIDIA GPU, and its driver are all installed. I now run it in WSL (Windows Subsystem for Linux), but I get the same error...
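For anyone checking the same thing, here is a minimal sketch of a host-side check that the NVIDIA driver tools are visible. It only assumes that `nvidia-smi` ships with the driver; the function name `gpu_driver_visible` is made up for illustration, and it says nothing about whether the NVIDIA container runtime itself is set up correctly.

```python
import shutil
import subprocess


def gpu_driver_visible(timeout_s: float = 10.0) -> bool:
    """Return True if `nvidia-smi` is on PATH and exits successfully."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # Driver tools are not installed, or not on PATH in this shell.
        return False
    try:
        result = subprocess.run(
            [exe, "-L"],  # -L lists the GPUs the driver can see
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("NVIDIA driver visible:", gpu_driver_visible())
```

If this prints False on the host (or inside WSL), the container is unlikely to see the GPU either.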
From the picture it looks like you are already in the docker container. What if you ignore the error and continue?
Hi Bowen, I moved to Ubuntu 18.04 instead of WSL, but this error came up after running the command python run_server.py
(base) root@siat-Precision-5820-Tower-X-Series:/home/siat/BundleTrack# cd lf-net-release && python run_server.py
/opt/conda/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6
  return f(*args, **kwds)
Loading from /home/siat/BundleTrack/lf-net-release/release/models/indoor/config.pkl
---------------------- OPTIONS ----------------------
activ_fn leaky_relu
aug_max_degree 180
aug_max_scale 1.4142135623730951
batch_size 6
clear_logs False
com_strength 3.0
conv_ksize 5
crop_radius 16
data_raw_size 362
data_size 256
dataset scannet
depth_thresh 1.0
desc_activ_fn relu
desc_conv_ksize 3
desc_dim 256
desc_inputs photos
desc_leaky_alpha 0.2
desc_loss triplet
desc_margin 1.0
desc_net_channel 64
desc_net_depth 3
desc_norm l2norm
desc_perform_bn True
desc_train_delay 0
descriptor simple_desc
det_loss l2loss
detector mso_resnet_detector
do_softmax_kp_refine True
hard_geom_thresh False
hm_ksize 15
hm_sigma 0.5
hpatches_dir /scratch/trulls/yuki.ono/datasets/hpatches
init_num_mine 64
input_inst_norm True
kp_com_strength 1.0
kp_loc_size 9
leaky_alpha 0.2
log_dir /scratch/trulls/yuki.ono/results/deep_det/desc/180705-scannet-ori-sv-3d/adam-lr-1e-3-False/mso_resnet_detector/scannet-15/aug-rTrue-180-sTrue/ori-True-5-scl-5-0.7-1.4/desc-photos-D256-topk-512/mine-rand_hard_sch-64-5-0.9/innorm-True/wdet-0.01/ori-l2loss-w-0.1/scl-0.1/nms3d-True/try-1
lr 0.001
lr_decay False
match_reproj_thresh 5
max_itr 50000
max_seq_length 2000
min_num_pickup 5
mining_type rand_hard_sch
net_block 3
net_channel 16
net_max_scale 1.4142135623730951
net_min_scale 0.7071067811865475
net_num_scales 5
nms_ksize 5
nms_thresh 0.0
num_threads 16
optim_method adam
ori_ksize 5
ori_loss l2loss
ori_weight 0.1
patch_size 32
perform_bn True
pickup_delay 0.9
pretrain_dir
random_offset False
rot_aug True
scale_aug True
scale_com_strength 100.0
scale_weight 0.1
scannet_dir /scratch/trulls/yuki.ono/datasets/scannet/dataset
scenenet_dir /scratch/trulls/yuki.ono/datasets/scenenet
score_com_strength 100.0
sfm_dpt_dir /scratch/trulls/yuki.ono/datasets/colmap/dataset2/vis-0.4
sfm_img_dir /scratch/trulls/yuki.ono/datasets/colmap/colmap
sfm_mode nips
sfm_seq sacre_coeur
sfm_train_seq train.txt
sfm_valid_seq valid.txt
show_histogram False
sm_ksize 15
soft_kpts True
soft_scale True
top_k 512
train_main_seq 0
train_num_traj 100
train_ori True
train_pair_offset 15
train_same_time True
use_nms3d True
valid_num_traj 10
valid_pair_offset 15
webcam_dir /scratch/trulls/yuki.ono/datasets/WebcamRelease
weight_det_loss 0.01
Traceback (most recent call last):
  File "run_server.py", line 113, in <module>
    ops = build_networks(config, photo_ph, is_training)
  File "run_server.py", line 30, in build_networks
    DET = importlib.import_module(config.detector)
  File "/opt/conda/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'mso_resnet_detector'
Could you help me find out what's going on?
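The traceback shows run_server.py passing config.detector (here mso_resnet_detector) to importlib.import_module, so Python has to find a module of that name on sys.path. Below is a minimal diagnostic sketch, to be run from the same directory as run_server.py. The helper name `locate_module` is hypothetical, and where the detector file actually lives in lf-net-release is not stated in this thread.

```python
import importlib.util
import os
import sys


def locate_module(name: str):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # The script's directory is the first place Python searches.
    print("cwd:", os.getcwd())
    for entry in sys.path:
        print("search path:", entry or "(current directory)")
    print("mso_resnet_detector ->", locate_module("mso_resnet_detector"))
```

If the last line prints None, the module file is not in any searched directory, which is consistent with the ModuleNotFoundError above.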
I just checked and didn't run into this problem. Can you confirm that 1) you have done the [Download weights of feature detection network] step; and 2) you are running this inside lfnet's docker container (not bundletrack's), started by bash lf-net-release/docker/run_container.sh?
There is a similar issue, and they somehow solved it: https://github.com/wenbowen123/BundleTrack/issues/10
Yeah, I downloaded the file. But in my folder there is lf-net-release/models, not lf-net-release/release/models (not sure if this is the problem). So I created release and moved the models folder under it. Then I extracted the indoor file and put it under models, so it's BundleTrack/lf-net-release/release/models/indoor.
And yeah, I'm running this inside lf-net's docker container (in a new terminal, running the lf-net run_container.sh as you said).
Did you run python run_server.py under the directory of lf-net-release (not bundletrack)?
Nope…
(base) root@siat-Precision-5820-Tower-X-Series:/home/siat/BundleTrack# cd lf-net-release && python run_server.py
Could it be related to the GPU driver and CUDA version, like in #22? I'm using a 3090 with CUDA 11.4.
I uninstalled CUDA 11.4 and switched to CUDA 11.2 with driver 470, but the same problem still occurs.
I also ran into this issue. I guess the directory layout causes this error. Change line 74 in run_server.py to model_arg.add_argument('--model', type=str, default=f'{code_dir}/models/indoor/', help='model file or directory') (that is, remove the release part of the path), and copy the indoor folder to lf-net-release/models/. Then it works.
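To see which of the two layouts discussed here is actually present on disk, a small check like the following may help. It is only a sketch: `find_model_dir` is a hypothetical helper, and the only file it looks for is config.pkl, which the "Loading from" line in the log shows being read from the indoor folder.

```python
from pathlib import Path


def find_model_dir(code_dir, candidates=("release/models/indoor", "models/indoor")):
    """Return the first candidate folder under code_dir that holds config.pkl, else None."""
    root = Path(code_dir)
    for rel in candidates:
        model_dir = root / rel
        if (model_dir / "config.pkl").is_file():
            return model_dir
    return None


if __name__ == "__main__":
    # Run from inside lf-net-release; "." is then the code directory.
    found = find_model_dir(".")
    print("model directory:", found if found is not None else "not found")
```

Whichever folder it reports is the one the --model default in run_server.py should point to.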
Oh thanks for your advice. Will try it.
Thanks, it works! But now I get Segmentation fault (core dumped). I checked issue #6. Should I have a gts folder? Here is my NOCS folder:
@zhuhu00 Thanks for bringing it up. I'll update the readme.
@Champonn I'd suggest you download and see if the problem goes away.
@wenbowen123
@wenbowen123 Hi, as presented in the readme, the file layout in NOCS is:
NOCS
├── NOCS-REAL275-additional
├── real_test
└── obj_models
But the result shows that there is NOCS/gts/real_test_text/scene_1/model_can_arizona_tea_norm/. So I just want to know if I got the wrong files...
Hi, did you download the gt files "download the converted text pose files from here," mentioned in https://github.com/wenbowen123/BundleTrack#run-predictions-on-nocs
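A quick way to confirm the NOCS folder is complete before running is sketched below. The entries are taken from the layout and the gts/real_test_text path quoted in this thread; `missing_nocs_entries` is a hypothetical helper, and the README may require more than these four folders.

```python
from pathlib import Path

# Folders mentioned in this thread; the real requirements may be longer.
EXPECTED_NOCS_ENTRIES = (
    "NOCS-REAL275-additional",
    "real_test",
    "obj_models",
    "gts/real_test_text",
)


def missing_nocs_entries(nocs_root, expected=EXPECTED_NOCS_ENTRIES):
    """Return the expected sub-folders that do not exist under nocs_root."""
    root = Path(nocs_root)
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    # Replace "NOCS" with the path to your own NOCS folder.
    print("missing:", missing_nocs_entries("NOCS"))
```

An empty list means all four folders are in place.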
Thanks, I downloaded it and it runs again, but I still can't get a result. With reference to #15, I added the line _fm->vizKeyPoints(frame) at line 118 of BundleTrack/src/Bundler.cpp (https://github.com/wenbowen123/BundleTrack/issues/15#issuecomment-1007207312), and something goes wrong, like this:
OpenCV Error: Gpu API call (no kernel image is available for execution on the device) in call, file /opencv/modules/cudev/include/opencv2/cudev/grid/detail/transform.hpp, line 270
terminate called after throwing an instance of 'cv::Exception'
what(): /opencv/modules/cudev/include/opencv2/cudev/grid/detail/transform.hpp:270: error: (-217) no kernel image is available for execution on the device in function call
Aborted (core dumped)
Could you tell me what's wrong here? (The .mtl file should not affect running, right?)
A duplicate problem and solution. https://github.com/wenbowen123/BundleTrack/issues/22#issuecomment-1050114926 Are you using a 3090 or similar GPU?
Yes, I'm using a 3090.
So maybe I should change the CUDA version? It's currently CUDA 11.2 with driver 470.
I just updated the codebase. Can you git pull, then rebuild, and see if it works?
No... when I run "rm -rf build && mkdir build && cd build && cmake .. && make", it shows errors:
The NVIDIA version inside the current docker only supports GPUs older than the 3090. So I'd suggest you try on a 2080 or 1080, etc.
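As background on the "no kernel image is available" error: CUDA binaries are built for specific compute capabilities, and a GPU can only run code that was either compiled for its own major architecture (at the same or a lower minor version) or shipped as PTX that the driver can JIT-compile forward. The sketch below models that rule. The compute capabilities for the three cards are NVIDIA's published values, but the build targets are purely hypothetical, since the thread does not say which architectures the docker image's OpenCV was compiled for.

```python
def kernel_image_available(device_cc, sass_targets, ptx_targets):
    """Model CUDA's compatibility rule; all values are (major, minor) tuples."""
    # A binary built for X.y runs on devices X.z with z >= y (same major only).
    for major, minor in sass_targets:
        if device_cc[0] == major and device_cc[1] >= minor:
            return True
    # PTX built for X.y can be JIT-compiled on any device with capability >= X.y.
    for target in ptx_targets:
        if device_cc >= target:
            return True
    return False


# Published compute capabilities of the GPUs named in this thread.
GTX_1080 = (6, 1)
RTX_2080 = (7, 5)
RTX_3090 = (8, 6)

# HYPOTHETICAL build targets, chosen only to illustrate the failure mode.
HYPOTHETICAL_SASS = [(6, 1), (7, 5)]
HYPOTHETICAL_PTX = []

if __name__ == "__main__":
    for name, cc in [("1080", GTX_1080), ("2080", RTX_2080), ("3090", RTX_3090)]:
        ok = kernel_image_available(cc, HYPOTHETICAL_SASS, HYPOTHETICAL_PTX)
        print(name, "can run the build:", ok)
```

Under these assumed targets the 1080 and 2080 pass and the 3090 fails, which would match the behavior reported here. Changing only the host CUDA version would not help in that case, because the libraries inside the image would need to be rebuilt for the newer architecture.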
Hello! I ran into the same problem. Did you solve it?
OK, I will try it on another computer. Thanks very much.
Hello! The .mtl file doesn't affect the result; see https://github.com/wenbowen123/BundleTrack/issues/15. But if you hit the OpenCV Error: Gpu API call error, sorry, I haven't solved it either.
What changes would I need to make to update the docker nvidia version such that it would support the 3080 @wenbowen123?
@raghavauppuluri13 @Champonn @igodrr Has anybody figured out which files need to change to support newer GPUs? Thank you!
We have a different docker version that supports newer GPUs, such as the 3090: https://hub.docker.com/layers/wenbowen123/bundletrack/3090/images/sha256-55b844caf88344bce28ad21638977aeb5c621bd5d3b7479cc082cc625ebd389a?context=repo