Hello, I have a few problems reproducing your results, which are probably due to my inexperience with pose estimation:
i) Your code seems to work and I can reproduce decent results for the demo folder. However, when I plug in some YCBV images instead and adapt the intrinsics to the UW camera values [1066.778, 0.0, 312.9869, 0.0, 1067.487, 241.3109, 0.0, 0.0, 1.0], the resulting images show poses that are significantly off. Should it work to just plug different images into the demo script, and if so, do you have an idea why it might fail here and what I can do to fix it?
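For reference, this is exactly how I build the camera matrix from those values (plain NumPy; the variable names are my own, not from your code):

```python
import numpy as np

# UW camera intrinsics for YCB-Video, reshaped from the flat list above.
K = np.array([1066.778, 0.0, 312.9869,
              0.0, 1067.487, 241.3109,
              0.0, 0.0, 1.0]).reshape(3, 3)

# Sanity check: a point 1 m straight ahead should project to the
# principal point, roughly (312.99, 241.31).
u, v, w = K @ np.array([0.0, 0.0, 1.0])
print(u / w, v / w)
```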
Here is an example of the clearly incorrect poses (screenshot attached).
ii) There seems to be a second issue with the refinement, since it makes the objects almost vanish. Are the YCBV depth maps misinterpreted, is this due to the division by 1000 when loading the depths, or is it something else?
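If I read the dataset layout correctly, YCB-Video depth is not plain millimeters: the per-frame *-meta.mat stores a factor_depth of 10000 for these sequences, so dividing by 1000 would put every depth ten times too far away, which would match objects shrinking away during refinement. This is how I load the depth now (a minimal sketch with placeholder paths, not your repo's loader):

```python
import numpy as np
import scipy.io
import imageio.v2 as imageio

# YCB-Video depth PNGs are 16-bit; the per-frame *-meta.mat stores the
# scale as factor_depth, so meters = raw / factor_depth.
# Paths below are hypothetical placeholders.
meta = scipy.io.loadmat('data/YCB_Video/0000/000001-meta.mat')
factor_depth = meta['factor_depth'].item()       # expecting 10000.0
raw = imageio.imread('data/YCB_Video/0000/000001-depth.png')
depth_m = raw.astype(np.float32) / factor_depth  # not raw / 1000
```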
iii) Since both of the above fail on anything other than the demo data, I concluded that I would have to train the model myself. Is that really necessary, and if so, how can I make sure I don't run out of memory? I have two 1080 Ti cards but still ran into "CUDA out of memory". I run everything inside NVIDIA Docker with --gpus all --ipc=host.
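To rule out another process holding GPU memory before training starts, I ran this quick check (plain PyTorch; just my own diagnostic, not repo code):

```python
import torch

# Print free vs. total memory for every visible GPU before training.
for i in range(torch.cuda.device_count()):
    free, total = torch.cuda.mem_get_info(i)
    print(f'GPU {i}: {free / 2**30:.2f} GiB free of {total / 2**30:.2f} GiB')
```

Here is the full stack trace: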
iv) I also noticed that after slightly over 1000 steps, the YCBV evaluation breaks for me (I skipped training and relied on your checkpoints). It says:
```
[1083/40000], batch time 9.58
./experiments/scripts/ycb_object_test.sh: line 13: 562 Killed ./tools/test_net.py --gpu $1 --network posecnn --pretrained output/ycb_object/ycb_object_train/vgg16_ycb_object_epoch_$2.checkpoint.pth --dataset ycb_object_test --cfg experiments/cfgs/ycb_object.yml
```
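Since there is no Python traceback, my guess is that the bare "Killed" is the kernel's OOM killer sending SIGKILL, i.e. host RAM rather than GPU memory running out. I logged memory alongside the evaluation to check (assumes psutil is installed; again my own diagnostic, not repo code):

```python
import time
import psutil

# Log host RAM every 5 s while test_net.py runs; if usage climbs toward
# 100% right before the "Killed", the OOM killer is the likely culprit.
while True:
    vm = psutil.virtual_memory()
    print(f'{time.strftime("%H:%M:%S")}  used {vm.used / 2**30:.1f} GiB '
          f'of {vm.total / 2**30:.1f} GiB ({vm.percent:.0f}%)', flush=True)
    time.sleep(5)
```

Does that sound plausible, or is something else known to break the evaluation around that point?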
I would greatly appreciate any help - ultimately, any solution that lets me properly reproduce your results on arbitrary RGB-D image pairs with known intrinsics, or at least on YCBV, would solve my main issue. Thank you very much in advance.