Closed monajalal closed 9 months ago
Hi Mona,
Thanks for your question! Can you run three iterations of trainer.test (here) and see if the visualizations look good to you?
Hi Chen,
I didn't quite follow your instructions. Could you please elaborate a bit more and show the command I need to run? Thanks, Mona
Hi Mona,
Right before this line, add trainer.test(0).
Right after this line, add pdb.set_trace().
Then run:
LD_LIBRARY_PATH=lib/regressor:$LD_LIBRARY_PATH python src/train_core.py --save_dir /home/mona/HybridPose/saved_weights/linemod/ape --load_dir /home/mona/HybridPose/saved_weights/linemod/ape/checkpoints/0.001/199 --object_name ape
When you hit the breakpoint for the fourth time, use Ctrl+D to quit the program. Go to /home/mona/HybridPose/saved_weights/linemod/ape/image and inspect the visualizations.
I hope this helps! Let me know if you have further concerns.
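As an aside, stopping at the fourth breakpoint hit (which, per the instructions above, corresponds to three iterations) can also be approximated without a debugger by capping the loop. This is a minimal generic sketch, not HybridPose code: `run_limited` and `process` are hypothetical names, and the actual loop inside the trainer may be structured differently.

```python
from itertools import islice


def run_limited(batches, process, max_iters=3):
    """Run `process` on only the first `max_iters` batches.

    `batches` is any iterable (for example a data loader) and `process`
    is a callable taking (batch_index, batch). Both names are
    hypothetical stand-ins, not part of HybridPose.
    Returns the number of batches actually processed.
    """
    count = 0
    # islice stops the iteration early without consuming the whole loader.
    for i_batch, batch in enumerate(islice(batches, max_iters)):
        process(i_batch, batch)
        count += 1
    return count
```

Because `islice` stops after `max_iters` items, the program exits the loop on its own instead of needing Ctrl+D at the prompt.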
Thank you for your response. To clarify the part about hitting the breakpoint 4 times: the first time I run the command, it drops into the interactive pdb prompt. I then enter continue, but it keeps running for a long time. Is this intended, and do you expect to see something like this? I am still waiting to enter continue for a second time:
(hybridpose) mona@ada:~/HybridPose$ LD_LIBRARY_PATH=lib/regressor:$LD_LIBRARY_PATH python src/train_core.py --save_dir /home/mona/HybridPose/saved_weights/linemod/ape --load_dir /home/mona/HybridPose/saved_weights/linemod/ape/checkpoints/0.001/199 --object_name ape
number of model parameters: 12959563
Successfully loaded model from /home/mona/HybridPose/saved_weights/linemod/ape/checkpoints/0.001/199
Testing...
/home/mona/anaconda3/envs/hybridpose/lib/python3.10/site-packages/torch/nn/functional.py:1967: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
/home/mona/HybridPose/lib/ransac_voting_gpu_layer/ransac_voting_gpu.py:546: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead. (Triggered internally at ../aten/src/ATen/native/IndexingUtils.h:27.)
direct = vertex[bi].masked_select(torch.unsqueeze(torch.unsqueeze(cur_mask, 2), 3)) # [tn,vn,2]
Loss: 0.3184
> /home/mona/HybridPose/trainers/coretrainer.py(574)generate_data()
-> for i_batch, batch in enumerate(val_loader):
(Pdb)
An update: I never got to enter continue at the (Pdb) prompt 2 more times. After the first time I entered it, the program ran through to the end.
Do you know how to achieve what you suggested?
That said, for the currently saved ones, when I browsed to the folder that you mentioned, I have these. Do you think they are acceptable?
(hybridpose) mona@ada:~/HybridPose$ nautilus /home/mona/HybridPose/saved_weights/linemod/ape/image/0.001
Hi Mona,
The images you showed are from the downloaded weight archive. Ideally, the results from your own run should look very similar to these images.
The reason why the code takes so long to run is that the breakpoint you have is in generate_data() instead of test(). After running the code, the filenames of the newly generated visualizations should have the prefix 0_ because we are setting epoch to 0 when calling test(0).
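To check quickly whether the run produced new visualizations, one option is to list only the files whose names start with the epoch prefix. This is a minimal sketch using the standard library; `epoch_visualizations` is a hypothetical helper, and searching subfolders is an assumption based on the images sitting under image/0.001 in this thread.

```python
from pathlib import Path


def epoch_visualizations(image_dir, epoch=0):
    """Return sorted relative paths of files named '<epoch>_*' under image_dir.

    Hypothetical helper, not part of HybridPose. Subfolders are searched
    too, since the images in this thread sit under image/0.001.
    """
    root = Path(image_dir)
    prefix = f"{epoch}_"
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob(f"{prefix}*")
        if p.is_file()
    )
```

For example, `epoch_visualizations("/home/mona/HybridPose/saved_weights/linemod/ape/image")` returning an empty list would suggest that test(0) never wrote any output.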
Thanks a lot for the clarification. After entering continue 4 times, I do not see pts on the objects. As you can see, 0_2_pts.jpg has no points, while the ground truth 0_2_pts_gt.jpg has points.
Hi Mona,
Thanks for the follow-up! It looks to me like the keypoint voting procedure is causing the issue. To verify, you can take a look at the _vote_ images. My expectation is that the predicted votes are very similar to the ground-truth ones. If so, the problem is probably due to an unsuccessful compilation of the RANSAC voting layer.
I hope this helps! Let me know if you have further concerns.
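For a side-by-side inspection, it can help to pair each ground-truth image with its predicted counterpart first. The sketch below assumes the naming seen in this thread (0_2_pts.jpg next to 0_2_pts_gt.jpg) and assumes the vote images follow the same `_gt` convention, which the thread does not confirm; `pair_with_ground_truth` is a hypothetical helper.

```python
from pathlib import Path

GT_SUFFIX = "_gt.jpg"  # assumed convention, taken from 0_2_pts_gt.jpg


def pair_with_ground_truth(image_dir):
    """Map each '*_gt.jpg' file to its predicted counterpart.

    Keys and values are paths relative to image_dir. The value is None
    when the predicted image is missing. Hypothetical helper, not part
    of HybridPose.
    """
    root = Path(image_dir)
    pairs = {}
    for gt in sorted(root.rglob(f"*{GT_SUFFIX}")):
        # Strip the '_gt' marker to get the predicted file name.
        pred = gt.with_name(gt.name[: -len(GT_SUFFIX)] + ".jpg")
        key = gt.relative_to(root).as_posix()
        pairs[key] = pred.relative_to(root).as_posix() if pred.is_file() else None
    return pairs
```

Entries mapped to None point at predictions that were never written, which is a different symptom from a prediction image that exists but shows no points.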
When I use the md5sum-checked ape weights (199), I get these warnings (and NaNs). Do you also get these warnings, negative eigenvalues, and NaNs?
(hybridpose) mona@mona-ThinkStation-P7:~/HybridPose$ LD_LIBRARY_PATH=lib/regressor:$LD_LIBRARY_PATH python src/train_core.py --load_dir /home/mona/HybridPose/saved_weights/linemod/ape/checkpoints/0.001/199 --object_name ape
and finally:
Please note that after test_set_ape.npy is saved in the output folder, the evaluate script reports 0 for the ADD-S metric, which potentially shows that running train_core.py from trained_weights is not working as expected. Please let me know if you have a solution.