Hi, we don't have custom-data inference, but I think it can be done fairly easily by modifying `extract_predicts.py`. In particular, the inference is done with this line. This traces back to the `forward` function here. It takes `inputs` (which contains the input image) and `meta_info` (which contains the intrinsics for the weak-perspective camera and the object name for each image). So if you prepare these two objects, the code should run fine. For dataset classes without groundtruth, I think this can serve as an example.
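Roughly, preparing those two objects could look like the sketch below. The key names (`"img"`, `"intrinsics"`, `"query_names"`), tensor shapes, and the commented-out call are assumptions for illustration; please double-check them against the dataset classes and `extract_predicts.py`:

```python
import torch
from PIL import Image
from torchvision import transforms

# Hypothetical sketch: build `inputs` and `meta_info` for one custom image.
# Key names and shapes are assumptions; mirror whatever the dataset class returns.
img = Image.open("my_image.jpg").convert("RGB")
to_tensor = transforms.Compose([
    transforms.Resize((224, 224)),  # match the crop size used during training
    transforms.ToTensor(),
])
inputs = {"img": to_tensor(img)[None]}  # batch of one, shape (1, 3, H, W)

meta_info = {
    # intrinsics for the weak-perspective camera, (1, 3, 3); values are placeholders
    "intrinsics": torch.tensor([[[1000.0, 0.0, 112.0],
                                 [0.0, 1000.0, 112.0],
                                 [0.0, 0.0, 1.0]]]),
    # object name for each image in the batch
    "query_names": ["box"],
}

# model = ...                      # the ArcticNet-SF/LSTM wrapper loaded from a checkpoint
# out = model(inputs, meta_info)   # same call that extract_predicts.py ends up making
```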
Thanks so much! If I make a simple inference script I'll make a PR in case you guys want to use it :).
Almost done! I basically just have a short script that loads a checkpoint, loads a mini version of the dataset, runs on a few images from the dataset, and then generates all the `.pt` files produced by running `wrapper.inference`.
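In rough pseudocode it's something like the sketch below; the wrapper construction, the `wrapper.inference` call signature, and the output layout are placeholders until the PR is up:

```python
import torch

# Hypothetical sketch of the short inference script described above.
# load_wrapper() is a placeholder; the real script builds the ArcticNet-SF/LSTM
# wrapper the same way extract_predicts.py does.
def load_wrapper(ckpt_path):
    raise NotImplementedError("build the wrapper and load ckpt_path here")

def run_inference(ckpt_path, samples, out_dir="preds"):
    """samples: an iterable of (inputs, meta_info) pairs, e.g. a mini dataset."""
    wrapper = load_wrapper(ckpt_path)
    wrapper.eval()
    with torch.no_grad():
        for idx, (inputs, meta_info) in enumerate(samples):
            out = wrapper.inference(inputs, meta_info)          # call signature assumed
            torch.save(out, f"{out_dir}/pred_{idx:06d}.pt")     # dump predictions to .pt
```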
There are a bunch of utils in `common`, and I bet some of them are more useful than others... if I want to turn my `.pt` files into pretty graphics like the ones on comet.ml, is there a best choice?
I'm currently using `KEYS` from `submit_pose`, since that seems to be the best fit for what can be run as "inference"; let me know if a different set of `KEYS` seems better.
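For context, all I do with `KEYS` is filter the output dict down to prediction-only entries before saving; a rough sketch (the key names below are made up, the real list comes from the `submit_pose` extraction):

```python
# Hypothetical sketch: keep only the prediction keys that make sense without
# groundtruth. The actual KEYS list is defined by the submit_pose extraction.
KEYS = [
    "pred.mano.pose.r",   # placeholder names
    "pred.mano.pose.l",
    "pred.object.rot",
]

def filter_output(out, keys=KEYS):
    # drop everything not in the chosen key set (and skip missing keys)
    return {k: out[k] for k in keys if k in out}
```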
Separately, do you think the model would "just work" on new 3D object models? I was thinking of trying to get it running on DexYCB, which also includes 3D models for objects! I'd expect that, with only 11 objects seen during training, the "object decoder" (even though it uses image-only CNN features) would just sort of learn object-class-specific articulated pose.
Thanks. I think a better way to visualize is to use our viewer; see Visualization. There is an extraction mode `vis_pose`, which dumps the predictions to disk, and the viewer can then visualize them with the `pred_mesh` flag to show the predictions in AITViewer. I think one can just extend the `vis_pose` extraction to a `vis_pose_pred` extraction and visualize with `pred_mesh`. The viewer supports offline rendering to render every frame in the prediction as an image, and it also looks better than the graphics on Comet. The visualization on Comet is just to let users check whether training looks reasonable.
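As a rough illustration of the viewer side only (the `.pt` key names below are made up; the actual dump format is defined by the extraction code, and our viewer already wraps this with more features such as offline rendering):

```python
import torch
from aitviewer.renderables.meshes import Meshes
from aitviewer.viewer import Viewer

# Hypothetical sketch: load one dumped prediction and show the predicted mesh
# in AITViewer. "verts"/"faces" are placeholder keys for whatever the
# vis_pose-style extraction writes to disk.
pred = torch.load("preds/pred_000000.pt")
verts = pred["verts"].cpu().numpy()   # (num_frames, num_verts, 3)
faces = pred["faces"].cpu().numpy()   # (num_faces, 3)

viewer = Viewer()
viewer.scene.add(Meshes(verts, faces, name="pred_mesh"))
viewer.run()
```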
I think if you pre-train on ARCTIC and test on DexYCB directly, it won't work, because there is no canonical way to define the 6D pose for an object (e.g., consider a coffee mug, which has a rotation along the height axis). However, it would be interesting to see whether pretraining on ARCTIC and finetuning on DexYCB improves YCB object estimation. We experimented with this on HO3D, and it converges faster after pretraining.
Closing stale issue. Please reopen for further discussion.
Can you provide a simple outline for how you would run inference using ArcticNet-SF/LSTM on custom data? I didn't see it in the DOCS but I might have missed it!
I'm assuming I might want to modify `extract_predicts.py`.