agenthong opened this issue 3 years ago
Hey, @agenthong! Yes, this tensor consists of the 3D locations of the joints. You can project it back to the image using the calibration matrix of your camera.
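A minimal sketch of that projection, assuming you have a full 3x4 projection matrix P = K [R | t] for your camera (the function name below is illustrative, not from the repo):

import numpy as np

def project_to_image(keypoints_3d, proj_matrix):
    # keypoints_3d: (N, 3) joints; proj_matrix: (3, 4) matrix P = K [R | t]
    n = keypoints_3d.shape[0]
    points_hom = np.concatenate([keypoints_3d, np.ones((n, 1))], axis=1)  # (N, 4) homogeneous
    projected = points_hom @ proj_matrix.T                                # (N, 3)
    return projected[:, :2] / projected[:, 2:]                            # divide by depth -> (N, 2) pixel coords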
Thanks for replying. Do you have this step in the code? Maybe you can point out where it is.
Hi @agenthong @karfly, I also use my own data to predict 3D poses like this:
My question is how to visualize the result like this:
I generate the result as follows:
Does the predicted result need further post-processing?
You can find an example here.
Yeah, but I want to get the 3D joints; that example projects the tensor onto 2D images.
@agenthong To convert 3D points to your camera coordinate system, you need to apply rotation (R) and translation (t) to these 3D points.
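A minimal sketch of that transform, assuming R is the 3x3 rotation and t the 3-vector of your camera extrinsics, i.e. X_cam = R * X_world + t (the function name is illustrative):

import numpy as np

def world_to_camera(keypoints_3d, R, t):
    # keypoints_3d: (N, 3) world-space joints; returns (N, 3) joints in camera coordinates
    return keypoints_3d @ R.T + t.reshape(1, 3)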
@chaisheng-dawnlight Try to visualize these 3D points without any processing with plt scatter 3D function (https://www.geeksforgeeks.org/3d-scatter-plotting-in-python-using-matplotlib/)
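For reference, a minimal version of that scatter plot, assuming keypoints_3d is a (17, 3) NumPy array for a single pose:

import matplotlib.pyplot as plt

def scatter_pose_3d(keypoints_3d):
    # plot the raw (N, 3) joints without any extra processing
    fig = plt.figure()
    ax = fig.add_subplot(projection='3d')
    ax.scatter(keypoints_3d[:, 0], keypoints_3d[:, 1], keypoints_3d[:, 2])
    ax.set_xlabel('x'); ax.set_ylabel('y'); ax.set_zlabel('z')
    plt.show()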
Thanks a lot! So it means that this tensor contains the 3D joints in the world coordinate system?
@karfly Hi, thanks for your reply. I used the scatter function to draw the 3D pose, but it still doesn't work. This is my visualization code:
@agenthong Actually, they are in the coordinate system of the 1st camera, which usually coincides with the world coordinate system.
@chaisheng-dawnlight What plot do you get from this code?
I have GT 3D keypoints like this: It's quite different from my result: To sum up, what is the format of the output? And how can I compare my prediction with the GT in the same coordinate system?
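If the GT is in world coordinates and the prediction is in the first camera's coordinates (as noted above), one way to bring them into the same frame before comparing is to invert that camera's extrinsics. A sketch, with illustrative names, assuming R and t map world coordinates into that camera's frame:

import numpy as np

def camera_to_world(keypoints_cam, R, t):
    # invert X_cam = R @ X_world + t; for an orthonormal R, R^-1 == R.T
    return (keypoints_cam - t.reshape(1, 3)) @ R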
Hi, I'm also experiencing a similar problem. I followed @karfly's suggestion to visualize the 3D points with the plt 3D scatter function.
The ground-truth points look like this (I used draw_3d_pose
from this repository):
And the pretrained model prediction looks like this:
I've plotted the 3D points as you said, and they look like this:
I currently have no idea what error is causing this result.
I used 4 views of a single pose, with the corresponding camera parameters. Here's the code showing how I got the predicted 3D points.
# MODEL LOAD
config = load_config('./model/pretrained/learnable_triangulation_volumetric/human36m_vol_softmax.yaml')
model = VolumetricTriangulationNet(config)
#OUTPUT#
#Loading pretrained weights from: ./model/pretrained/resnet/pose_resnet_4.5_pixels_human36m.pth
#Reiniting final layer filters: module.final_layer.weight
#Reiniting final layer biases: module.final_layer.bias
#Successfully loaded pretrained weights for backbone
# PROCESSING MODEL INPUT
annotations_path = ['./data/anno/16-1_001-C01_3D.json', './data/anno/16-1_001-C02_3D.json', './data/anno/16-1_001-C03_3D.json', './data/anno/16-1_001-C04_3D.json']
device = torch.device('cpu')
batch_keypoints_3d = []
cameras = []
for path in annotations_path:
    _, _, _, keypoints_3d, camera = process_annotation_json(path)
    batch_keypoints_3d.append(keypoints_3d)
    cameras.append(camera)

batch = {'cameras': cameras, 'pred_keypoints_3d': batch_keypoints_3d}
images_batch = process_images_batch(np.array(images_data))

proj_matricies_batch = torch.stack([
    torch.stack([torch.from_numpy(cam.projection) for cam in c])
    for c in cameras
])
proj_matricies_batch = proj_matricies_batch.float().to(device)
# FORWARD MODEL
keypoints_3d_pred, heatmaps_pred, volumes_pred, confidences_pred, cuboids_pred, coord_volumes_pred, base_points_pred = model(images_batch, proj_matricies_batch, batch)
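One thing worth double-checking when the batch is assembled by hand like this is the input shapes; a small sanity-check sketch (the expected shapes below are my assumption based on how the repo's Human3.6M dataloader batches views, not something confirmed in this thread):

# hypothetical shape check; adjust the expectations to your data
print(images_batch.shape)          # roughly (batch_size, n_views, 3, H, W)
print(proj_matricies_batch.shape)  # expected (batch_size, n_views, 3, 4)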
I found that I hadn't loaded the pretrained weights, so I added the code like this. Before:
# MODEL LOAD
config = load_config('./model/pretrained/learnable_triangulation_volumetric/human36m_vol_softmax.yaml')
model = VolumetricTriangulationNet(config)
After:
# MODEL LOAD
config = load_config('./model/pretrained/learnable_triangulation_volumetric/human36m_vol_softmax.yaml')
model = VolumetricTriangulationNet(config)
if config.model.init_weights:
    state_dict = torch.load(config.model.checkpoint)

    # strip the "module." prefix left by nn.DataParallel so the keys match
    for key in list(state_dict.keys()):
        new_key = key.replace("module.", "")
        state_dict[new_key] = state_dict.pop(key)

    model.load_state_dict(state_dict, strict=True)
    print("Successfully loaded pretrained weights for whole model")
I've used the code from train.py, and the result is still essentially the same as before.
What I suspect is that keypoints_3d_pred has a different coordinate scale from the Human3.6M GT data. I hope I can get some help on how I should process my 3D ground-truth points.
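One way to factor out the frame/origin mismatch when comparing against the Human3.6M GT is to root-center both poses first. A sketch (root-centering is my own suggestion rather than something from this thread; the pelvis index 6 assumes the 17-joint Human3.6M order used here, so treat it as an assumption and adjust for your skeleton; gt_keypoints_3d is a hypothetical name for your GT array):

import numpy as np

PELVIS_IDX = 6  # assumption: 17-joint Human3.6M order; change for your skeleton

def root_center(keypoints_3d, root_idx=PELVIS_IDX):
    # express joints relative to the root so origin differences drop out
    return keypoints_3d - keypoints_3d[root_idx:root_idx + 1]

# e.g. pred = root_center(keypoints_3d_pred[0].detach().cpu().numpy())
#      gt = root_center(gt_keypoints_3d)
#      print(np.linalg.norm(pred - gt, axis=1).mean())  # mean per-joint distance after centering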
Hi, @karfly, thanks for sharing this great repo. I've trained the model on the Human3.6M dataset. After that, I use 2D heatmaps of other images to unproject with my own calibration and feed them to the trained model. But I get a result like this: Is this the 3D pose? I think this result may be in a different coordinate system. If so, how can I get the corresponding poses in my images' coordinate system?