karfly / learnable-triangulation-pytorch

This repository is an official PyTorch implementation of the paper "Learnable Triangulation of Human Pose" (ICCV 2019, oral). The proposed method achieves state-of-the-art results in multi-view 3D human pose estimation!
MIT License

Cannot run eval on pretrained models, instructions unclear #124

Closed by Kaszanas 3 years ago

Kaszanas commented 3 years ago

Hello,

I am trying to reproduce the documented steps in order to test this model. Running the following command from the documentation:

D:\Projects\SportAnalytics\src\learnable-triangulation-pytorch> python train.py --eval --eval_dataset val --config experiments/human36m/eval/human36m_vol_softmax.yaml --logdir ./logs

Returns the following error:

args: Namespace(config='experiments/human36m/eval/human36m_vol_softmax.yaml', eval=True, eval_dataset='val', local_rank=None, logdir='./logs', seed=42)
Number of available GPUs: 1
Loading pretrained weights from: ./data/pretrained/human36m/pose_resnet_4.5_pixels_human36m.pth
Reiniting final layer filters: module.final_layer.weight
Reiniting final layer biases: module.final_layer.bias
Successfully loaded pretrained weights for backbone
Successfully loaded pretrained weights for whole model
Loading data...
Traceback (most recent call last):
  File "train.py", line 483, in <module>
    main(args)
  File "train.py", line 444, in main
    train_dataloader, val_dataloader, train_sampler = setup_dataloaders(config, distributed_train=is_distributed)
  File "train.py", line 117, in setup_dataloaders
    train_dataloader, val_dataloader, train_sampler = setup_human36m_dataloaders(config, is_train, distributed_train)
  File "train.py", line 65, in setup_human36m_dataloaders
    crop=config.dataset.train.crop if hasattr(config.dataset.train, "crop") else True,
  File "D:\src\learnable-triangulation-pytorch\mvn\datasets\human36m.py", line 70, in __init__
    self.labels = np.load(labels_path, allow_pickle=True).item()
  File "D:\Envs\Pose\lib\site-packages\numpy\lib\npyio.py", line 417, in load
    fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: './data/human36m/extra/human36m-multiview-labels-GTbboxes.npy'

This seems odd: I cannot find this file among the files shared on Google Drive, so I cannot run the model evaluation.

If you have a solution to this problem, please let me know.

shrubb commented 3 years ago

Hi, have a look at these instructions. Follow them carefully and you will get that file.

Kaszanas commented 3 years ago

Thank you very much for the response, @shrubb. I am closing this issue; if I fail to get through the detailed instructions, I will reopen it or comment here while it is closed.

Kaszanas commented 3 years ago

@shrubb I have realized that, at this point, I only want to test inference with the pretrained model that was uploaded to Google Drive. However, the training setup seems complicated, and there is no documentation on how to run inference on media input such as photos or videos.

Could you provide instructions on how to use the model?

My attempts to load the pretrained weights with the following code:

import torch
from mvn.models.triangulation import VolumetricTriangulationNet
from mvn.utils import cfg

config = cfg.load_config("./experiments/human36m/eval/human36m_vol_softmax.yaml")

model = VolumetricTriangulationNet(config)
model.load_state_dict(torch.load("./data/pretrained/human36m/human36m_vol_softmax_10-08-2019/checkpoints/0040/weights.pth"))
model.eval()

result in the following error:

python .\inference.py
Loading pretrained weights from: ./data/pretrained/human36m/pose_resnet_4.5_pixels_human36m.pth
Reiniting final layer filters: module.final_layer.weight
Reiniting final layer biases: module.final_layer.bias
Successfully loaded pretrained weights for backbone
Traceback (most recent call last):
  File ".\inference.py", line 9, in <module>
    model.load_state_dict(torch.load("D:/Projects/SportAnalytics/src/learnable-triangulation-pytorch/data/pretrained/human36m/human36m_vol_softmax_10-08-2019/checkpoints/0040/weights.pth"))
  File "D:\Envs\SportAnalytics\lib\site-packages\torch\nn\modules\module.py", line 1045, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for VolumetricTriangulationNet:
        Missing key(s) in state_dict: "backbone.conv1.weight", [...]

I have spent a good couple of days trying to figure out how your code works and whether there is a built-in way to run inference on a media file, but I can't wrap my head around it. @shrubb, I would really appreciate your help.

shrubb commented 3 years ago

Hi, have you managed to run validation on Human3.6M? I mean the command from the original post, python train.py --eval --eval_dataset val ...

If yes, then check how train.py loads the model (I personally don't know this, never worked with that piece of code) and then do the same in your script. You can check it by, for example, running with python -m pdb instead of python.
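As a starting point for that comparison: a frequent cause of a "Missing key(s) in state_dict" error in PyTorch is a checkpoint saved from a model wrapped in DataParallel, whose keys carry a "module." prefix that the unwrapped model does not expect. Whether that applies to this particular weights.pth is an assumption, not something confirmed in this thread. The sketch below uses a plain dictionary as a stand-in for the checkpoint; strip_prefix and diff_keys are hypothetical helper names, and the key "backbone.bn1.bias" is invented for illustration.

```python
from collections import OrderedDict


def strip_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with `prefix` removed from the start of each key."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


def diff_keys(expected_keys, checkpoint_keys):
    """Return (keys the model expects but the checkpoint lacks, extra checkpoint keys)."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    return sorted(expected - found), sorted(found - expected)


# Stand-in for a loaded checkpoint (assumption: keys prefixed with "module.").
# With real weights, the values would be tensors rather than strings.
checkpoint = OrderedDict([
    ("module.backbone.conv1.weight", "w0"),
    ("module.backbone.bn1.bias", "b0"),  # hypothetical key, for illustration only
])
# Stand-in for the key names the unwrapped model expects.
expected = ["backbone.conv1.weight", "backbone.bn1.bias"]

missing_before, unexpected_before = diff_keys(expected, checkpoint.keys())
cleaned = strip_prefix(checkpoint)
missing_after, unexpected_after = diff_keys(expected, cleaned.keys())

print("before:", missing_before, unexpected_before)
print("after: ", missing_after, unexpected_after)
```

With the real files, the checkpoint would come from torch.load(path, map_location="cpu"), the expected keys from model.state_dict().keys(), and the cleaned mapping would be passed to model.load_state_dict(...). If the key diff shows a different mismatch (for example, the weights nested under another dictionary entry), the prefix fix does not apply.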

Kaszanas commented 3 years ago

@shrubb

I created an account to access the dataset some time ago, but I have not received any response, so my attempts at validation are currently blocked.

In the meantime, I would like to run inference on images and video, both monocular and with multiple camera inputs, to see the pretrained model in action and to export the 3D information from it.