open-mmlab / mmskeleton

An OpenMMLab toolbox for human pose estimation, skeleton-based action recognition, and action synthesis.
Apache License 2.0
2.93k stars 1.04k forks

About visualizing the recognition #356

Open wingskh opened 4 years ago

wingskh commented 4 years ago

I have successfully trained a model on a custom video. However, when I tried to run deprecated/origin_stgcn_repo/main.py to show the visualization result, it raised an error:

Traceback (most recent call last):
  File "main.py", line 32, in <module>
    p = Processor(sys.argv[2:])
  File "/home/user_name/action_recognition/mmskeleton/deprecated/origin_stgcn_repo/processor/io.py", line 32, in __init__
    self.load_weights()
  File "/home/user_name/action_recognition/mmskeleton/deprecated/origin_stgcn_repo/processor/io.py", line 84, in load_weights
    self.arg.ignore_weights)
  File "/opt/conda/envs/open-mmlab/lib/python3.7/site-packages/torchlight-1.0-py3.7.egg/torchlight/io.py", line 66, in load_weights
    SEEK_CUR = 1
  File "/opt/conda/envs/open-mmlab/lib/python3.7/site-packages/torchlight-1.0-py3.7.egg/torchlight/io.py", line 66, in
    SEEK_CUR = 1
AttributeError: 'dict' object has no attribute 'cpu'

How can I get a visualized recognition video like the demo? Thanks!

BozhouZha commented 4 years ago

Are you training your model with the mmskeleton method and getting a .pth weight file? That format doesn't comply with the one the demo program under the deprecated folder expects. I converted mine into the former .pt format and it works.

wingskh commented 4 years ago

Are you training your model with the mmskeleton method and getting a .pth weight file? That format doesn't comply with the one the demo program under the deprecated folder expects. I converted mine into the former .pt format and it works.

Thanks for your response. Yes, I am using the .pth file! How can I convert it to .pt format? If I just rename the .pth to .pt, the error still occurs. If I follow this solution: https://github.com/open-mmlab/mmskeleton/issues/291, there is an error due to a mismatch in the number of keypoints.

BozhouZha commented 4 years ago

Yes, I am using the .pth file! How can I convert it to .pt format?

Well, the only difference between the new .pth file and the old .pt file is that some meta-information was added to it. I wrote a piece of code to do the conversion. It may not be elegant, but it works:

# Model Conversion
#  -- if the model is of a newer version, convert it to the former layout to enable visualization
import yaml
import torch

# Obtain the yaml configuration file
with open(arg.config, 'r') as f:
    default_arg = yaml.load(f, Loader=yaml.FullLoader)

raw_weight_path = str(default_arg['weights'])
raw_weight = torch.load(raw_weight_path)

if set(raw_weight.keys()) == set({'meta', 'optimizer', 'state_dict'}): # convention of new model
    converted_path = raw_weight_path.rsplit('.', maxsplit=1)[0] + '.pt'
    torch.save(raw_weight['state_dict'], converted_path)   # save the pt version of the model
    default_arg['weights'] = str(converted_path)

with open(arg.config, "w") as f:
    yaml.dump(default_arg, f)

Add this piece of code to demo_offline.py, after arg = parser.parse_args() and before Processor = processors[arg.processor].

This is how it works: these lines take the ".pth" file you generated, extract the "state_dict" part, and save it as a new ".pt" file. The new .pt file follows the convention that the old demo expects. The ".pt" file is stored in the same folder as your ".pth", and the ".pth" file won't be deleted by this code. Hope this helps.

wingskh commented 4 years ago

Really appreciate your great help! The original problem is solved! However, another error occurred. Since I use mmskeleton to get 17 keypoints for my custom videos, the number of keypoints does not match the model, which requires 18 keypoints. Do you know how to solve this problem?

RuntimeError: Error(s) in loading state_dict for Model:
    size mismatch for A: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for data_bn.weight: copying a param with shape torch.Size([51]) from checkpoint, the shape in current model is torch.Size([54]).
    size mismatch for data_bn.bias: copying a param with shape torch.Size([51]) from checkpoint, the shape in current model is torch.Size([54]).
    size mismatch for data_bn.running_mean: copying a param with shape torch.Size([51]) from checkpoint, the shape in current model is torch.Size([54]).
    size mismatch for data_bn.running_var: copying a param with shape torch.Size([51]) from checkpoint, the shape in current model is torch.Size([54]).
    size mismatch for edge_importance.0: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for edge_importance.1: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for edge_importance.2: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for edge_importance.3: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for edge_importance.4: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for edge_importance.5: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for edge_importance.6: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for edge_importance.7: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for edge_importance.8: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for edge_importance.9: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for fcn.weight: copying a param with shape torch.Size([2, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([400, 256, 1, 1]).
    size mismatch for fcn.bias: copying a param with shape torch.Size([2]) from checkpoint, the shape in current model is torch.Size([400]).
BozhouZha commented 4 years ago

Really appreciate your great help! The original problem is solved! However, another error occurred. Since I use mmskeleton to get 17 keypoints for my custom videos, the number of keypoints does not match the model, which requires 18 keypoints. Do you know how to solve this problem?

It's because mmskeleton trains a model in the COCO dataset format, and COCO contains only 17 keypoints, while the demo program expects input from OpenPose, which has 18 keypoints.

In "deprecated\origin_stgcn_repo\config\st_gcn\kinetics-skeleton\demo_offline.yaml", there is a blank called "graph_args->layout", change that from openpose to "coco", then you should be all set.

Besides, in demo_offline.py under the deprecated processor folder there is a class called naive_pose_tracker(). Change its initialization parameter num_joint = 18 to 17.
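
If it helps, here is a minimal sketch of making that config change programmatically rather than by hand. The nested key path model_args -> graph_args -> layout is an assumption based on the original ST-GCN demo config; adjust it (or just edit the YAML directly) if your file is laid out differently.

# Assumed key path: model_args -> graph_args -> layout (verify against your own demo_offline.yaml)
import yaml

cfg_path = 'deprecated/origin_stgcn_repo/config/st_gcn/kinetics-skeleton/demo_offline.yaml'
with open(cfg_path, 'r') as f:
    cfg = yaml.load(f, Loader=yaml.FullLoader)

# Point the demo at the 17-keypoint COCO graph instead of the 18-keypoint OpenPose one.
cfg['model_args']['graph_args']['layout'] = 'coco'   # was 'openpose'

with open(cfg_path, 'w') as f:
    yaml.dump(cfg, f)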

wingskh commented 4 years ago

There is one last problem about changing the number of keypoints. Although I changed the initialization parameter num_joint = 18 to 17, there is still an error. Do you have any idea how to solve it?

Traceback (most recent call last):
  File "main.py", line 48, in <module>
    p.start()
  File "/home/wing_mac/action_recognition/mmskeleton/deprecated/origin_stgcn_repo/processor/demo_offline.py", line 32, in start
    video, data_numpy = self.pose_estimation()
  File "/home/wing_mac/action_recognition/mmskeleton/deprecated/origin_stgcn_repo/processor/demo_offline.py", line 155, in pose_estimation
    data_numpy = pose_tracker.get_skeleton_sequence()
  File "/home/wing_mac/action_recognition/mmskeleton/deprecated/origin_stgcn_repo/processor/demo_offline.py", line 267, in get_skeleton_sequence
    data[:, beg:end, :, trace_index] = d.transpose((2, 0, 1))
ValueError: could not broadcast input array from shape (3,5441,18) into shape (3,5441,17)
BozhouZha commented 4 years ago

There is one last problem about changing the number of keypoints. Do you have any idea how to solve it?

It suddenly hit me that you are using the OpenPose API, right? If so, in the pose_estimation function of demo_offline.py there is a variable called multi_pose, which holds the skeleton results from OpenPose. That is to say, multi_pose is in the OpenPose format as well. As noted in the annotation, multi_pose has shape (num_person, num_joint, 3), where 3 means (x, y, confidence). You need to convert and reshape multi_pose into the COCO format as well.

This is not hard: simply delete one joint (I remember it is the midpoint between the two shoulders, i.e. the neck) and re-index the remaining joints; see the sketch below.

[Following is cited from https://blog.csdn.net/u011291667/]

Here is the COCO arrangement: [image]

Here is the OpenPose arrangement: [image]

If the error still occurs, make sure num_joint (the number of keypoints) is 17 (since you trained with mmskeleton, which uses the COCO format), and make sure the graph_cfg in the config file is set to "coco".
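
If a concrete starting point helps, here is a minimal sketch of that re-indexing. The joint order assumes the standard OpenPose COCO-18 output (where joint 1 is the neck) and the standard COCO-17 keypoint order, so double-check both against your own estimator before relying on it.

import numpy as np

# For each COCO-17 joint, the index of the corresponding OpenPose COCO-18 joint
# (assumed standard ordering; joint 1, the neck, has no COCO counterpart and is dropped).
OPENPOSE_TO_COCO = [0, 15, 14, 17, 16, 5, 2, 6, 3, 7, 4, 11, 8, 12, 9, 13, 10]

def openpose_to_coco(multi_pose):
    # multi_pose: (num_person, 18, 3) array of (x, y, confidence) from OpenPose.
    # Returns a (num_person, 17, 3) array in COCO keypoint order.
    multi_pose = np.asarray(multi_pose)
    return multi_pose[:, OPENPOSE_TO_COCO, :]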

wingskh commented 4 years ago

Thanks! I think the program can now execute the function successfully. However, since my remote server does not have a GUI, there is still an error:

Fontconfig warning: "/etc/fonts/fonts.conf", line 100: unknown element "blank"
: cannot connect to X server 

Could I just output the demo video and not display it?

BozhouZha commented 4 years ago

May I know what you modified to make it execute successfully? I ran into the same issue and had no choice but to modify the output of the OpenPose API.

To solve your last problem, go to the start function of the Demo_offline class, comment out the imshow loop, and add video-writing code instead.

wingskh commented 4 years ago

Yes, I followed your steps to modify multi_pose and it now executes successfully. By the way, is this a correct video-writing function?

out = cv2.VideoWriter('output.avi', -1, 30.0, (640, 480))
for image in images:
    image = image.astype(np.uint8)
    out.write(image)
    # cv2.imshow("ST-GCN", image)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
BozhouZha commented 4 years ago

Yes, I followed your steps to modify multi_pose and it now executes successfully. By the way, is this a correct video-writing function?

I used enumerate over the images:

video_writer.write(image, ind)

But I think you should be fine if you omit the index. Can you run it?

wingskh commented 4 years ago

Can you send me the code for writing the video? The code above cannot write a video successfully.

BozhouZha commented 4 years ago

Can you send me the code for writing the video? The code above cannot write a video successfully.

Try initializing the VideoWriter this way:

cv2.VideoWriter(self.dst_path, cv2.VideoWriter_fourcc('M','J','P','G'), fps, (w, h))

where fps is the frame rate and (w, h) are the image width and height. I'm sorry that I can't send you the code for that part, as it isn't mine to share. You are already pretty close, though.
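
For reference, here is a minimal sketch of writing the rendered frames to a file instead of calling cv2.imshow. The output path and fps are placeholders, and the frame size is taken from the frames themselves so that it always matches what VideoWriter expects.

import cv2
import numpy as np

def write_video(images, dst_path='output.avi', fps=30.0):
    # Write a list of rendered frames to an AVI file (no GUI / X server needed).
    h, w = images[0].shape[:2]
    fourcc = cv2.VideoWriter_fourcc('M', 'J', 'P', 'G')
    writer = cv2.VideoWriter(dst_path, fourcc, fps, (w, h))   # note: size is (width, height)
    for image in images:
        writer.write(image.astype(np.uint8))                  # frames must be uint8 BGR of shape (h, w, 3)
    writer.release()                                          # flush and close the file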

wingskh commented 4 years ago

You have already helped me a lot! I tried to initialize the VideoWriter with cv2.VideoWriter(self.dst_path, cv2.VideoWriter_fourcc('M','J','P','G'), fps, (w, h)), but the video still cannot be opened and the file size is really small. Should the fps and (w, h) be the same as those of the input video or the training video?

BozhouZha commented 4 years ago

What does the error say?

wingskh commented 4 years ago

What does the error say?

Oh, I solved it. The error occurred because of the wrong width and height. I had been stuck on these problems for about a week, and you helped me solve them within a day! Words can't express my gratitude. Really appreciate your great help!

Sorry that I still have two more questions. Do you have experience with modifying the training video width and height? Currently I am still using the original build_dataset_example.yaml to build my custom dataset, so the video's height and width are converted to 192 and 256. I think this may affect the performance of the action recognition, so I am considering modifying this file.

Also, in my videos there are two or three people. Could I choose to use only the skeleton of the person who appears largest in the image or is closest to the camera?

BozhouZha commented 4 years ago

Hi there, I'm terribly sorry for my late response; I haven't worked on this for the last two days.

For the video size: if you mean the input to build_dataset, I didn't change the size of my video input even though my sources are not uniform. I kept the 192x256 as it is in the config file and it works. If you mean the size in the video-rendering part of demo_offline.start(), I chose 1920x1080. Yes, I hard-coded this dimension, since I suppose the size won't influence the performance much; it's simply a visualization that gets overlaid with the human box, action label, etc.

The last one is a great question. I'm also considering adding constraints on the people in the generated JSON file before training. As you know, there might be multiple people in a video but each video has only a single label, which means the model will associate the action with many irrelevant people if your input video contains more than one person or more than one action. I hope I understood you correctly and you are worried about the same thing. If so, I think we can discuss it further.

wingskh commented 4 years ago

Thanks for your reply!

For the video size, what I mean is the input to build_dataset. Since I want to maximize performance, I want to keep as much video detail as possible, so I would like to train the model on larger-resolution videos.

For the second question, you got exactly what I meant! Adding the constraints is a really great idea. If we want to use the person with the largest image, I think we could use the skeleton with the largest sum of edge lengths; see the sketch below. I will try this method first and share my experience with you.
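
As a rough illustration of that idea, here is a sketch that scores each detected skeleton by the total length of its edges and keeps the largest one. The edge list is a hypothetical COCO-style limb list for illustration only, not taken from mmskeleton; replace it with whatever graph you actually use.

import numpy as np

# Hypothetical COCO-style limb connections (pairs of keypoint indices).
EDGES = [(5, 7), (7, 9), (6, 8), (8, 10),          # arms
         (11, 13), (13, 15), (12, 14), (14, 16),   # legs
         (5, 6), (11, 12), (5, 11), (6, 12)]       # torso

def largest_person(multi_pose):
    # multi_pose: (num_person, num_joint, 3) array of (x, y, confidence).
    # Returns the skeleton with the largest summed edge length, shape (1, num_joint, 3).
    multi_pose = np.asarray(multi_pose)
    sizes = [sum(np.linalg.norm(p[i, :2] - p[j, :2]) for i, j in EDGES)
             for p in multi_pose]
    return multi_pose[int(np.argmax(sizes))][None]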

BozhouZha commented 4 years ago

About the first question, here is my thought: human detection is relatively mature already. Both the detection (finding the human) and the estimation (finding the skeleton) do quite a good job. As you can see in the demo video, the ST-GCN input in the upper-right window, i.e. the skeleton, is already very satisfying. Adding more video detail or increasing the input video resolution will not significantly change the performance of ST-GCN, since its input is simply the skeleton graph, and that is already good enough.

For the second question, our ultimate goal is the same, but my raw input annotations are a little different, so I need to adopt a strategy other than finding the "largest" person.

Since this discussion is already going beyond the scope of mmskeleton's core, and to avoid further interrupting the authors or distracting other followers, would you mind leaving your email so we can continue there? We can post a summary of our conclusions here later if others are interested in this thread as well.

wingskh commented 4 years ago

Sure, my email address is wingsunkh@gmail.com. Thanks for your help!

BozhouZha commented 4 years ago

You're welcome! Message sent to your email.

2795449476 commented 3 years ago

Add this piece of code to demo_offline.py, after arg = parser.parse_args() and before Processor = processors[arg.processor].

Excuse me, I'm sorry, but I couldn't find that place in demo_offline.py. Are you sure it's right?

pranavgundewar commented 3 years ago

Excuse me, I'm sorry, but I couldn't find that place in demo_offline.py. Are you sure it's right?

It is in the main.py script.

SKBL5694 commented 3 years ago

@wingskh @BozhouZha First, thank you both for the discussion above; it has helped me a lot. I only started researching this project recently, so I am not very familiar with the improvement plan. If it is convenient, could I join your discussion, or at least get some guidance from it, along the lines of what you discussed above?

zren2 commented 3 years ago

Are you training your model based on mmskeleton method and getting a .pth weight file? This model doesn't comply with the format which the demo program under deprecated folder uses. I converted that into the former .pt format and it works.

Yes, I am using the .pth file! How to convert it into .pt format?

Well, the only difference between the new .pth file and the .pt file is they added some meta-info of the video into it. I wrote a piece of code to do the conversion. It may be not elegant but works:

# Model Conversion
#  -- if the model is of a newer version, convert it to former layout to enable visualization
# Obtain the yaml configuration file
with open(arg.config, 'r') as f:
    default_arg = yaml.load(f, Loader=yaml.FullLoader)

raw_weight_path = str(default_arg['weights'])
raw_weight = torch.load(raw_weight_path)

if set(raw_weight.keys()) == set({'meta', 'optimizer', 'state_dict'}): # convention of new model
    converted_path = raw_weight_path.rsplit('.', maxsplit=1)[0] + '.pt'
    torch.save(raw_weight['state_dict'], converted_path)   # save the pt version of the model
    default_arg['weights'] = str(converted_path)

with open(arg.config, "w") as f:
    yaml.dump(default_arg, f)

Add this piece of code into the demo_offline.py file, after arg = parser.parse_args() before Processor = processors[arg.processor] This is how it works: these lines takes the ".pth" file that you generated, extract "state_dict" part and save as new ".pt". The new .pt file follows the same convention that old demo anticipates. ".pt" file is stored at the same folder where your ".pth" lies, and the ".pth" files won't be deleted by this piece of code. Hope this helps.

Really appreciate your great help! The original problem is solved! However, there is another error occurred. Since I use mmskeleton to get the 17 keypoints for my custom videos, the number of keypoints do not match to the model which required 18 keypoints. Do you know how to solve this problem?

RuntimeError: Error(s) in loading state_dict for Model:
  size mismatch for A: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
  size mismatch for data_bn.weight: copying a param with shape torch.Size([51]) from checkpoint, the shape in current model is torch.Size([54]).
  size mismatch for data_bn.bias: copying a param with shape torch.Size([51]) from checkpoint, the shape in current model is torch.Size([54]).
  size mismatch for data_bn.running_mean: copying a param with shape torch.Size([51]) from checkpoint, the shape in current model is torch.Size([54]).
  size mismatch for data_bn.running_var: copying a param with shape torch.Size([51]) from checkpoint, the shape in current model is torch.Size([54]).
  size mismatch for edge_importance.0: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
  size mismatch for edge_importance.1: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
  size mismatch for edge_importance.2: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
  size mismatch for edge_importance.3: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
  size mismatch for edge_importance.4: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
  size mismatch for edge_importance.5: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
  size mismatch for edge_importance.6: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
  size mismatch for edge_importance.7: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
  size mismatch for edge_importance.8: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
  size mismatch for edge_importance.9: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
  size mismatch for fcn.weight: copying a param with shape torch.Size([2, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([400, 256, 1, 1]).
  size mismatch for fcn.bias: copying a param with shape torch.Size([2]) from checkpoint, the shape in current model is torch.Size([400]).

It's because mmskeleton trains a model in the COCO dataset format, and the COCO skeleton only contains 17 keypoints, whereas the demo program expects input from OpenPose, which has 18 keypoints. In "deprecated\origin_stgcn_repo\config\st_gcn\kinetics-skeleton\demo_offline.yaml" there is a field called "graph_args->layout"; change it from "openpose" to "coco" and you should be all set. Besides that, in the demo_offline.py under the deprecated processor folder there is a class called naive_pose_tracker(); change its initialization parameter "num_joint = 18" to 17.
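
For illustration, the relevant part of demo_offline.yaml would then look roughly like the excerpt below. This is only a sketch: key names besides graph_args may differ between versions of the config, and num_class has to match the number of classes of your custom model (the fcn shapes in the error above suggest 2).

# demo_offline.yaml -- illustrative excerpt, not the full file
model: net.st_gcn.Model
model_args:
  in_channels: 3
  num_class: 2                  # match your custom model; the checkpoint's fcn layer has 2 outputs
  edge_importance_weighting: True
  graph_args:
    layout: 'coco'              # was 'openpose'; the COCO skeleton has 17 joints instead of 18
    strategy: 'spatial'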

There is one last problem about changing the number of keypoints. Although I changed the initialization parameter "num_joint = 18" to 17, there is still an error. Do you have any idea how to solve it?

Traceback (most recent call last):
  File "main.py", line 48, in <module>
    p.start()
  File "/home/wing_mac/action_recognition/mmskeleton/deprecated/origin_stgcn_repo/processor/demo_offline.py", line 32, in start
    video, data_numpy = self.pose_estimation()
  File "/home/wing_mac/action_recognition/mmskeleton/deprecated/origin_stgcn_repo/processor/demo_offline.py", line 155, in pose_estimation
    data_numpy = pose_tracker.get_skeleton_sequence()
  File "/home/wing_mac/action_recognition/mmskeleton/deprecated/origin_stgcn_repo/processor/demo_offline.py", line 267, in get_skeleton_sequence
    data[:, beg:end, :, trace_index] = d.transpose((2, 0, 1))
ValueError: could not broadcast input array from shape (3,5441,18) into shape (3,5441,17)

I also got this issue; could you tell me how to resolve it?

zren2 commented 3 years ago

Yes, I am using the .pth file! How can I convert it into .pt format?

Well, the only difference between the new .pth file and the .pt file is that some meta-information about the video was added into it. I wrote a piece of code to do the conversion. It may not be elegant, but it works:

# Model Conversion
#  -- if the model is of a newer version, convert it to former layout to enable visualization
# Obtain the yaml configuration file
with open(arg.config, 'r') as f:
    default_arg = yaml.load(f, Loader=yaml.FullLoader)

raw_weight_path = str(default_arg['weights'])
raw_weight = torch.load(raw_weight_path)

if set(raw_weight.keys()) == set({'meta', 'optimizer', 'state_dict'}): # convention of new model
    converted_path = raw_weight_path.rsplit('.', maxsplit=1)[0] + '.pt'
    torch.save(raw_weight['state_dict'], converted_path)   # save the pt version of the model
    default_arg['weights'] = str(converted_path)

with open(arg.config, "w") as f:
    yaml.dump(default_arg, f)

Add this piece of code into the demo_offline.py file, after arg = parser.parse_args() and before Processor = processors[arg.processor]. This is how it works: these lines take the ".pth" file that you generated, extract the "state_dict" part, and save it as a new ".pt" file. The new .pt file follows the same convention that the old demo expects. The ".pt" file is stored in the same folder where your ".pth" lies, and the ".pth" file won't be deleted by this code. Hope this helps.

Excuse me, I'm sorry, but I couldn't find that place in demo_offline.py. Are you sure it's right?

It's in main.py

Fanthers commented 3 years ago

Are you training your model based on mmskeleton method and getting a .pth weight file? This model doesn't comply with the format which the demo program under deprecated folder uses. I converted that into the former .pt format and it works.

Yes, I am using the .pth file! How to convert it into .pt format?

Well, the only difference between the new .pth file and the .pt file is they added some meta-info of the video into it. I wrote a piece of code to do the conversion. It may be not elegant but works:

# Model Conversion
#  -- if the model is of a newer version, convert it to former layout to enable visualization
# Obtain the yaml configuration file
with open(arg.config, 'r') as f:
    default_arg = yaml.load(f, Loader=yaml.FullLoader)

raw_weight_path = str(default_arg['weights'])
raw_weight = torch.load(raw_weight_path)

if set(raw_weight.keys()) == set({'meta', 'optimizer', 'state_dict'}): # convention of new model
    converted_path = raw_weight_path.rsplit('.', maxsplit=1)[0] + '.pt'
    torch.save(raw_weight['state_dict'], converted_path)   # save the pt version of the model
    default_arg['weights'] = str(converted_path)

with open(arg.config, "w") as f:
    yaml.dump(default_arg, f)

Add this piece of code into the demo_offline.py file, after arg = parser.parse_args() before Processor = processors[arg.processor] This is how it works: these lines takes the ".pth" file that you generated, extract "state_dict" part and save as new ".pt". The new .pt file follows the same convention that old demo anticipates. ".pt" file is stored at the same folder where your ".pth" lies, and the ".pth" files won't be deleted by this piece of code. Hope this helps.

Really appreciate your great help! The original problem is solved! However, another error occurred. Since I used mmskeleton to extract 17 keypoints from my custom videos, the number of keypoints does not match the model, which requires 18 keypoints. Do you know how to solve this problem?

RuntimeError: Error(s) in loading state_dict for Model:
    size mismatch for A: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for data_bn.weight: copying a param with shape torch.Size([51]) from checkpoint, the shape in current model is torch.Size([54]).
    size mismatch for data_bn.bias: copying a param with shape torch.Size([51]) from checkpoint, the shape in current model is torch.Size([54]).
    size mismatch for data_bn.running_mean: copying a param with shape torch.Size([51]) from checkpoint, the shape in current model is torch.Size([54]).
    size mismatch for data_bn.running_var: copying a param with shape torch.Size([51]) from checkpoint, the shape in current model is torch.Size([54]).
    size mismatch for edge_importance.0: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for edge_importance.1: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for edge_importance.2: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for edge_importance.3: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for edge_importance.4: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for edge_importance.5: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for edge_importance.6: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for edge_importance.7: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for edge_importance.8: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for edge_importance.9: copying a param with shape torch.Size([3, 17, 17]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).
    size mismatch for fcn.weight: copying a param with shape torch.Size([2, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([400, 256, 1, 1]).
    size mismatch for fcn.bias: copying a param with shape torch.Size([2]) from checkpoint, the shape in current model is torch.Size([400]).

It's because mmskeleton trains a model in the COCO dataset format, and the COCO skeleton only contains 17 keypoints, whereas the demo program expects input from OpenPose, which has 18 keypoints. In "deprecated\origin_stgcn_repo\config\st_gcn\kinetics-skeleton\demo_offline.yaml" there is a field called "graph_args->layout"; change it from "openpose" to "coco" and you should be all set. Besides that, in the demo_offline.py under the deprecated processor folder there is a class called naive_pose_tracker(); change its initialization parameter "num_joint = 18" to 17.

There is one last problem about changing the number of keypoints. Although I changed the initialization parameter "num_joint = 18" to 17, there is still an error. Do you have any idea how to solve it?

Traceback (most recent call last):
  File "main.py", line 48, in <module>
    p.start()
  File "/home/wing_mac/action_recognition/mmskeleton/deprecated/origin_stgcn_repo/processor/demo_offline.py", line 32, in start
    video, data_numpy = self.pose_estimation()
  File "/home/wing_mac/action_recognition/mmskeleton/deprecated/origin_stgcn_repo/processor/demo_offline.py", line 155, in pose_estimation
    data_numpy = pose_tracker.get_skeleton_sequence()
  File "/home/wing_mac/action_recognition/mmskeleton/deprecated/origin_stgcn_repo/processor/demo_offline.py", line 267, in get_skeleton_sequence
    data[:, beg:end, :, trace_index] = d.transpose((2, 0, 1))
ValueError: could not broadcast input array from shape (3,5441,18) into shape (3,5441,17)

I also got this issue; could you tell me how to resolve it?

I solved it. In demo_offline.py, in the multi_pose part, manually change the number of keypoints from 18 to 17. Just pay attention to the correspondence between the keypoints.
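
To make the "correspondence between the keypoints" concrete, here is one possible sketch. It assumes the pose estimator emits joints in the standard OpenPose BODY-18 order and the retrained model expects the COCO-17 order; the helper and its index table are illustrative, so verify them against the joint order your code actually produces.

import numpy as np

# Illustrative helper (not part of mmskeleton): reorder an OpenPose-style
# 18-joint pose array into the 17-joint COCO layout; the extra OpenPose
# joint (the neck, index 1) is simply dropped.
# Each entry reads: COCO position -> OpenPose index.
OPENPOSE18_TO_COCO17 = [0, 15, 14, 17, 16, 5, 2, 6, 3, 7, 4, 11, 8, 12, 9, 13, 10]

def to_coco17(pose18):
    """pose18: array of shape (..., 18, C) holding (x, y, score); returns (..., 17, C)."""
    pose18 = np.asarray(pose18)
    return pose18[..., OPENPOSE18_TO_COCO17, :]

# Example: an 18-joint pose array of shape (num_person, 18, 3) becomes (num_person, 17, 3).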

Fanthers commented 3 years ago

@wingskh @BozhouZha Thanks to both of you for the discussion above. It has helped me a lot.

xiaoming970115 commented 1 year ago

Would you tell me how to visualize using HRNet instead of the OpenPose API? How do you modify the command to run the demo? I need your help. @wingskh @pranavgundewar @Fanthers @SKBL5694