Guanghan / lighttrack

LightTrack: A Generic Framework for Online Top-Down Human Pose Tracking

Replace the pose estimator with a CPU-only alternative #14

Open slayo11 opened 5 years ago

slayo11 commented 5 years ago

Hi,

I would like to replace the pose estimator with a lightweight implementation of OpenPose that runs on CPU inference only, since my computer does not support CUDA.

However, when I try to compile the 'lib' folder with 'make', I get an error due to the missing CUDA support. Is it possible to do the replacement described above, or is CUDA a prerequisite for other parts of LightTrack too?

Thank you, Cataldo

Guanghan commented 5 years ago

Hi Cataldo, the lib folder is for the TensorFlow implementation of the pose estimators (e.g., Mobile-Deconv, CPN, MSRA-Deconv). If you replace the pose estimator with OpenPose, I believe you can simply ignore this folder and skip compiling it.

slayo11 commented 5 years ago

Thank you for your reply, Guanghan. I am trying to replace the pose estimator and the detector (it is embedded in the pose estimator), but I can't figure out the structure of the keypoints returned by inference_keypoints(...) in demo_camera_mobile.py. Not being able to run the code, since my system does not support CUDA, is not helping :\ Right now my module outputs an array like this: [x_1, y_1, score_1, ..., x_17, y_17, score_17]. What structure does the demo expect?

Thanks, Cataldo

Guanghan commented 5 years ago

@caloc As you can see in line 64 of keypoint_visualizer.py, the keypoint order is given there. In the provided code, I used the PoseTrack order because I trained with that dataset. If you replace the pose estimator, you can actually use any order with an arbitrary number of keypoints. The tracker takes a flat list [x_1, y_1, score_1, ..., x_n, y_n, score_n]. Currently it is [x_1, y_1, score_1, ..., x_15, y_15, score_15]. Each (x, y) coordinate is the absolute pixel location in the original image.
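
Concretely, packing per-person (x, y, score) keypoints into this flat list could look like the sketch below. This is only an illustration of the format described above, not LightTrack code; the helper name and the 17-keypoint example are made up.

```python
# Minimal sketch: pack per-person (x, y, score) keypoints into the flat
# list format described above. Names are illustrative, not LightTrack's API.

def to_flat_keypoints(person_keypoints):
    """person_keypoints: iterable of (x, y, score) triples, with (x, y) in
    absolute pixel coordinates of the original image."""
    flat = []
    for x, y, score in person_keypoints:
        flat.extend([float(x), float(y), float(score)])
    return flat  # [x_1, y_1, score_1, ..., x_n, y_n, score_n]

# Example with a 17-keypoint (COCO-style) detection using dummy values:
openpose_person = [(320.5, 110.2, 0.91)] * 17
keypoints = to_flat_keypoints(openpose_person)
assert len(keypoints) == 17 * 3
```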

slayo11 commented 5 years ago

Thank you, then I will try to use the keypoints list as it is.

slayo11 commented 5 years ago

@Guanghan, I have another doubt regarding the keypoint notation. In the definition of get_bbox_from_keypoints(keypoints_python_data) you check that vis differs from both 0 and 3: `vis = keypoints_python_data[3 * keypoint_id + 2]; if vis != 0 and vis != 3: x_list.append(x); y_list.append(y)`. In my case the third value of each triple is a score, with 0 meaning an undetected keypoint. What is the meaning of the value 3 here?

Guanghan commented 5 years ago

@caloc Sorry for the untidiness of the code. I believe comparing value 3 is unnecessary. I may have conducted some experiments or debugged something but failed to clean up all unnecessary code afterwards.
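
For a score-based keypoint convention like the one described above, a cleaned-up variant of this bbox construction could look like the following sketch. This is an illustration, not the repository's actual get_bbox_from_keypoints; the score threshold and the margin used to enlarge the box are arbitrary assumptions.

```python
# Sketch: derive a person bbox from a flat keypoint list where the third
# value per joint is a detection score (0 means undetected).
# score_thresh and margin are assumptions, not values from LightTrack.

def bbox_from_keypoints(flat_kpts, score_thresh=0.05, margin=0.15):
    xs, ys = [], []
    for i in range(0, len(flat_kpts), 3):
        x, y, score = flat_kpts[i], flat_kpts[i + 1], flat_kpts[i + 2]
        if score > score_thresh:  # keep only detected joints
            xs.append(x)
            ys.append(y)
    if not xs:
        return None  # no reliable keypoints for this person
    min_x, max_x = min(xs), max(xs)
    min_y, max_y = min(ys), max(ys)
    # enlarge the tight box a bit so limbs are not clipped
    pad_x = margin * (max_x - min_x)
    pad_y = margin * (max_y - min_y)
    return [min_x - pad_x, min_y - pad_y, max_x + pad_x, max_y + pad_y]
```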

slayo11 commented 5 years ago

@Guanghan I suspected it was a leftover debugging line, no problem. I think I am almost there; now I get a size-mismatch error during inference with the SGCN:

File "/home/cataldo/.local/lib/python3.5/site-packages/torchlight-1.0-py3.5.egg/torchlight/io.py", line 82, in load_weights __doc__ = _io._TextIOBase.__doc__ File "/home/cataldo/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 777, in load_state_dict self.__class__.__name__, "\n\t".join(error_msgs))) RuntimeError: Error(s) in loading state_dict for Model: size mismatch for A: copying a param with shape torch.Size([3, 15, 15]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]). size mismatch for data_bn.running_var: copying a param with shape torch.Size([30]) from checkpoint, the shape in current model is torch.Size([36]). size mismatch for data_bn.weight: copying a param with shape torch.Size([30]) from checkpoint, the shape in current model is torch.Size([36]). size mismatch for data_bn.running_mean: copying a param with shape torch.Size([30]) from checkpoint, the shape in current model is torch.Size([36]). size mismatch for data_bn.bias: copying a param with shape torch.Size([30]) from checkpoint, the shape in current model is torch.Size([36]). size mismatch for edge_importance.1: copying a param with shape torch.Size([3, 15, 15]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]). size mismatch for edge_importance.0: copying a param with shape torch.Size([3, 15, 15]) from checkpoint, the shape in current model is torch.Size([3, 18, 18]).

How do I retrain the SGCN with the 18-keypoint COCO notation? Or is there an already trained network?

Thank you for your support!

Guanghan commented 5 years ago

@caloc Unfortunately, I have not trained SGCN with COCO. Training SGCN requires data in pairs (or triplets if necessary), where the paired poses are usually within a few frames of each other in a video sequence. The COCO dataset provides independent images, which makes it hard to use for SGCN training. One suggestion is to generate synthetic data with the COCO notation and use it for training.
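
As a rough illustration of that suggestion (this is not code from this repository, and the jitter model is an assumption), one could fake a "next frame" pose by perturbing a COCO annotation, yielding a positive (same-identity) pair; negative pairs could then be sampled from two different people in the same image.

```python
import numpy as np

# Sketch: fabricate a "consecutive frame" pose pair from a single COCO
# annotation by adding small random motion. The noise scale is an assumption;
# real training data would come from video sequences such as PoseTrack.

def make_synthetic_pair(coco_pose, jitter_std=2.0):
    """coco_pose: (18, 2) array of (x, y) joints in pixel coordinates."""
    pose_t = np.asarray(coco_pose, dtype=np.float32)
    # simulate the same person a few frames later: global shift + per-joint noise
    global_shift = np.random.uniform(-5.0, 5.0, size=(1, 2))
    joint_noise = np.random.normal(0.0, jitter_std, size=pose_t.shape)
    pose_t_plus = pose_t + global_shift + joint_noise
    return pose_t, pose_t_plus  # a positive (same-identity) pair
```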

Guanghan commented 5 years ago

@caloc For now, you can set the threshold of the SGCN to 0, which turns it off. This sacrifices the MOTA score a bit, but it does not matter much in a demo. That way you can at least check whether your CPU version runs successfully.

li3ro commented 5 years ago

@caloc It would be great if you could share the CPU version, if you manage to get it to work.

usamahjundia commented 4 years ago

@caloc Hi, I'm interested in using this method with bottom-up approaches like OpenPose too. Since OpenPose is bottom-up, did you use the keypoints from OpenPose to construct the bbox of each person?