Closed Adhithiyan03 closed 3 years ago
@Adhithiyan03 , please provide your code snippet in a reproducible format, e.g. a Colab notebook/gist, to help us expedite the debug process. Thanks.
https://colab.research.google.com/drive/1O4sxmD0UjIKUe-KW9jdcfJHi8ENczS5P?usp=sharing I have attached the code snippet. Basically, I have not changed anything that was already there. I wanted to check how the model works on the lower-limb part.
Is there any method or model to improve the estimation?
We forwarded this question to the model's publisher to see if any additional feedback or suggestions could be provided.
Thanks for your interest in using MoveNet and reporting this issue! Without knowing much about your setting, it might be more difficult to locate the root cause of the accuracy issue, but here are a few thoughts:
1) Our model was trained on images that usually contain the whole body or upper body, but not the lower body alone. Therefore, we do expect some performance degradation on your video. However, given that the person in your video is not doing unusual poses but just walking, I agree that the result is worse than I expected. One easy check is to zoom out a bit to include the whole body and see if the accuracy becomes normal. That way we can tell whether it is a data distribution issue.
2) Our model expects a square image input and works better with a proper cropping algorithm (see the example in the tutorial https://www.tensorflow.org/hub/tutorials/movenet). I can't tell whether you have that implemented in your setting, but one thing to note is that even without cropping, when feeding images to the model, one shouldn't squash the image; instead, use padding to keep the original aspect ratio. Could you double-check how you preprocess the images?
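As a reference, here is a minimal preprocessing sketch of the pad-don't-squash advice above, using TensorFlow's `tf.image.resize_with_pad`. The input size of 256 is the Thunder model's expected resolution (Lightning uses 192), and the `int32` cast matches the TF Hub saved model's input signature; adjust to your own setup as needed.

```python
import tensorflow as tf

def preprocess(image, input_size=256):
    """Prepare one frame for MoveNet without distorting it.

    Resizes the image to a square while keeping the original aspect
    ratio, padding the shorter side with zeros instead of stretching.
    input_size=256 matches Thunder; Lightning uses 192.
    """
    image = tf.expand_dims(image, axis=0)  # add a batch dimension
    image = tf.image.resize_with_pad(image, input_size, input_size)
    return tf.cast(image, dtype=tf.int32)  # Thunder's input signature is int32
```

For example, a 480x640 frame becomes a (1, 256, 256, 3) tensor with black bars on the top and bottom, rather than a horizontally squashed person.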
3) Could you share which model (Thunder/Lightning) and which format (tfjs, tflite) you ended up using in your Colab? We don't expect them to perform differently, but maybe there's a bug somewhere. One easy check is to try the tfjs demo (https://storage.googleapis.com/tfjs-models/demos/pose-detection/index.html?model=movenet) and see whether you get different performance on exactly the same video.
Looking forward to hearing more from your side! Thanks.
@Adhithiyan03 , please reply to the previous comment from @yuhuichen1015.
Sorry Sir, I am working on the methods @yuhuichen1015 suggested and will reply by tomorrow. Sorry for the delay.
I have checked the methods you suggested.
So I guess it is better if we include the whole body for good estimation. But is there any other pretrained MoveNet model that can give good results for this video alone?
Hey, waiting for a response.
Thanks for sharing the updates. A few thoughts:
1) The tfjs demo link should have the proper preprocessing steps. If you can't get reasonable predictions from that link, that probably means the data is out-of-domain and the model just doesn't do well on it. As suggested, your best chance here is to zoom out a bit to include the whole body (or at least keep the four torso joints, the two shoulders and two hips, visible in the image frame).
2) However, out of curiosity, I tried recording myself doing similar moves using the tfjs demo link. It seems like I'm getting reasonable predictions (or at least better than what you shared)? That makes me wonder whether it is actually a preprocessing issue. A general recommendation is to provide a square image as the input without changing the image's original aspect ratio; it's better to use padding instead of stretching the image when adjusting the aspect ratio. Could you check whether that is the case in your code? Thanks.
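One way to act on the joint-visibility point programmatically is to check the per-keypoint confidence scores MoveNet returns before using a frame for analysis. A hedged sketch follows; the 0.3 threshold is an assumption to tune rather than a value from this thread, while the keypoint indices follow MoveNet's documented 17-joint layout.

```python
import numpy as np

# Assumption: a typical confidence cutoff; tune this for your footage.
MIN_SCORE = 0.3

# Indices of the four torso joints in MoveNet's 17-keypoint layout:
# 5 = left shoulder, 6 = right shoulder, 11 = left hip, 12 = right hip.
TORSO = [5, 6, 11, 12]

def torso_visible(keypoints, min_score=MIN_SCORE):
    """keypoints: a [17, 3] array of (y, x, score) rows from MoveNet.

    Returns True only if all four torso joints clear the confidence
    threshold, i.e. the frame is likely usable for downstream analysis.
    """
    return bool(np.all(keypoints[TORSO, 2] >= min_score))
```

Frames that fail this check could be skipped (or interpolated over) instead of contributing noisy joint positions to a gait graph.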
FYI: We have updated the model (V4) with improved accuracy and support for more formats. Please feel free to give it a try and see if it works better for your case (https://tfhub.dev/s?q=movenet).
@Adhithiyan03 , please update on the previous comment from @yuhuichen1015. Thanks.
@Adhithiyan03 , closing this as there has been no update for some time. Please feel free to reopen with insights/information based on the above comment trace. Thanks.
We are working on gait analysis using MoveNet to generate knee angles. We used the provided Thunder pre-trained model, but the results are not accurate.
The generated video is missing a lot of markers, so the graph we generate has high variation and the error is too high. The estimation is somewhat better when tracking the whole body, but it does not work on the limb part alone. Is there any way to improve the pose estimation so it gives better results when using the limb part alone?
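The knee-angle step described above can be sketched as the angle at the knee between the hip and ankle keypoints. This is an illustrative sketch, not the code from the linked Colab; it assumes each joint is given as a (y, x) pair, as MoveNet outputs.

```python
import numpy as np

def joint_angle(hip, knee, ankle):
    """Return the knee angle in degrees, given three (y, x) keypoints.

    Computes the angle between the knee->hip and knee->ankle vectors;
    a fully extended leg gives ~180 degrees, a deep bend a small angle.
    """
    a = np.asarray(hip, dtype=float) - np.asarray(knee, dtype=float)
    b = np.asarray(ankle, dtype=float) - np.asarray(knee, dtype=float)
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip guards against tiny floating-point excursions outside [-1, 1].
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Collinear hip/knee/ankle (a straight leg) gives 180 degrees:
print(round(joint_angle((0.2, 0.5), (0.5, 0.5), (0.8, 0.5)), 1))  # → 180.0
```

Plotting this angle per frame, and only for frames where all three keypoints have high confidence scores, gives the knee-angle curve over the gait cycle.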
I have attached the generated graph along with the 2D pose estimation for the lower-body video, as well as the graph we are expecting to generate. Hoping for a helpful response, as it might help us achieve our goal.
Google Drive Link Containing the video and the graph: https://drive.google.com/drive/folders/1pMh--1mhpS0fc6DURPhtphf6XFRBMMnx?usp=sharing