matyasbohacek / spoter

Repository accompanying the "Sign Pose-based Transformer for Word-level Sign Language Recognition" paper
https://spoter.signlanguagerecognition.com
Apache License 2.0

How to test on video files #14

Open NaNtaisuike opened 6 months ago

NaNtaisuike commented 6 months ago

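Hello, and thank you for your work! In the code, both training and testing rely on the dataset having been converted to CSV files. How can I run testing on a video file, and how can I get an intuitive demo of sign language translation from video?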

matyasbohacek commented 1 month ago

Hi there! Pose extraction is not included as part of this repo. I do plan to open-source those scripts eventually, but I’m not sure when I’ll get around to it.

In the meantime, you can use any pose estimation toolkit and build a simple wrapper to convert the format. I’d recommend using MediaPipe, but any recent pose estimation toolkit should work. You’ll want to generate a CSV where columns correspond to individual landmarks (joints) and hold arrays of coordinate values. This script should give you a good idea of how the data is structured. Note that the _X and _Y components of the coordinates are in separate columns.
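As a rough illustration of the CSV layout described above, here is a minimal sketch using only the Python standard library. It assumes per-frame landmark coordinates have already been produced by some pose estimator. The landmark names, the `label` column, the zero-fill for missing landmarks, and the helper names `frames_to_row` and `write_csv` are all hypothetical and not taken from the repository, so check them against the actual dataset loader before relying on them.

```python
import csv
import io

# Hypothetical landmark names; the real column names expected by the
# repository's dataset loader may differ.
LANDMARKS = ["nose", "leftWrist", "rightWrist"]


def frames_to_row(frames, label):
    """Turn per-frame landmark coordinates into one CSV row.

    `frames` is a list of dicts mapping landmark name -> (x, y).
    Each landmark yields two columns, <name>_X and <name>_Y, and each
    column holds the list of that coordinate across all frames.
    """
    row = {"label": label}
    for name in LANDMARKS:
        # Assumption: a landmark missing from a frame is filled with 0.0.
        points = [frame.get(name, (0.0, 0.0)) for frame in frames]
        row[f"{name}_X"] = str([p[0] for p in points])
        row[f"{name}_Y"] = str([p[1] for p in points])
    return row


def write_csv(rows, fileobj):
    """Write rows with one _X and one _Y column per landmark."""
    fieldnames = ["label"] + [
        f"{name}_{axis}" for name in LANDMARKS for axis in ("X", "Y")
    ]
    writer = csv.DictWriter(fileobj, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Toy two-frame clip; the right wrist is not detected in the second frame.
frames = [
    {"nose": (0.5, 0.2), "leftWrist": (0.3, 0.7), "rightWrist": (0.7, 0.7)},
    {"nose": (0.5, 0.21), "leftWrist": (0.32, 0.65)},
]
row = frames_to_row(frames, label=3)
buffer = io.StringIO()
write_csv([row], buffer)
```

Each video becomes one row, so a wrapper around a pose estimator would only need to collect one `(x, y)` pair per landmark per frame and pass the resulting list to `frames_to_row`.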

For the demo, you could use the core of this Gradio-based demo.