sign-language-processing / transcription

Text to pose model for sign language pose generation from a text sequence

Related papers #5

Closed herochen7372 closed 1 year ago

herochen7372 commented 1 year ago

Hello, I am very interested in your work. Do you have a paper or a written description for pose-to-text?

AmitMY commented 1 year ago

You are correct that this repository lacks documentation. There are no related papers at the moment, and I hope my thesis will cover all of these in the near future.


herochen7372 commented 1 year ago

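Your code is beautifully written, but with my current ability there are some parts I could not follow. I would like to ask about the forward method of the model in pose-to-text, which I do not fully understand.

  1. What exactly is the input parameter first_pose? Is it randomly generated?
  2. What is the purpose of the initially defined pose_sequence?
  3. What is the while loop doing? Doesn't refine_pose_sequence already produce pose_sequence directly?

Sorry to bother you. If I later publish a paper or similar work, I will acknowledge your contribution.
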
AmitMY commented 1 year ago

I assume you mean text-to-pose. Please read the README: https://github.com/sign-language-processing/transcription/tree/main/text_to_pose

In training, initial_frame is the first frame of a sequence, so we have information about what the person looks like, but not about the movement of the sign. The while loop works as an anti-diffusion step that refines the sequence a little more on each iteration. At every stage it yields the updated sequence.
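
As a rough illustration of that loop (a toy sketch, not the repository's actual code), the idea can be written as a Python generator. Everything here is an assumption made for the example: the names refine_iteratively and toy_refine_step are hypothetical, the starting sequence is assumed to be the first frame repeated, and the "model" is a simple stand-in that nudges each frame toward a fixed target instead of a learned, text-conditioned network.

```python
from typing import Callable, Iterator, List

Pose = List[float]           # one frame: flattened keypoint coordinates
PoseSequence = List[Pose]    # a list of frames


def toy_refine_step(sequence: PoseSequence, target: PoseSequence,
                    step_size: float) -> PoseSequence:
    """Hypothetical stand-in for the learned refinement model: move every
    coordinate a fraction `step_size` of the way toward `target`."""
    return [
        [x + step_size * (t - x) for x, t in zip(frame, target_frame)]
        for frame, target_frame in zip(sequence, target)
    ]


def refine_iteratively(first_pose: Pose, length: int,
                       refine: Callable[[PoseSequence], PoseSequence],
                       num_steps: int = 10) -> Iterator[PoseSequence]:
    """Yield a progressively refined pose sequence, one version per step."""
    # Assumption: start from the first frame repeated `length` times, so the
    # sequence carries the person's appearance but no movement yet.
    sequence = [list(first_pose) for _ in range(length)]
    step = 0
    while step < num_steps:
        sequence = refine(sequence)  # one refinement pass over all frames
        step += 1
        yield sequence               # the caller can stop at any stage


if __name__ == "__main__":
    # Toy target: three frames with two coordinates each.
    target = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
    refine = lambda seq: toy_refine_step(seq, target, step_size=0.5)
    for i, seq in enumerate(refine_iteratively([0.0, 0.0], 3, refine, num_steps=4)):
        print(i, seq)
```

Because the function is a generator, a caller can take the sequence after any number of steps, which mirrors the "at every stage it yields the updated sequence" behavior described above.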

Check out this image to understand the refinement process for images, for example:

herochen7372 commented 1 year ago

I see. Thank you very much for your reply.