EricGuo5513 / text-to-motion

Official implementation for "Generating Diverse and Natural 3D Human Motions from Texts (CVPR2022)."
MIT License

Time to arrival #11

Closed yonishafir closed 1 year ago

yonishafir commented 1 year ago

It seems to me that you use ordinary positional encoding, although the paper says you use time to arrival.

Could you please point me to where you implement and use the time-to-arrival positional encoding?

EricGuo5513 commented 1 year ago

https://github.com/EricGuo5513/text-to-motion/blob/main/networks/trainers.py#:~:text=%2D1%5D)-,tta%20%3D%20m_lens%20//%20self.opt.unit_length%20%2D%20i,-if%20self.

Hi, please check here.

yonishafir commented 1 year ago

I could not find the line where you're using time to arrival in the above reference.

Instead, in both https://github.com/EricGuo5513/text-to-motion/blob/main/networks/modules.py#L43

and https://github.com/EricGuo5513/text-to-motion/blob/main/networks/modules.py#L62

I see only plain positional encoding (forward).

I'd appreciate a more precise reference.

Thanks

EricGuo5513 commented 1 year ago

Hi, time-to-arrival is technically implemented with positional encoding. The difference is that positional encoding is usually fed the current time step t, whereas our time-to-arrival feeds it T-t, where T is the target length and t is the current time step. This lets the model know how much of the generation process remains.
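A minimal sketch of the idea, not the repository's actual code: it assumes a standard interleaved sin/cos encoding, and the helper names `sinusoidal_encoding` and `time_to_arrival` are made up for illustration. Only the expression `m_lens // unit_length - i` is taken from the trainer line linked earlier in this thread; here it is applied to a single integer length rather than a batch.

```python
import math


def sinusoidal_encoding(pos, dim):
    """Sinusoidal encoding of one integer position (hypothetical helper)."""
    enc = []
    for k in range(0, dim, 2):
        freq = 1.0 / (10000 ** (k / dim))
        enc.append(math.sin(pos * freq))
        enc.append(math.cos(pos * freq))
    return enc[:dim]


def time_to_arrival(m_len, unit_length, i):
    """Remaining steps T - t, with T = m_len // unit_length and t = i."""
    return m_len // unit_length - i


# Assumed example values: a 40-frame motion with 4 frames per step.
m_len, unit_length, dim = 40, 4, 8
T = m_len // unit_length

for i in range(T):
    tta = time_to_arrival(m_len, unit_length, i)
    # Usual positional encoding would call sinusoidal_encoding(i, dim).
    # Time-to-arrival encodes the remaining length instead.
    pe = sinusoidal_encoding(tta, dim)
    print(i, tta, [round(v, 3) for v in pe[:2]])
```

The encoding function is unchanged; only the index passed to it differs, which is why the modules file shows nothing but ordinary positional encoding.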

yonishafir commented 1 year ago

Thanks for the detailed answer. I couldn't find where you feed the model positional encoding with T-t instead of t. Could you please direct me to the correct line in the trainer?

EricGuo5513 commented 1 year ago

It is at line 334 in https://github.com/EricGuo5513/text-to-motion/blob/main/networks/trainers.py.

yonishafir commented 1 year ago

Thanks a lot