ai4r / Gesture-Generation-from-Trimodal-Context

Speech Gesture Generation from the Trimodal Context of Text, Audio, and Speaker Identity (SIGGRAPH Asia 2020)

About the AudioEncoder #44

Closed HyeonSeong-P closed 1 year ago

HyeonSeong-P commented 2 years ago

When the from_text option is used during synthesis, is the speech audio generated by TTS encoded and then passed to the generator as input?

youngwoo-yoon commented 2 years ago

Yes, that's correct.