yangdongchao opened this issue 1 year ago
Hey, thanks for your interest in this repo! The RVQ parameters are updated via EMA. See this relevant thread
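For reference, here is a minimal sketch of how an EMA codebook update typically works in VQ-VAE-style quantizers (hypothetical helper and buffer names, not the exact code in this repo): each codebook vector is pulled toward the running mean of the encoder outputs assigned to it, so no gradient ever flows into the codebook.

```python
# Hedged sketch of an EMA codebook update (VQ-VAE style); buffer names are illustrative.
import torch

def ema_update_codebook(codebook, cluster_size, embed_avg, x, indices,
                        decay=0.99, eps=1e-5):
    """codebook: (K, D) code vectors (plain buffers, not learned parameters)
    cluster_size: (K,) EMA count of vectors assigned to each code
    embed_avg: (K, D) EMA sum of vectors assigned to each code
    x: (N, D) encoder outputs, indices: (N,) nearest-code assignments."""
    K, _ = codebook.shape
    onehot = torch.nn.functional.one_hot(indices, K).type_as(x)      # (N, K)

    # EMA of how many vectors map to each code
    cluster_size.mul_(decay).add_(onehot.sum(0), alpha=1 - decay)
    # EMA of the sum of vectors assigned to each code
    embed_avg.mul_(decay).add_(onehot.t() @ x, alpha=1 - decay)

    # Laplace smoothing avoids dividing by zero for unused codes
    n = cluster_size.sum()
    smoothed = (cluster_size + eps) / (n + K * eps) * n

    # Move each code to the (smoothed) mean of its assigned vectors
    codebook.copy_(embed_avg / smoothed.unsqueeze(1))
```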
Thanks for your reply. I want to ask: how do you judge whether the training has succeeded? When can we stop training?
I stopped when the eval loss stopped going down. Seems to work well enough.
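In case it helps, a tiny sketch of that stopping rule (the `patience` knob is a made-up parameter, not something from this repo): keep training until the eval loss has failed to improve for a few evaluations in a row.

```python
# Hedged sketch of "stop when eval loss plateaus"; thresholds are illustrative.
class PlateauStopper:
    """Signal a stop once eval loss has not improved for `patience` evals."""
    def __init__(self, patience: int = 5):
        self.patience = patience
        self.best = float("inf")
        self.stale = 0

    def step(self, eval_loss: float) -> bool:
        if eval_loss < self.best:
            self.best, self.stale = eval_loss, 0   # new best, reset the counter
        else:
            self.stale += 1                        # no improvement this eval
        return self.stale >= self.patience
```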
Thanks for your reply. Lastly, I want to ask: during training, do you only use audio to extract embeddings and quantize them with RVQ, with text used only at the inference stage?
It also seems that the ClapRVQTrainer code doesn't do any gradient backward. How should I understand this?
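To connect this back to the EMA point above, here is a hypothetical training-step sketch (not the actual ClapRVQTrainer, and the dimensions/arguments are only assumptions): when the residual VQ uses EMA codebook updates, the quantizer's forward pass itself moves the codebooks, so the loop needs no `loss.backward()` and no optimizer step.

```python
# Hypothetical sketch, not the repo's trainer: EMA updates happen inside the forward call.
import torch
from vector_quantize_pytorch import ResidualVQ  # assumed dependency, as in similar repos

rvq = ResidualVQ(dim=512, num_quantizers=12, codebook_size=1024,
                 kmeans_init=True, decay=0.95)   # decay = EMA rate for the codebooks
rvq.train()                                      # EMA updates only run in train mode

def train_step(audio_embeds: torch.Tensor):
    # audio_embeds: (batch, seq, 512) embeddings from the (frozen) audio encoder
    with torch.no_grad():                        # nothing here requires gradients
        quantized, indices, _ = rvq(audio_embeds)  # forward pass performs the EMA update
    return indices
```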