Open SuperiorDtj opened 11 months ago
The test audio is 32-channel with length 2**15, and the batch size is 2. Besides, the number of trainable parameters for text-conditional generation is only 672M when following the paper's settings (the text embedding dim is 768 for t5-base).
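For reference, here is a minimal sketch of how a trainable-parameter count like the 672M figure is usually obtained, assuming the model is a PyTorch `nn.Module`. The helper name `count_trainable_params` and the toy `nn.Linear` layer are hypothetical stand-ins for illustration, not code from the repository or the paper.

```python
import torch.nn as nn


def count_trainable_params(model: nn.Module) -> int:
    """Sum the element counts of all parameters that receive gradients."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


# Toy stand-in (NOT the model under discussion): a projection from a
# 768-dim text embedding, matching the t5-base embedding size, to 4 outputs.
toy = nn.Linear(768, 4)
total = count_trainable_params(toy)  # weight (768 * 4) + bias (4)

# Frozen parameters are excluded from the count.
toy.bias.requires_grad_(False)
without_bias = count_trainable_params(toy)

print(total, without_bias)
```

If a frozen text encoder (such as t5-base) is part of the module, it is excluded from this count, which may be one reason a reported figure differs from the paper's.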