lucidrains / audiolm-pytorch

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch
MIT License

semantic transformer training #154

Closed syjunghwang closed 1 year ago

syjunghwang commented 1 year ago

When training the Semantic Transformer, I see that the code initializes the semantic embeddings randomly and trains them. I get the semantic token IDs from HuBERT, but taking the embedding vectors corresponding to those tokens from the pre-trained HuBERT would not be correct, right?

lucidrains commented 1 year ago

@syjunghwang yea, that is not correct
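A minimal PyTorch sketch of the point being confirmed here: the semantic transformer owns a freshly initialized embedding table that is trained end-to-end, and the HuBERT token IDs only index into it; HuBERT's own hidden states are never reused as embeddings. Sizes below are illustrative, and the fake IDs stand in for what `HubertWithKmeans` would produce in the actual repo.

```python
import torch
import torch.nn as nn

# Illustrative sizes (AudioLM quantizes HuBERT features with k-means,
# e.g. a codebook of a few hundred clusters; DIM is kept small here).
NUM_SEMANTIC_TOKENS = 500  # k-means codebook size (assumption for the sketch)
DIM = 64                   # model dimension (assumption for the sketch)

# Randomly initialized, trainable embedding table owned by the transformer.
token_emb = nn.Embedding(NUM_SEMANTIC_TOKENS, DIM)

# Fake batch of semantic token IDs; in practice these come from
# HuBERT + k-means quantization, but only the integer IDs are used.
ids = torch.randint(0, NUM_SEMANTIC_TOKENS, (2, 16))

# Look up the learned vectors -- gradients flow into token_emb.weight
# during training, so the embeddings are learned, not copied from HuBERT.
embeds = token_emb(ids)
print(embeds.shape)  # (batch, seq_len, dim)
```

The key design point: HuBERT serves only as a tokenizer producing discrete IDs, so the language model is free to learn embeddings suited to next-token prediction rather than inheriting vectors optimized for HuBERT's masked-prediction objective.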

stg1205 commented 9 months ago

Why not take the embeddings from the pretrained HuBERT?