Closed FrancescoVV closed 9 months ago
ah I see, you've reached the fine transformer stage
are you sampling more than one audio at a time?
yeah, I can get a naive solution for this issue this am
@FrancescoVV ok, try 1.5.7
I can try on Friday or the weekend, but unfortunately not before that. I will close the issue myself if it's fixed.
In any case, the issue exists with a batch size of 1 or more during generation.
@FrancescoVV sounds good, i think it is fixed, but you can do the honors
The issue is fixed now!
noice
Encodec doesn't support the "-1" value that is used to mask the tokens after EOS.
In particular, the coarse_token_ids here contain some trailing -1s, and thus the result of the line

```python
coarse_and_fine_ids = torch.cat((coarse_token_ids, sampled_fine_token_ids), dim = -1)
```

will still contain some "-1"s that will not be recognised by Encodec.
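For illustration, one naive workaround is to trim the trailing masked time steps before handing the codes to Encodec, since it only accepts non-negative codebook indices. This is a hypothetical sketch, not the fix that landed in the library; it assumes token ids shaped `(num_quantizers, seq_len)` with `-1` marking every position after EOS:

```python
import torch

def trim_trailing_pad(token_ids: torch.Tensor, pad_id: int = -1) -> torch.Tensor:
    # token_ids: (num_quantizers, seq_len); pad_id marks tokens after EOS.
    # Keep only time steps where no quantizer level holds the pad value,
    # so the remaining codes are all valid non-negative Encodec indices.
    valid = (token_ids != pad_id).all(dim=0)
    if valid.all():
        return token_ids
    if not valid.any():
        return token_ids[:, :0]
    last_valid = int(valid.nonzero().max().item()) + 1
    return token_ids[:, :last_valid]
```

The trimmed tensor can then be concatenated and decoded as usual; the alternative of clamping the -1s to 0 would instead decode garbage audio for the padded region, so trimming is the safer naive option.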