Question about the generate

Hello, I trained a semantic tokens -> acoustic tokens (3 codes) model, and I want to use argmax during generation so that every inference run gives the same output.

if argmax:
    # greedy decoding: always take the highest-scoring token
    print("use argmax")
    sampled = torch.argmax(last_coarse_logits, dim = -1)
else:
    # original path: top-k filtering followed by Gumbel sampling
    print("not use argmax")
    filtered_logits = top_k(last_coarse_logits, thres = filter_thres)
    sampled = gumbel_sample(filtered_logits, temperature = temperature, dim = -1)

However, in argmax mode, the pipeline semantic tokens -> acoustic tokens (3 codes) -> wav produces a wav with no speech, only long silence. Do you know why?
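
To make the two branches concrete, here is a minimal plain-Python sketch that runs without torch or a trained model. The helpers `top_k_filter`, `gumbel_sample` and `pick` are hypothetical stand-ins written for this illustration; they are assumed to behave roughly like the `top_k` and `gumbel_sample` used above, not copied from the library. It shows that argmax returns the same token every time for the same logits, and that seeding the random generator is another way to get repeatable runs while still sampling. Whether the silence comes from greedy decoding repeating one dominant token is only a guess that this sketch does not verify.

```python
import math
import random


def top_k_filter(logits, thres=0.9):
    # Assumed behaviour: keep roughly the top (1 - thres) fraction of logits
    # (at least one) and mask the rest with -inf. Ties at the cutoff are kept.
    k = max(1, math.ceil((1 - thres) * len(logits)))
    cutoff = sorted(logits, reverse=True)[k - 1]
    return [x if x >= cutoff else float("-inf") for x in logits]


def gumbel_sample(logits, temperature=1.0, rng=random):
    # Gumbel-max trick: argmax over logits / T plus Gumbel(0, 1) noise.
    # temperature must be > 0.
    best_idx, best_val = 0, float("-inf")
    for i, x in enumerate(logits):
        u = max(rng.random(), 1e-20)  # avoid log(0)
        noise = -math.log(-math.log(u))
        val = x / temperature + noise
        if val > best_val:
            best_idx, best_val = i, val
    return best_idx


def pick(logits, use_argmax, filter_thres=0.9, temperature=1.0, rng=random):
    # Mirrors the if/else in the snippet above for a single step.
    if use_argmax:
        return max(range(len(logits)), key=logits.__getitem__)
    return gumbel_sample(top_k_filter(logits, filter_thres), temperature, rng)


# Toy logits for one decoding step (made-up numbers, not from a real model).
logits = [0.1, 2.0, 1.9, -1.0, 0.5]

# Greedy decoding: the same index on every call for the same logits.
greedy = [pick(logits, use_argmax=True) for _ in range(5)]

# Seeded sampling: two runs with the same seed give identical sequences.
rng = random.Random(0)
run_a = [pick(logits, False, filter_thres=0.5, rng=rng) for _ in range(5)]
rng = random.Random(0)
run_b = [pick(logits, False, filter_thres=0.5, rng=rng) for _ in range(5)]

print("greedy:", greedy)
print("seeded runs match:", run_a == run_b)
```

In the real torch code, the analogous step would be calling `torch.manual_seed(...)` with a fixed value before generation and keeping the sampling branch, assuming nothing else in the pipeline is nondeterministic.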

lucidrains / audiolm-pytorch

Question about the generate #192