https://github.com/qiqiApink/MotionGPT/blob/main/generate_motion.py#L114 — at this line, you run the following code:

```python
tokens = torch.tensor([int(token) for token in output.split(',')]).cuda()
```

Does this mean you use the same vocabulary size as LLaMA, and output the motion tokens as a comma-separated number string? If so, why not increase the vocabulary size instead?
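For context, here is a minimal sketch of the parsing step being asked about, in plain Python (torch omitted so it runs anywhere; `output` here is a stand-in for the model's decoded text, not the actual variable from the repo):

```python
def parse_motion_tokens(output: str) -> list[int]:
    # Mirrors the list comprehension in generate_motion.py, which then
    # wraps the result in torch.tensor(...).cuda(); each motion token is
    # spelled out as a decimal number in the decoded string.
    return [int(token) for token in output.split(',')]

# A decoded output representing three motion tokens:
print(parse_motion_tokens("12, 305, 7"))  # -> [12, 305, 7]
```

This relies on the language model emitting well-formed digit strings; a dedicated motion vocabulary would instead map each motion token to a single token id, avoiding the string round-trip.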