https://github.com/qiqiApink/MotionGPT/blob/main/generate_motion.py#L114 — at this line, you run the following code:

```python
tokens = torch.tensor([int(token) for token in output.split(',')]).cuda()
```

Does this mean you use the same vocabulary size as LLaMA, and output the motion tokens as a comma-separated number string? If so, why not increase the vocabulary size instead?
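For context, here is a minimal sketch of the parsing step being asked about, in plain Python (torch omitted so it runs anywhere; `output` here is a stand-in for the model's decoded text, not the actual variable from the repo):

```python
def parse_motion_tokens(output: str) -> list[int]:
    # Mirrors the list comprehension in generate_motion.py, which then
    # wraps the result in torch.tensor(...).cuda(); each motion token is
    # spelled out as a decimal number in the decoded string.
    return [int(token) for token in output.split(',')]

# A decoded output representing three motion tokens:
print(parse_motion_tokens("12, 305, 7"))  # -> [12, 305, 7]
```

This relies on the language model emitting well-formed digit strings; a dedicated motion vocabulary would instead map each motion token to a single token id, avoiding the string round-trip.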