Open wangyf8848 opened 1 month ago
Hello, the paper mentions using Llama for autoregressive training, but why is the language model in the code using GPT ?
It's only a name. The GPT architecture in this project has the same arch as Llama
Hello, the paper mentions using Llama for autoregressive training, but why is the language model in the code using GPT ?