Thank you for all your work on this project; it's really great to have a fully OSS Llama backbone.
I was excited to see the V2 version of the models with the original Llama tokenizer, and I found that the 7B V2 model does indeed improve over V1 in terms of perplexity.
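For context, this is roughly the kind of perplexity measurement I mean. It's a minimal sketch, not my exact setup: the `openlm-research/open_llama_7b_v2` Hugging Face checkpoint, the `eval.txt` filename, and the 2048-token window are all placeholder assumptions you'd swap for your own.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; substitute the model you want to evaluate.
MODEL_ID = "openlm-research/open_llama_7b_v2"

# use_fast=False to sidestep known fast-tokenizer whitespace issues
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

# Any held-out text; eval.txt is a placeholder path.
text = open("eval.txt").read()
ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)

# Chunked perplexity: average next-token NLL over fixed-length windows.
max_len = 2048
nlls, n_tokens = [], 0
with torch.no_grad():
    for i in range(0, ids.size(1) - 1, max_len):
        chunk = ids[:, i : i + max_len]
        # labels == inputs: the model shifts internally for next-token loss
        out = model(chunk, labels=chunk)
        n = chunk.size(1) - 1  # tokens actually predicted in this chunk
        nlls.append(out.loss * n)
        n_tokens += n

ppl = torch.exp(torch.stack(nlls).sum() / n_tokens)
print(f"perplexity: {ppl.item():.2f}")
```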
Are there plans to train a V2 version of the 13B model? If so, any idea of an ETA for that?