Open · kkr37 opened 10 months ago
Is YaRN not supported yet?
As more and more new models enter the market, we have prepared comprehensive instructions for TRT-LLM developers on adapting the framework to new models of interest. We encourage our community developers to expand the range of supported models, fostering an open ecosystem with rapid iteration.
Please try following these instructions and let us know if you encounter any issues during the adaptation process. We greatly appreciate your dedication.
**Feature request**
Nous Research and EleutherAI have released the YaRN models, available in two versions with context sizes of 64k and 128k. These models use RoFormer-style (RoPE) embeddings, distinguishing them from GPT-NeoX and GPT-J. They are built on the Llama 2 architecture, so they are largely compatible already, requiring only minor adjustments for full support.
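For context on what those adjustments involve: relative to Llama 2, YaRN changes only the rotary-embedding frequency computation and adds a query/key magnitude scale. Below is a minimal NumPy sketch of YaRN's "NTK-by-parts" interpolation, following the formulas in the paper and its reference code; the function names and the `beta_fast`/`beta_slow` defaults are taken from that reference code and are illustrative only, not an existing TRT-LLM API.

```python
import math

import numpy as np


def yarn_rope_frequencies(dim, base=10000.0, scale=16.0,
                          original_max_pos=4096,
                          beta_fast=32, beta_slow=1):
    """NTK-by-parts frequency interpolation from the YaRN paper.

    Dimensions whose RoPE wavelength is much shorter than the
    original context window keep their base frequency; much longer
    ones are fully interpolated (divided by `scale`); dimensions in
    between are blended with a linear ramp.
    """
    # Base RoPE inverse frequencies: theta_d = base^(-2d/dim).
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2, dtype=np.float64) / dim))

    # Dimension index whose wavelength completes `n_rotations` full
    # turns inside the original context window.
    def correction_dim(n_rotations):
        return (dim * math.log(original_max_pos / (n_rotations * 2 * math.pi))
                / (2 * math.log(base)))

    low = max(math.floor(correction_dim(beta_fast)), 0)
    high = min(math.ceil(correction_dim(beta_slow)), dim // 2 - 1)

    # Linear ramp: 0 keeps the original (extrapolated) frequency,
    # 1 uses the fully interpolated frequency.
    ramp = np.clip((np.arange(dim // 2) - low) / max(high - low, 1), 0.0, 1.0)
    return (inv_freq / scale) * ramp + inv_freq * (1.0 - ramp)


def yarn_mscale(scale):
    # YaRN also scales queries/keys by 0.1 * ln(s) + 1 to temper
    # attention entropy at long context lengths.
    return 1.0 if scale <= 1.0 else 0.1 * math.log(scale) + 1.0


# Example: the 64k variant extends the 4096 base window by 16x.
freqs = yarn_rope_frequencies(dim=128, scale=16.0)
```

In the published checkpoints these parameters appear under a `rope_scaling` block in `config.json` (a scale factor of 16 yields 64k from the 4096 base window, 32 yields 128k), so an adapter should be able to pick them up from there.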
**Motivation**
YaRN's longer context length (up to 128k tokens) is highly valuable for tasks involving extensive context, compared with the 4,096-token limit of the Llama 2 base model.
**Other**
YaRN paper: *YaRN: Efficient Context Window Extension of Large Language Models* (https://arxiv.org/abs/2309.00071)
YaRN code: https://github.com/jquesnelle/yarn