npuichigo / openai_trtllm

OpenAI compatible API for TensorRT LLM triton backend
MIT License
155 stars 25 forks source link

support for llama 3 #43

Open avianion opened 4 months ago

avianion commented 4 months ago

will this project plan to support llama 3 70b or 8b?

npuichigo commented 4 months ago

llama3 should already be supported with template https://github.com/npuichigo/openai_trtllm/blob/main/templates/history_template_llama3.liquid. To get the model, please refer to https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/llama#llama-v3-updates

avianion commented 4 months ago

ok great. @npuichigo but what is the model name? it keeps saying to me model not found and i have tried many model names. with llama 3 70b

npuichigo commented 4 months ago

it's ensemble if the structure looks like https://github.com/triton-inference-server/tensorrtllm_backend/tree/v0.9.0/all_models/inflight_batcher_llm

avianion commented 4 months ago

Should skip_special_tokens be True or False? and same with add_special_tokens in the preprocessing config.pbtxt?