triton-inference-server / tensorrtllm_backend

The Triton TensorRT-LLM Backend
Apache License 2.0

[request] Add example of custom LLM model not based on huggingface #465

Closed: michaelnny closed this issue 1 month ago

michaelnny commented 1 month ago

Hi,

I'm wondering if it's possible to add an example (or a general guideline) of how to serve a custom LLM model that's not based on Hugging Face.

As an example, consider the original Llama3 chat model with its native Tiktoken tokenizer, neither of which is based on Hugging Face Transformers: https://github.com/meta-llama/llama3
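For context, the backend's preprocessing/postprocessing Python models currently load a tokenizer through Hugging Face's AutoTokenizer. What I have in mind is something like the rough sketch below: a small adapter that wraps a Tiktoken-based tokenizer behind the encode/decode interface those scripts use. This is just an illustration, not a working recipe; the pattern string, special tokens, and attribute names here are placeholders and would need to be copied from llama/tokenizer.py in the llama3 repo.

```python
# Hypothetical adapter around a Tiktoken tokenizer (e.g. Llama 3's tokenizer.model),
# exposing an encode/decode surface similar to what the Triton preprocessing and
# postprocessing models expect from a Hugging Face tokenizer.
# pat_str and the special-token list below are placeholders; the real values live
# in llama/tokenizer.py of https://github.com/meta-llama/llama3.
from pathlib import Path
from typing import List

import tiktoken
from tiktoken.load import load_tiktoken_bpe


class TiktokenAdapter:
    def __init__(self, model_path: str):
        mergeable_ranks = load_tiktoken_bpe(model_path)
        num_base_tokens = len(mergeable_ranks)

        # Placeholder special tokens; Llama 3 defines many more.
        special_tokens = {
            "<|begin_of_text|>": num_base_tokens,
            "<|end_of_text|>": num_base_tokens + 1,
        }
        self._special_ids = set(special_tokens.values())

        self.model = tiktoken.Encoding(
            name=Path(model_path).name,
            # Placeholder split pattern -- replace with the pat_str from the llama3 repo.
            pat_str=r"\S+|\s+",
            mergeable_ranks=mergeable_ranks,
            special_tokens=special_tokens,
        )

        self.bos_token_id = special_tokens["<|begin_of_text|>"]
        self.eos_token_id = special_tokens["<|end_of_text|>"]
        # The preprocessing model pads batches, so expose a pad id as well.
        self.pad_token_id = self.eos_token_id

    def encode(self, text: str, add_special_tokens: bool = False) -> List[int]:
        ids = self.model.encode(text, allowed_special="all")
        if add_special_tokens:
            ids = [self.bos_token_id] + ids
        return ids

    def decode(self, ids: List[int], skip_special_tokens: bool = True) -> str:
        if skip_special_tokens:
            ids = [i for i in ids if i not in self._special_ids]
        return self.model.decode(ids)
```

An official example would presumably swap an adapter like this (or the llama3 repo's own Tokenizer class) into the preprocessing and postprocessing model.py files in place of AutoTokenizer.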

This would be great for people working with custom LLM models that are decoupled from the Hugging Face ecosystem. Thanks!