h2oai / h2ogpt

Private chat with a local GPT over documents, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
http://h2o.ai
Apache License 2.0
11.15k stars 1.22k forks

NVIDIA Triton inference support #87

Open arnocandel opened 1 year ago

arnocandel commented 1 year ago

https://github.com/triton-inference-server/

arnocandel commented 1 year ago

https://github.com/triton-inference-server/fastertransformer_backend/blob/main/docs/gptneox_guide.md
https://github.com/triton-inference-server/fastertransformer_backend/tree/main/all_models/gptneox