Open vrushankportkey opened 1 month ago
Models hosted on Nvidia's NeMo servers expose Nvidia's Triton inference APIs. We already have a PR for that (currently only for text completions, not chat completions): https://github.com/Portkey-AI/gateway/pull/445
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
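For context, here is a rough sketch of what a text-completion call to a Triton HTTP endpoint might look like, assuming the server speaks the KServe v2 inference protocol (`POST /v2/models/{model}/infer`). The helper names (`buildInferRequest`, `inferUrl`), the tensor names (`prompts`, `max_output_len`), and their datatypes are assumptions for illustration only; the real names depend on how the model was exported, and this is not taken from the linked PR.

```typescript
// One input tensor in a KServe v2 / Triton HTTP inference request body.
interface TritonInput {
  name: string;
  shape: number[];
  datatype: string;
  data: Array<string | number>;
}

interface TritonInferRequest {
  inputs: TritonInput[];
}

// Assumed tensor names; check the deployed model's config for the real ones.
const PROMPT_TENSOR = "prompts";
const MAX_TOKENS_TENSOR = "max_output_len";

// Hypothetical helper: map a text-completion prompt to a Triton infer body.
function buildInferRequest(prompt: string, maxTokens: number): TritonInferRequest {
  return {
    inputs: [
      { name: PROMPT_TENSOR, shape: [1, 1], datatype: "BYTES", data: [prompt] },
      { name: MAX_TOKENS_TENSOR, shape: [1, 1], datatype: "INT64", data: [maxTokens] },
    ],
  };
}

// Hypothetical helper: build the infer URL, dropping trailing slashes from the base.
function inferUrl(baseUrl: string, modelName: string): string {
  const base = baseUrl.replace(/\/+$/, "");
  return `${base}/v2/models/${encodeURIComponent(modelName)}/infer`;
}
```

The body would then be sent with something like `fetch(inferUrl(base, model), { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify(buildInferRequest(prompt, 64)) })`. Chat completions would need an extra step to flatten the messages into a single prompt, which is presumably why only text completions are covered so far.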