langchain-ai / langchain-nvidia

MIT License
48 stars 15 forks

add register_model for users to add their own custom models with endpoints [api change; expansion] #57

Closed mattf closed 2 months ago

mattf commented 2 months ago

this expands the public api with a Model class and register_model function.

background: using the base_url feature, we currently support both hosted NIMs (base_url of integrate.api.nvidia.com/v1) and local NIMs (base_url controlled by the user, e.g. localhost:1234/v1). the base_url must provide a path for model listing, /models, and a path for inference, one of /chat/completions, /embeddings or /ranking.
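the paths a base_url must serve can be sketched with stdlib urljoin; the host and port here are hypothetical, and the inference path shown is the chat one (an embedding or ranking endpoint would serve /embeddings or /ranking instead):

```python
from urllib.parse import urljoin

# hypothetical local NIM base_url; the trailing slash matters for urljoin
base_url = "http://localhost:1234/v1/"

# the two kinds of paths a base_url must provide, per the description above
listing = urljoin(base_url, "models")              # model listing
inference = urljoin(base_url, "chat/completions")  # or embeddings / ranking
```

the custom-endpoint use case below is exactly the situation where the listing path is missing and only the inference path exists.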

use case: users want to target endpoints that support only inference (no model listing), and to minimize code changes when model names stay the same but endpoints change.

design -

from langchain_nvidia_ai_endpoints import register_model, Model, ChatNVIDIA
register_model(Model(id="my-custom-model-name",
                     model_type="chat",
                     client="ChatNVIDIA",
                     endpoint="http://host:port/path-to-my-model"))
llm = ChatNVIDIA(model="my-custom-model-name")

rejected design -

from langchain_nvidia_ai_endpoints import ChatNVIDIA
llm = ChatNVIDIA(model="my-custom-model-name", endpoint="http://host:port/path-to-my-model")

reason for rejection: endpoints will change over time, resulting in multiple refactoring points in user code. the accepted design is simpler: users register a model once, centrally, and then refer to it by name throughout their code.

sidkoch commented 2 months ago

Thanks Matt for taking this up. Really useful for our workflows requiring custom handlers. Looking forward to this functionality.