Hello! I was wondering if NeMo Guardrails is compatible with TitanML, and if so, whether there are any tutorials for it. Thanks in advance!
Since there is already a LangChain connector, it should work: https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/llms/titan_takeoff.py. Something along these lines:

```yaml
models:
  - type: main
    engine: titan_takeoff
    ...
```
I don't know the exact values for the other parameters, but it should work. Let us know if you need help.
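For reference, you can also sanity-check the connector directly in LangChain before wiring it into Guardrails. A minimal sketch, assuming a Takeoff server is already running; the `base_url` and `port` values below are the connector's defaults, so adjust them to your deployment:

```python
# Query a running Titan Takeoff server through the LangChain connector.
# base_url and port are assumed defaults; change them to match your server.
from langchain_community.llms import TitanTakeoff

llm = TitanTakeoff(base_url="http://localhost", port=3000)
print(llm.invoke("What is the capital of France?"))
```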
Hi! I deployed TitanML with NeMo Guardrails and it works well!
Since it's integrated with LangChain, you don't need to define and register a custom provider, so it's actually pretty simple.
Below is a quick guide for future reference.

First, start the Takeoff server in Docker:

```bash
docker run --gpus all -e TAKEOFF_MODEL_NAME=TheBloke/Mistral-7B-Instruct-v0.2-GPTQ -e TAKEOFF_DEVICE=cuda \
  -e TAKEOFF_QUANT_TYPE=bfloat16 -p 3000:3000 -p 80:80 -v ~/.iris_cache:/code/models tytn/fabulinus:latest
```
By exposing these ports on the Docker container, you can access the model from your local host. A quick way to confirm the server is reachable is sketched below.
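This check only assumes the ports published by the `docker run` command above; it doesn't rely on any specific Takeoff endpoint:

```python
# Reachability check for the Takeoff server on the published inference port.
import requests

try:
    # Any HTTP response (even an error status) means the server is listening.
    resp = requests.get("http://localhost:3000", timeout=5)
    print(f"Takeoff server is up (HTTP {resp.status_code})")
except requests.ConnectionError:
    print("Takeoff server is not reachable on port 3000")
```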
Next, point the NeMo Guardrails configuration at the server in `config.yml`:

```yaml
models:
  - type: main
    engine: titan_takeoff
    model: TheBloke/Mistral-7B-Instruct-v0.2-GPTQ
    parameters:
      base_url: "http://localhost"
```
If you are deploying an open-source model, you will probably need to change the prompts and system instructions as well; a sketch of such an override is shown below.
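A minimal sketch of what that could look like, using the standard `instructions` and `prompts` sections of a Guardrails config; the exact template your model needs may differ:

```yaml
# config.yml: override the default system instructions.
instructions:
  - type: general
    content: |
      You are a helpful assistant based on Mistral-7B-Instruct.
      Answer concisely and refuse unsafe requests.

# prompts.yml: adapt the 'general' prompt to Mistral's [INST] format.
prompts:
  - task: general
    content: |
      [INST] {{ general_instructions }}

      {{ history | user_assistant_sequence }} [/INST]
```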
Finally, load the configuration and chat with the guarded model:

```python
import logging
from pathlib import Path

import nest_asyncio

from nemoguardrails import LLMRails, RailsConfig

# Patch the event loop so Guardrails can run inside notebooks/IPython.
nest_asyncio.apply()
# DEBUG logging shows the rendered prompts and LLM calls, handy when tuning templates.
logging.basicConfig(level=logging.DEBUG)

# Load the guardrails configuration from the specified path.
path_to_config = Path.cwd() / "titanml_mistral" / "config"
config = RailsConfig.from_path(str(path_to_config))
rails = LLMRails(config, verbose=True)

# Generate a response through the rails.
completion = rails.generate(
    messages=[{"role": "user", "content": "Hi! How are you?"}]
)
print(completion)
```
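When called with `messages=...`, `rails.generate` returns the bot reply as a dict of the form `{"role": "assistant", "content": "..."}`.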
Hi @saradiazdelser
Thanks a lot for the info! I tried it and it works fine.