Hello! I was wondering if NeMo Guardrails is compatible with TitanML, and if so, whether there are any tutorials for it. Thanks in advance!
Since there is already a LangChain connector, it should work: https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/llms/titan_takeoff.py. Something along these lines:

```yaml
models:
  - type: main
    engine: titan_takeoff
    ...
```
I don't know the exact values for the other parameters, but it should work. Let us know if you need help.
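For reference, you can also sanity-check the connector directly in LangChain before wiring it into Guardrails. A minimal sketch, assuming a Takeoff server is already running; the `base_url` and `port` values below are the connector's defaults, so adjust them to your deployment:

```python
# Query a running Titan Takeoff server through the LangChain connector.
# base_url and port are assumed defaults; change them to match your server.
from langchain_community.llms import TitanTakeoff

llm = TitanTakeoff(base_url="http://localhost", port=3000)
print(llm.invoke("What is the capital of France?"))
```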
Hi! I deployed TitanML with NeMo Guardrails and it works well!
Since it's integrated with LangChain, you don't need to define and register a custom provider, so it's actually pretty simple.
Below is a quick guide for future reference.

First, start the Takeoff server in Docker:

```bash
docker run --gpus all -e TAKEOFF_MODEL_NAME=TheBloke/Mistral-7B-Instruct-v0.2-GPTQ -e TAKEOFF_DEVICE=cuda \
  -e TAKEOFF_QUANT_TYPE=bfloat16 -p 3000:3000 -p 80:80 -v ~/.iris_cache:/code/models tytn/fabulinus:latest
```
By exposing these ports on the Docker container, you can access the model from your local host. A quick way to confirm the server is reachable is sketched below.
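This check only assumes the ports published by the `docker run` command above; it doesn't rely on any specific Takeoff endpoint:

```python
# Reachability check for the Takeoff server on the published inference port.
import requests

try:
    # Any HTTP response (even an error status) means the server is listening.
    resp = requests.get("http://localhost:3000", timeout=5)
    print(f"Takeoff server is up (HTTP {resp.status_code})")
except requests.ConnectionError:
    print("Takeoff server is not reachable on port 3000")
```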
Next, point the NeMo Guardrails configuration at the server in `config.yml`:

```yaml
models:
  - type: main
    engine: titan_takeoff
    model: TheBloke/Mistral-7B-Instruct-v0.2-GPTQ
    parameters:
      base_url: "http://localhost"
```
If you are deploying an open-source model, you will probably need to change the prompts and system instructions as well; a sketch of such an override is shown below.
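A minimal sketch of what that could look like, using the standard `instructions` and `prompts` sections of a Guardrails config; the exact template your model needs may differ:

```yaml
# config.yml: override the default system instructions.
instructions:
  - type: general
    content: |
      You are a helpful assistant based on Mistral-7B-Instruct.
      Answer concisely and refuse unsafe requests.

# prompts.yml: adapt the 'general' prompt to Mistral's [INST] format.
prompts:
  - task: general
    content: |
      [INST] {{ general_instructions }}

      {{ history | user_assistant_sequence }} [/INST]
```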
Finally, load the configuration and chat with the guarded model:

```python
import logging
from pathlib import Path

import nest_asyncio

from nemoguardrails import LLMRails, RailsConfig

# Patch the event loop so Guardrails can run inside notebooks/IPython.
nest_asyncio.apply()
# DEBUG logging shows the rendered prompts and LLM calls, handy when tuning templates.
logging.basicConfig(level=logging.DEBUG)

# Load the guardrails configuration from the specified path.
path_to_config = Path.cwd() / "titanml_mistral" / "config"
config = RailsConfig.from_path(str(path_to_config))
rails = LLMRails(config, verbose=True)

# Generate a response through the rails.
completion = rails.generate(
    messages=[{"role": "user", "content": "Hi! How are you?"}]
)
print(completion)
```
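When called with `messages=...`, `rails.generate` returns the bot reply as a dict of the form `{"role": "assistant", "content": "..."}`.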
Hi @saradiazdelser
Thanks a lot for the info! I tried it and it works fine.