NVIDIA / NeMo-Guardrails

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

[Question] Run the code locally, HF isn't available #742

Open snassimr opened 1 week ago

snassimr commented 1 week ago

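While running the code below:

```python
import nest_asyncio
from nemoguardrails import RailsConfig
from nemoguardrails.integrations.langchain.runnable_rails import RunnableRails

config = RailsConfig.from_content(colang_content=colang_content, yaml_content=yaml_content)

nest_asyncio.apply()

guardrails = RunnableRails(config)
```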

I see that 5 files are fetched:

Fetching 5 files: 100%  5/5 [00:01<00:00,  1.17it/s]
tokenizer.json: 100%  712k/712k [00:00<00:00, 3.78MB/s]
special_tokens_map.json: 100%  695/695 [00:00<00:00, 7.23kB/s]
config.json: 100%  650/650 [00:00<00:00, 5.26kB/s]
tokenizer_config.json: 100%  1.43k/1.43k [00:00<00:00, 12.4kB/s]
model.onnx: 100%  90.4M/90.4M [00:01<00:00, 99.0MB/s]

How can I run the same code without contacting HF (Hugging Face) each time?

Pouyanpi commented 1 week ago

Hi @snassimr,

This happens because of the default embedding search provider. NeMo Guardrails uses "all-MiniLM-L6-v2" as the embedding model and "FastEmbed" as the embedding provider, and the files you see being fetched belong to that model.

You can configure a different embedding provider or model, whichever suits your environment.
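As a rough sketch of what such a configuration could look like, the `yaml_content` string passed to `RailsConfig.from_content` can name the embedding engine and model explicitly under the `models` key. The local directory path below is a hypothetical placeholder, and whether the chosen engine accepts a local path in place of a model name is an assumption to verify in your setup.

```python
# Sketch only: an explicit embeddings entry for the guardrails YAML config.
# "/path/to/local/all-MiniLM-L6-v2" is a hypothetical placeholder, not a real path.
# Assumption: the SentenceTransformers engine receives this value as the model
# name and can resolve it as a local directory.
yaml_content = """
models:
  - type: embeddings
    engine: SentenceTransformers
    model: /path/to/local/all-MiniLM-L6-v2
"""

# The string is then used exactly as in the original snippet:
# config = RailsConfig.from_content(colang_content=colang_content, yaml_content=yaml_content)
```

An LLM entry (`type: main`) would normally sit in the same `models` list; it is left out here to keep the focus on the embeddings entry.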

Please refer to the documentation here. Let me know if it addresses your question.

snassimr commented 6 days ago

Hi @Pouyanpi, thank you for your response. FastEmbed still doesn't have a way to load embeddings from a local path, so currently I don't see any way to drop the dependency on HF.
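One thing that might be worth trying, offered as an untested sketch rather than a confirmed fix: the `huggingface_hub` library honors the `HF_HUB_OFFLINE` environment variable and serves files from its local cache when it is set. This assumes that FastEmbed downloads the model through `huggingface_hub` and that the five files are already in the cache from an earlier run.

```python
import os

# Assumption: the model files were downloaded once already and are cached locally.
# Set the variable before importing nemoguardrails / fastembed, because
# huggingface_hub reads it when it is first imported.
os.environ["HF_HUB_OFFLINE"] = "1"

# ...then build the config and RunnableRails as in the original snippet.
```

If the cache is empty or incomplete, the load would be expected to fail instead of silently falling back to the network.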