NVIDIA / nim-anywhere

Accelerate your Gen AI with NVIDIA NIM and NVIDIA AI Workbench
https://www.nvidia.com/en-us/ai/
Apache License 2.0

This lets you run embed and rerank locally. #32

freemansoft closed 2 months ago

freemansoft commented 2 months ago

I wanted to demonstrate and run everything locally so there was no chance of data leakage.

There is a bit of a hack in here because NGC_API_KEY needs to have two different values in two different places. The rerank and embed services need a non-nvapi- key to pull down their models on startup. The chat front end requires an nvapi- key to talk to the remote NVIDIA endpoints, and chat startup fails if the key doesn't start with nvapi-. So I added a second variable, NGC_API_KEY_2, with a description of where to get the key.
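A minimal sketch of the two-key arrangement described above, written as a startup check. It assumes NGC_API_KEY holds the nvapi- key for the chat front end and NGC_API_KEY_2 holds the non-nvapi- key for embed and rerank. The function name `split_keys` and the component labels are hypothetical and are not taken from the PR.

```python
from typing import Mapping

NVAPI_PREFIX = "nvapi-"


def split_keys(env: Mapping[str, str]) -> dict[str, str]:
    """Return the key each component should use, or raise if a prefix is wrong.

    Assumption: NGC_API_KEY is the nvapi- key (chat front end) and
    NGC_API_KEY_2 is the non-nvapi- key (embed and rerank model pulls).
    This mirrors the workaround in the PR description, not an official rule.
    """
    chat_key = env.get("NGC_API_KEY", "")
    model_pull_key = env.get("NGC_API_KEY_2", "")

    # The chat front end refuses to start without an nvapi- key.
    if not chat_key.startswith(NVAPI_PREFIX):
        raise ValueError("NGC_API_KEY must start with 'nvapi-'")

    # The workaround gives embed and rerank a key without the nvapi- prefix.
    if not model_pull_key or model_pull_key.startswith(NVAPI_PREFIX):
        raise ValueError("NGC_API_KEY_2 must be set to a non-nvapi- key")

    return {
        "chat": chat_key,
        "embed": model_pull_key,
        "rerank": model_pull_key,
    }
```

Calling `split_keys(os.environ)` before launching the services would surface a misplaced key early instead of at chat startup or during the model pull.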

Tested on Ubuntu with a TITAN RTX card.

freemansoft commented 2 months ago

This PR may conflict with https://github.com/NVIDIA/nim-anywhere/pull/31

rmkraus commented 2 months ago

Thanks Joe. The trick here is that the nvapi keys are tied to specific teams in NGC. The old style keys are not specific to a team.

So you'll need to create your nvapi key in the team that you are a member of that has access to the embed and rerank services.

freemansoft commented 2 months ago

Thanks for the reply. I have literally nine different keys I created by poking around. None of them work for everything :-( I'm not sure I'm even in a team, let alone one that has access to the embed and rerank services.

I can remove the extra key in this PR or drop the PR since it overlaps. Note that this PR turns on the health check services the same way they are enabled in the LLM container, or at least I think it does, because the cluster status moves through the right states.

freemansoft commented 2 months ago

> So you'll need to create your nvapi key in the team that you are a member of that has access to the embed and rerank services.

I have no idea what this means or how to do it. I have a team but I don't see any way to assign keys to a team.