Closed freemansoft closed 2 months ago
This PR may conflict with https://github.com/NVIDIA/nim-anywhere/pull/31
Thanks Joe. The trick here is that the nvapi keys are tied to specific teams in NGC. The old style keys are not specific to a team.
So you'll need to create your nvapi key in the team that you are a member of that has access to the embed and rerank services.
Thanks for the reply. I have literally nine different keys I created by poking around. None of them works for everything :-( I'm not sure I'm even in a team, let alone one that has access to the embed and rerank services.
I can remove the extra key in this PR or drop the PR since it overlaps. Note that this PR turns on the health check services the way they are enabled in the LLM container, or at least I think it does, because the cluster status moves through the right states.
> So you'll need to create your nvapi key in the team that you are a member of that has access to the embed and rerank services.
I have no idea what this means or how to do it. I have a team but I don't see any way to assign keys to a team.
I wanted to demonstrate/run everything local so there was no chance of data leakage.
There is a bit of a hack in here because the NGC_API_KEY needs to have two different values in two different places. The rerank and embed services need a non-`nvapi-` key to pull down their models on startup. The chat front end requires an `nvapi-` key to talk to the remote NVIDIA endpoints. Chat startup fails if the key doesn't start with `nvapi-`. So I added a second NGC_API_KEY_2 with a description of where to get the key.
Tested on Ubuntu with a TITAN RTX card.
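For anyone trying to reproduce the two-key setup, a quick pre-flight check along these lines might save a failed startup. This is only a sketch and not part of the PR: the `check_keys` helper is hypothetical, and it assumes that NGC_API_KEY holds the old-style key used by the rerank and embed services and that NGC_API_KEY_2 holds the `nvapi-` key used by the chat front end. The thread does not say which variable carries which value, so swap them if the PR does it the other way around.

```python
import os

NVAPI_PREFIX = "nvapi-"


def check_keys(env):
    """Return a list of problems with the two-key setup (empty if none).

    Assumption (not confirmed by the PR): NGC_API_KEY is the old-style key
    for the rerank/embed model pulls, NGC_API_KEY_2 is the nvapi- key for
    the chat front end.
    """
    problems = []

    model_key = env.get("NGC_API_KEY", "")
    chat_key = env.get("NGC_API_KEY_2", "")

    # The rerank and embed services reportedly need a key WITHOUT the prefix.
    if not model_key:
        problems.append("NGC_API_KEY is not set")
    elif model_key.startswith(NVAPI_PREFIX):
        problems.append("NGC_API_KEY starts with nvapi-; expected an old-style key")

    # The chat front end reportedly refuses to start without the prefix.
    if not chat_key:
        problems.append("NGC_API_KEY_2 is not set")
    elif not chat_key.startswith(NVAPI_PREFIX):
        problems.append("NGC_API_KEY_2 does not start with nvapi-")

    return problems


if __name__ == "__main__":
    # Check the current shell environment and print anything suspicious.
    for problem in check_keys(os.environ):
        print(problem)
```

Running the script in the shell that launches the containers prints one line per suspicious key and nothing when both keys look plausible. It only inspects the prefix; it cannot tell whether a key belongs to a team with access to the embed and rerank services.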