runpod-workers / worker-infinity-embedding

MIT License

Huggingface models requiring authentication #7

Closed axeloh closed 3 months ago

axeloh commented 3 months ago

First of all, thanks for the great work :)

I am deploying RunPod serverless endpoints using the worker-infinity image and querying them with the OpenAI SDK (with the RunPod API key as api_key and the RunPod endpoint as the base_url).
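For reference, the client setup described above might look roughly like the sketch below. The base URL layout (`https://api.runpod.ai/v2/<endpoint id>/openai/v1`) and the helper names `runpod_openai_base_url` and `embed` are assumptions for illustration, not taken from this thread; check the worker's README for the exact route.

```python
def runpod_openai_base_url(endpoint_id: str) -> str:
    """Build the base URL for the endpoint's OpenAI-compatible route.

    The URL layout is an assumption; confirm it against the worker README.
    """
    return f"https://api.runpod.ai/v2/{endpoint_id}/openai/v1"


def embed(texts, model, endpoint_id, api_key):
    """Request embeddings for a list of strings through the OpenAI SDK."""
    # Imported lazily so the URL helper above works without the SDK installed.
    from openai import OpenAI

    # The RunPod API key is passed where the SDK expects an OpenAI key.
    client = OpenAI(api_key=api_key, base_url=runpod_openai_base_url(endpoint_id))
    response = client.embeddings.create(model=model, input=texts)
    return [item.embedding for item in response.data]
```

Calling `embed(["hello world"], "nvidia/NV-Embed-v1", endpoint_id, api_key)` would then return one vector per input string, provided the worker has been able to load the model.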

However, some of the Hugging Face models require authentication, for instance the Nvidia NV-Embed model.

I can see the following from the pod log: Cannot access gated repo for url https://huggingface.co/nvidia/NV-Embed-v1/resolve/main/config.json. Access to model nvidia/NV-Embed-v1 is restricted. You must be authenticated to access it.

Is there any support for overcoming this?

michaelfeil commented 3 months ago

Have you tried using a HF_TOKEN env variable?
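On RunPod this would mean setting HF_TOKEN as an environment variable on the endpoint. As a rough illustration of what the token changes, here is a minimal standard-library sketch that builds (but does not send) an authenticated request for the gated file from the log. It is not how Infinity itself loads the token; the helper name `gated_file_request` and its `env` parameter are hypothetical.

```python
import os
import urllib.request


def gated_file_request(repo_id, filename, revision="main", env=os.environ):
    """Build an authenticated request for a file in a gated Hugging Face repo.

    Hypothetical helper for illustration only; the request is not sent.
    """
    token = env.get("HF_TOKEN")
    if not token:
        # Without a token the Hub rejects requests to gated repos.
        raise RuntimeError("HF_TOKEN is not set; gated repos will be rejected")
    url = f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"
    # The Hub expects the access token as a bearer token.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
```

Passing the request to `urllib.request.urlopen` from inside the container is one way to check that the token is visible there and has been granted access to the model.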

axeloh commented 3 months ago

No, but I could try that. Does Infinity attempt to read it?

axeloh commented 3 months ago

It worked, thanks :)