Closed bazylhorsey closed 1 year ago
I'll transfer this over to huggingface_hub
cc @Wauplin
Hi @bazylhorsey, sorry for the late response, I was off for some time. Looking at the errors, I am not sure that it is a firewall issue, since you are getting a proper response from the server (Repository Not Found for url: https://huggingface.co/arvist/ppe-test/resolve/main/config.json). My first guess would be that this repo is private but you're not sending the access token.
Could you try a few things on your cloud cluster:
1. Run huggingface-cli login to log in. This will save your access token on the disk of the machine, so be careful if it's a shared one.
2. Run huggingface-cli whoami to check that the token is valid. It should print your username.
3. Try downloading the file manually. This is what transformers is doing under the hood:
>>> from huggingface_hub import hf_hub_download
>>> hf_hub_download('arvist/ppe-test', 'config.json')
3.bis. If you don't want to log in on your cloud machine, pass the token directly:
>>> from huggingface_hub import hf_hub_download
>>> hf_hub_download('arvist/ppe-test', 'config.json', token='hf_****')
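To look at the raw server response without going through transformers or huggingface_hub, something like the following standard-library sketch could be used. It builds the same resolve URL that appears in the error message and sends a HEAD request with an optional Bearer token. The helper names (resolve_url, make_request, probe) are hypothetical, and the reading of the status codes is an assumption: an HTTP status such as 401 or 404 would mean the server was reached and the problem is likely the token or the repo name, while a URLError would point at the network instead.

```python
import urllib.error
import urllib.request


def resolve_url(repo_id, filename, revision="main"):
    """Build the file URL in the same shape as the one in the error message."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


def make_request(url, token=None):
    """Prepare a HEAD request, adding the access token only if one is given."""
    req = urllib.request.Request(url, method="HEAD")
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    return req


def probe(repo_id, filename, token=None, timeout=10.0):
    """Return the HTTP status code for the file (hypothetical helper).

    Network-level failures (DNS, blocked outbound traffic, timeouts) are not
    caught here, so they surface as urllib.error.URLError or OSError.
    """
    req = make_request(resolve_url(repo_id, filename), token)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, just not with a success status.
        return err.code
```

Calling probe('arvist/ppe-test', 'config.json', token='hf_****') on both the local machine and the cloud machine and comparing the results should show whether the two environments really send the same credentials.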
If any step fails, please let me know. Also run huggingface-cli env and copy-paste the output in this issue; that could help find out what is going wrong.
Hope this will help, Cheers, Wauplin :hugs:
System Info
transformers = "^4.31.0"
python = "3.10"
Running on Elastic Beanstalk (EC2)
Who can help?
@Narsil
Reproduction
play with firewall and watch behavior
On local it works as expected, and only valid API keys download the repo. We are using a private repo named arvist/scanner-basic, which used to be named arvist/ppe-test. It is strange that the error stack shows arvist/ppe-test as the repo it attempted to load, even though arvist/ppe-test is mentioned nowhere in our current deployment.
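One way to double-check that the old name really is absent from the deployed environment is to search both the deployed files and the environment variables for it. The sketch below is only an illustration: find_mentions and env_mentions are hypothetical helpers, and the directory to scan is whatever the deployment actually ships.

```python
import os
from pathlib import Path


def find_mentions(root, needle, max_bytes=1_000_000):
    """Return sorted paths of files under root whose text contains needle."""
    hits = []
    for path in Path(root).rglob("*"):
        try:
            # Skip directories and very large files such as model weights.
            if not path.is_file() or path.stat().st_size > max_bytes:
                continue
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file, ignore it
        if needle in text:
            hits.append(str(path))
    return sorted(hits)


def env_mentions(needle, environ=None):
    """Return sorted names of environment variables whose value contains needle."""
    env = os.environ if environ is None else environ
    return sorted(name for name, value in env.items() if needle in value)
```

For example, find_mentions('.', 'arvist/ppe-test') run from the application root on the cloud machine, together with env_mentions('arvist/ppe-test'), would list any leftover references in config files or instance settings.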
However, on cloud with correct credentials, we get this error, which is the same as the one we get with the wrong credentials on local:
We think this may be a firewall issue, but there is no documentation of what traffic rules are required. We tried allowing inbound 443 TCP with no change. Is the git port used at all here? It seems that only the requests library is used, which is HTTP-based.
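Since the download goes through requests over HTTPS, the traffic that matters is most likely outbound TCP to port 443 rather than an inbound rule. A minimal sketch for checking that from the cloud machine, using only the standard library (can_connect is a hypothetical helper, and a successful TCP connection does not prove that TLS or authentication will succeed):

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if an outbound TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Running can_connect('huggingface.co') on the instance gives a quick yes/no answer. If it returns True, the firewall is probably not what is blocking the download.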
Expected behavior
Client should be able to download on cloud system, or firewall details should be documented.
Our use case involves downloading Hugging Face repos at server boot and whenever they are needed, followed by inference/predictions.
Local and cloud behavior should be consistent when using the same key in web-based inferencing services.