I previously ran the run_clm.py example and populated the public optimum-neuron-cache with the graphs. When I run subsequent jobs, a handful of smaller graphs are JIT-compiled, but the larger graphs all seem to be present in the cache, and training proceeds quickly.
The job completes successfully, but at the end there are many 'Invalid username or password' errors as Optimum Neuron attempts to push the newly compiled graphs to the Hub cache. This is undesirable and creates a lot of noise. The push should not be attempted unless the user has requested it.
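To illustrate the behavior I would expect, here is a minimal sketch of an opt-in gate around the push. This is not Optimum Neuron's actual code: `should_push_cache`, `push_new_graphs`, the `push_requested` flag, and the `push_fn` callback are all hypothetical names I made up for the example, and I am assuming that a push only makes sense when a token is available.

```python
from typing import Callable, Iterable, List, Optional


def should_push_cache(push_requested: bool, token: Optional[str]) -> bool:
    """Return True only when the user opted in AND a token is available.

    Hypothetical helper for illustration; not part of Optimum Neuron.
    """
    if not push_requested:
        return False
    return bool(token)


def push_new_graphs(
    paths: Iterable[str],
    push_requested: bool = False,
    token: Optional[str] = None,
    push_fn: Callable[[str], None] = print,
) -> List[str]:
    """Push each newly compiled graph, or skip quietly when not requested.

    `push_fn` stands in for whatever actually uploads a file to the Hub.
    """
    if not should_push_cache(push_requested, token):
        # No upload attempt, so no 401 errors in the job log.
        return []
    pushed = []
    for path in paths:
        push_fn(path)
        pushed.append(path)
    return pushed
```

With a default of `push_requested=False`, a job that only reads from the public cache would finish without attempting any upload.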
Error message:
Repository Not Found for url: https://huggingface.co/api/models/aws-neuron/optimum-neuron-cache/preupload/main.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password.
Note: Creating a commit assumes that the repo already exists on the Huggingface Hub. Please use `create_repo` if it's not the case..
Did not push the cached model located at /tmp/tmpb7e4bn_8/neuronxcc-2.11.0.34+c5231f848/MODULE_15768223958212646773+d41d8cd9/model.hlo.pb to the repo named aws-neuron/optimum-neuron-cache because it already exists there. Use overwrite_existing=True if you want to overwrite the cache on the Hub.
Could not push the cached model to the repo aws-neuron/optimum-neuron-cache, most likely due to not having the write permission for this repo. Exact error:
401 Client Error. (Request ID: Root=1-65564fae-7c389aa421abc14a7a66f568;5a5e9ea7-fef5-428a-a88f-063460b1e48d)
Launch command:
Env:
Python packages: