Open Jigyasa3 opened 1 month ago
Hi!
It looks like you're running out of space in `~/.cache/torch/hub`, the torch hub directory. You can check and set your hub directory with `torch.hub.get_dir()` and `torch.hub.set_dir("/path/with/space/hub")` from within Python.
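You can also preview where torch will look without starting Python. This is a sketch that mirrors the documented lookup (`$TORCH_HOME/hub`, falling back to `~/.cache/torch/hub` when `TORCH_HOME` is unset); it does not call torch itself:

```shell
# Default hub location is $TORCH_HOME/hub; if TORCH_HOME is unset,
# torch falls back to ~/.cache/torch/hub.
echo "${TORCH_HOME:-$HOME/.cache/torch}/hub"
```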
If you use `torch.hub.set_dir` and then leave and re-enter Python, you'll notice that the change is not persistent (it resets to the default value). You can also set the directory torch will use from the command line:

```shell
export TORCH_HOME=/path/with/space
```
Note that `torch.hub.set_dir` sets the path to the `hub` directory itself, while `TORCH_HOME` refers to the directory that *contains* `hub`, one level above.
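In other words, torch appends `hub` underneath whatever `TORCH_HOME` points at, so the two paths differ by one level:

```shell
# TORCH_HOME is the parent directory; torch stores downloads in "hub"
# underneath it, so this is what torch.hub.get_dir() would report.
export TORCH_HOME=/path/with/space
hub_dir="$TORCH_HOME/hub"
echo "$hub_dir"   # /path/with/space/hub
```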
Setting environment variables from the command line with `export` is not persistent either, but you can add the line above to your `.bashrc` file to set `TORCH_HOME` every time you open a new shell.
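Appending the line can be done in one command. The sketch below writes to a throwaway file (`./bashrc.demo`) so it is safe to run anywhere; on your machine you would point it at `~/.bashrc` instead:

```shell
# Append the export to a shell startup file so new shells pick it up.
# Using a demo file here; substitute ~/.bashrc for real use.
rc_file=./bashrc.demo
echo 'export TORCH_HOME=/path/with/space' >> "$rc_file"

# Simulate a new shell reading the startup file:
. "$rc_file"
echo "$TORCH_HOME"   # /path/with/space
```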
Alternatively, you can use a more complicated command that sets `TORCH_HOME` when activating the environment:

```shell
eval "$(conda shell.bash activate glm-env) && export TORCH_HOME=/path/with/space"
```
`conda shell.bash activate glm-env` writes a shell script that activates the environment, but does not execute it. You can check the output (the shell script) by running that part on its own. `&&` separates the commands in that shell script from `export TORCH_HOME`, the line from earlier that sets the path to the directory containing `hub` (where model parameters should be downloaded). You could also use `;` instead of `&&`, but `&&` ensures that downstream code isn't executed if something goes wrong with the first part. Finally, `eval` executes the shell code; you can use `echo` in place of `eval` to print the script and inspect the shell code first.

If you use this last method, `TORCH_HOME` will still be set to your custom path after calling `conda deactivate`, but you can append the command to `unset` this variable in the same way as before:

```shell
# deactivate the conda env and remove the environment variable TORCH_HOME
eval "$(conda shell.bash deactivate) && unset TORCH_HOME"
```

At that point your shell should be back to normal (you can verify the variable was unset with `echo $TORCH_HOME`).
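The `eval` versus `echo` distinction above can be illustrated without conda. This is a generic sketch; `DEMO_VAR` is a made-up variable standing in for `TORCH_HOME`:

```shell
cmd='export DEMO_VAR=/path/with/space'

# echo only prints the command text; it does not run it,
# so DEMO_VAR is still unset afterwards
echo "$cmd"

# eval actually executes the string in the current shell
eval "$cmd"
echo "$DEMO_VAR"   # /path/with/space

# unset removes the variable again, just like unsetting TORCH_HOME
# after deactivating the environment
unset DEMO_VAR
```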
I think the options above should allow you to control where model parameters are downloaded and stored, but I haven't fully replicated your issue to verify, so let me know if something goes wrong!
Thank you so much @riveSunder for suggesting an option and explaining it. It works!
Hi, thanks again for a great model! I am running the following example code to generate the `test.esm.embs.pkl` file, for which gLM downloads the `esm2_t33_650M_UR50D.pt` file. But I am running into an `[Errno 122] Disk quota exceeded` error. Is it possible to download the model to a user-defined path which has more storage space?

Code:

```shell
conda activate glm-env
sbatch --partition gpu --gpus 1 --wrap "python /home/jigyasaa/downloads/gLM/data/plm_embed.py /home/jigyasaa/downloads/gLM/data/example_data/inference_example/test.fa /groups/rubin/projects/jigyasa/eCIS/results/gLM_MLmodel/example_data/inference_example/test.esm.embs.pkl"
```

Error: