y-hwang / gLM

Genomic language model predicts protein co-regulation and function
https://www.biorxiv.org/content/10.1101/2023.04.07.536042v3

Model downloading error. Is it possible to download it to a user-defined path? #10

Open: Jigyasa3 opened this issue 1 month ago

Jigyasa3 commented 1 month ago

Hi, thanks again for a great model! I am running the following example code to generate the test.esm.embs.pkl file, for which gLM downloads the esm2_t33_650M_UR50D.pt file. But I am running into an [Errno 122] Disk quota exceeded error. Is it possible to download the model to a user-defined path that has more storage space?

Code:

conda activate glm-env
sbatch --partition gpu --gpus 1 --wrap "python /home/jigyasaa/downloads/gLM/data/plm_embed.py /home/jigyasaa/downloads/gLM/data/example_data/inference_example/test.fa /groups/rubin/projects/jigyasa/eCIS/results/gLM_MLmodel/example_data/inference_example/test.esm.embs.pkl"

Error:

Downloading: "https://dl.fbaipublicfiles.com/fair-esm/models/esm2_t33_650M_UR50D.pt" to /home/jigyasaa/.cache/torch/hub/checkpoints/esm2_t33_650M_UR50D.pt
[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/jigyasaa/.pyenv/versions/3.11.3/lib/python3.11/site-packages/torch/hub.py", line 658, in download_url_to_file
[rank0]:     f.write(buffer)  # type: ignore[possibly-undefined]
[rank0]:     ^^^^^^^^^^^^^^^
[rank0]: OSError: [Errno 122] Disk quota exceeded

[rank0]: During handling of the above exception, another exception occurred:

[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/jigyasaa/downloads/gLM/data/plm_embed.py", line 29, in <module>
[rank0]:     model_data, regression_data = esm.pretrained._download_model_and_regression_data(model_name)
[rank0]:                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/jigyasaa/.pyenv/versions/3.11.3/lib/python3.11/site-packages/esm/pretrained.py", line 54, in _download_model_and_regression_data
[rank0]:     model_data = load_hub_workaround(url)
[rank0]:                  ^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/jigyasaa/.pyenv/versions/3.11.3/lib/python3.11/site-packages/esm/pretrained.py", line 33, in load_hub_workaround
[rank0]:     data = torch.hub.load_state_dict_from_url(url, progress=False, map_location="cpu")
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/jigyasaa/.pyenv/versions/3.11.3/lib/python3.11/site-packages/torch/hub.py", line 765, in load_state_dict_from_url
[rank0]:     download_url_to_file(url, cached_file, hash_prefix, progress=progress)
[rank0]:   File "/home/jigyasaa/.pyenv/versions/3.11.3/lib/python3.11/site-packages/torch/hub.py", line 670, in download_url_to_file
[rank0]:     f.close()
[rank0]: OSError: [Errno 122] Disk quota exceeded
riveSunder commented 1 month ago

Hi!

It looks like you're running out of space in ~/.cache/torch/hub, the torch hub directory. You can check and set your hub directory with torch.hub.get_dir() and torch.hub.set_dir("/path/with/space/hub") from within Python.
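For example (a minimal sketch; /path/with/space/hub is a placeholder for a directory where you have more quota):

import torch

# check where torch hub currently stores downloads (defaults to ~/.cache/torch/hub)
print(torch.hub.get_dir())

# redirect downloads to a directory with more space, for this Python session only
torch.hub.set_dir("/path/with/space/hub")
print(torch.hub.get_dir())  # now prints /path/with/space/hub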

If you use torch.hub.set_dir and then leave and re-enter Python, you'll notice that the change is not persistent (it will reset to the default value). You can also set the directory torch will use from the command line:

export TORCH_HOME=/path/with/space

Note that torch.hub.set_dir sets the path to the hub directory, but TORCH_HOME refers to the directory that contains hub, one level above.
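You can see that relationship from within Python as well (again a sketch, with /path/with/space as a placeholder; setting the variable with os.environ only affects the current process, whereas export affects the shell and every program it launches):

import os
import torch

# TORCH_HOME is the parent directory; torch appends "hub" to it
os.environ["TORCH_HOME"] = "/path/with/space"
print(torch.hub.get_dir())  # /path/with/space/hub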

Setting environment variables from the command line with export is not persistent either, but you can add the line above to your .bashrc file so that TORCH_HOME is set every time you open a new shell.

Alternatively, you can use a more complicated command to set TORCH_HOME when activating the environment.

eval "$(conda shell.bash activate glm-env) && export TORCH_HOME=/path/with/space"

If you use the last method, TORCH_HOME will still be set to your custom path after calling conda deactivate, but you can unset the variable in the same way:

# deactivate the conda env and unset the TORCH_HOME environment variable
eval "$(conda shell.bash deactivate) && unset TORCH_HOME"

At that point your shell should be back to normal (you can verify that the variable was unset with echo $TORCH_HOME).

I think the options above should let you control where model parameters are downloaded and stored, but I haven't fully replicated your issue to verify, so let me know if something goes wrong!

Jigyasa3 commented 3 weeks ago

Thank you so much @riveSunder for suggesting an option and explaining it. It works!