snap-stanford / SATURN

MIT License

FileNotFoundError: [Errno 2] No such file or directory: '/dfs/project/cross-species/yanay/data/proteome/embeddings/Xenopus_tropicalis.Xenopus_tropicalis_v9.1.gene_symbol_to_embedding_ESM2.pt' #9

Closed MohammedZidane closed 1 year ago

MohammedZidane commented 1 year ago

Hi, while running this cell in Train SATURN.ipynb:

!python3 ../../train-saturn.py --in_data=data/frog_zebrafish_run.csv --in_label_col=cell_type --ref_label_col=cell_type --num_macrogenes=2000 --hv_genes=8000 --centroids_init_path=saturn_results/fz_centroids.pkl --score_adata --ct_map_path=data/frog_zebrafish_cell_type_map.csv --work_dir=. --device_num=0

I got this error:

Global seed set to 0
Intel(R) Extension for Scikit-learn* enabled (https://github.com/intel/scikit-learn-intelex)
Using Device 0
Set seed to 0
Traceback (most recent call last):
  File "/home/mohammed/SATURN/Vignettes/frog_zebrafish_embryogenesis/../../train-saturn.py", line 1056, in <module>
    trainer(args)
  File "/home/mohammed/SATURN/Vignettes/frog_zebrafish_embryogenesis/../../train-saturn.py", line 501, in trainer
    adata, species_gene_embeddings = load_gene_embeddings_adata(
  File "/home/mohammed/SATURN/data/gene_embeddings.py", line 95, in load_gene_embeddings_adata
    species_to_gene_symbol_to_embedding = {
  File "/home/mohammed/SATURN/data/gene_embeddings.py", line 98, in <dictcomp>
    for gene_symbol, gene_embedding in torch.load(embedding_path).items()
  File "/home/mohammed/.local/lib/python3.10/site-packages/torch/serialization.py", line 791, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/home/mohammed/.local/lib/python3.10/site-packages/torch/serialization.py", line 271, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/home/mohammed/.local/lib/python3.10/site-packages/torch/serialization.py", line 252, in __init__
    super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: '/dfs/project/cross-species/yanay/data/proteome/embeddings/Xenopus_tropicalis.Xenopus_tropicalis_v9.1.gene_symbol_to_embedding_ESM2.pt'

Note: I did not generate the h5ad files using dataloader.ipynb. I received them directly by email and put them in the right directory, SATURN\Vignettes\frog_zebrafish_embryogenesis\data.

Thanks!

Yanay1 commented 1 year ago

Make sure that you download or create the protein embedding files and modify the path to them in the csv.

In the Train Saturn notebook in the frog and zebrafish folder in Vignettes, this is done in the cell that has the comment:

##### CHANGE THESE PATHS #####
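
The path check described above can be sketched in a few lines of Python. This is only an illustration, not code from the SATURN repository: the helper name `find_missing_embeddings` is made up, and the column name `embedding_path` is an assumption about how the run CSV (the file passed to `--in_data`) stores the embedding locations. Pass a different `column` if your CSV uses another header.

```python
import csv
from pathlib import Path


def find_missing_embeddings(run_csv, column="embedding_path"):
    """Return the paths in `column` of run_csv that are not files on disk.

    `column` defaults to "embedding_path", which is an assumption about
    the run CSV layout, not something confirmed in this thread.
    """
    missing = []
    with open(run_csv, newline="") as handle:
        for row in csv.DictReader(handle):
            path = row[column]
            # torch.load raises FileNotFoundError for exactly these paths.
            if not Path(path).is_file():
                missing.append(path)
    return missing
```

Calling `find_missing_embeddings("data/frog_zebrafish_run.csv")` before launching train-saturn.py should list any embedding file that still points at a location that does not exist on your machine, such as the `/dfs/project/...` path in the traceback.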

MohammedZidane commented 1 year ago

Oops, I did not run the Generate Protein Embeddings notebook. Thanks! But is it possible to download the embeddings directly? What I understood from the instructions is that you either create your own data or regenerate the data used in the paper with the Generate Protein Embeddings notebook. If there is a way to download them, please share it with me, because I do not think I saw a download link in the paper or the instructions. Sorry for the inconvenience, and thanks for your quick replies :)

Yanay1 commented 1 year ago

You can download the protein embeddings from: http://snap.stanford.edu/saturn/data/protein_embeddings.tar.gz

Check out this section: https://github.com/snap-stanford/SATURN#data-availability
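
For anyone who prefers to script the download, here is a minimal Python sketch using only the standard library. The URL is the one given above; the helper names `download` and `extract` and any local file or directory names you pass in are hypothetical, and the internal layout of the archive is not described in this thread, so inspect the returned member names before editing the paths in the notebook.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the reply above.
EMBEDDINGS_URL = "http://snap.stanford.edu/saturn/data/protein_embeddings.tar.gz"


def download(url, dest):
    """Download url to dest unless dest already exists; return dest as a Path."""
    dest = Path(dest)
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive, out_dir):
    """Extract a .tar.gz archive into out_dir and return its member names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python releases support safer extraction filters.
            tar.extractall(out_dir, filter="data")
        else:
            tar.extractall(out_dir)
    return names
```

A typical call would be `extract(download(EMBEDDINGS_URL, "protein_embeddings.tar.gz"), "protein_embeddings")`, after which the embedding paths in the notebook can be pointed at the extracted `.pt` files.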

MohammedZidane commented 1 year ago

Thank you so much for your help, really appreciate it :)

Yanay1 commented 1 year ago

You're welcome! 😄