snap-stanford / UCE

UCE is a zero-shot foundation model for single-cell gene expression data
MIT License

File potentially missing #14

Closed by y-doctor 6 months ago

y-doctor commented 6 months ago

Traceback (most recent call last):
Wrote Shapes Dict
Traceback (most recent call last):
  File "/tscc/nfs/home/ydoctor/datasets/UCE/eval_single_anndata.py", line 155, in <module>
    main(args, accelerator)
  File "/tscc/nfs/home/ydoctor/datasets/UCE/eval_single_anndata.py", line 84, in main
    processor.generate_idxs()
  File "/tscc/nfs/home/ydoctor/datasets/UCE/evaluate.py", line 122, in generate_idxs
    species_to_pe = get_species_to_pe(self.args.protein_embeddings_dir)
  File "/tscc/nfs/home/ydoctor/datasets/UCE/data_proc/data_utils.py", line 263, in get_species_to_pe
    species_to_pe = {
  File "/tscc/nfs/home/ydoctor/datasets/UCE/data_proc/data_utils.py", line 264, in <dictcomp>
    species:torch.load(pe_dir) for species, pe_dir in embeddings_paths.items()
  File "/opt/conda/lib/python3.10/site-packages/torch/serialization.py", line 771, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/opt/conda/lib/python3.10/site-packages/torch/serialization.py", line 270, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/opt/conda/lib/python3.10/site-packages/torch/serialization.py", line 251, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: '/dfs/project/cross-species/yanay/code/uce_code/UCE_public/model_files/protein_embeddings/Gallus_gallus.bGalGal1.mat.broiler.GRCg7b.pep.all.gene_symbol_to_embedding_ESM2.pt'

y-doctor commented 6 months ago

Also, here is the command I was running:

singularity exec --nv /tscc/nfs/home/ydoctor/containers/data_science_box.sif \
    python /tscc/nfs/home/ydoctor/datasets/UCE/eval_single_anndata.py \
    --adata_path /tscc/nfs/home/ydoctor/datasets/CM4AI/Perturb-seq_10_26_23_RNA_Only.h5ad \
    --dir /tscc/nfs/home/ydoctor/datasets/CM4AI/UCE_Outputs/ \
    --species human \
    --model_loc /tscc/nfs/home/ydoctor/datasets/UCE/model_weights/UCE_params_33_layer.torch \
    --batch_size 8 \
    --nlayers 33

If you could point me towards what I might be able to do to fix this that would be great. Thanks so much!

y-doctor commented 6 months ago

I resolved it by commenting out these lines

[Screenshot: Screen Shot 2023-12-21 at 12 01 39 PM]

I think the latest merge was testing support for a new species (chicken), but the ESM embeddings for it are saved locally at a path that only exists on the authors' file system, so loading them breaks for everyone else. You might consider commenting out these lines, along with the 'extra_species' lines above them.
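As an alternative to commenting lines out, a guard could skip species whose embedding file is not on disk. The following is only a minimal sketch: `split_by_availability` is a hypothetical helper that is not part of UCE, and it assumes a `{species: path}` mapping shaped like the `embeddings_paths` dict that appears in the traceback.

```python
from pathlib import Path


def split_by_availability(embeddings_paths):
    """Split a {species: path} mapping into (present, missing) dicts.

    Hypothetical helper, not part of UCE. A species counts as present
    only if its path points to an existing regular file.
    """
    present, missing = {}, {}
    for species, path in embeddings_paths.items():
        target = present if Path(path).is_file() else missing
        target[species] = path
    return present, missing
```

Under that assumption, one would call this before the `torch.load` dict comprehension in `get_species_to_pe`, load only the `present` entries, and log the `missing` ones so the skipped species is visible rather than silently dropped.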

Yanay1 commented 6 months ago

I think you can also just remove the line in the CSV file for protein embeddings. I've removed it on the main branch now. Thanks for pointing this out!
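For anyone on an older checkout, the same cleanup can be sketched as a script. This is an illustration under stated assumptions, not the repository's own tooling: `drop_rows_with_missing_files` is a hypothetical helper, and the CSV's file name and column layout are not given in this thread, so the path column is passed in as a parameter and is assumed to hold full file paths under a header row.

```python
import csv
from pathlib import Path


def drop_rows_with_missing_files(csv_in, csv_out, path_column):
    """Copy csv_in to csv_out, keeping only rows whose file exists.

    Hypothetical helper, not part of UCE. Assumes csv_in has a header
    row and that path_column holds full file paths. Returns the list
    of dropped rows so the caller can report them.
    """
    with open(csv_in, newline="") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames
        rows = list(reader)

    kept = [row for row in rows if Path(row[path_column]).is_file()]
    dropped = [row for row in rows if not Path(row[path_column]).is_file()]

    with open(csv_out, "w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(kept)
    return dropped
```

Writing to a separate output file keeps the original CSV intact, so the result can be inspected before it replaces anything.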