facebookresearch / esm

Evolutionary Scale Modeling (esm): Pretrained language models for proteins
MIT License
3.29k stars, 644 forks

Loading Embeddings error Supervised variant prediction with ESM #614

Open CwilsonBroad opened 1 year ago

CwilsonBroad commented 1 year ago

Hello, I'm working through the supervised variant prediction tutorial before trying my own data, and I'm running into an error at the embedding-loading step.

* code

        ys = []
        Xs = []
        for header, _seq in esm.data.read_fasta(FASTA_PATH):
            scaled_effect = header.split('|')[-1]
            ys.append(float(scaled_effect))
            fn = f'{EMB_PATH}/{header[1:]}.pt'
            embs = torch.load(fn)
            Xs.append(embs['mean_representations'][EMB_LAYER])
        Xs = torch.stack(Xs, dim=0).numpy()
        print(len(ys))
        print(Xs.shape)

* error

        FileNotFoundError                         Traceback (most recent call last)
        in <cell line: 3>()
              5     ys.append(float(scaled_effect))
              6     fn = f'{EMB_PATH}/{header[1:]}.pt'
        ----> 7     embs = torch.load(fn)
              8     Xs.append(embs['mean_representations'][EMB_LAYER])
              9 Xs = torch.stack(Xs, dim=0).numpy()

        2 frames
        /usr/local/lib/python3.10/dist-packages/torch/serialization.py in __init__(self, name, mode)
            250 class _open_file(_opener):
            251     def __init__(self, name, mode):
        --> 252         super().__init__(open(name, mode))
            253
            254     def __exit__(self, *args):

        FileNotFoundError: [Errno 2] No such file or directory: './P62593_reprs/|beta-lactamase_P20P|1.581033423.pt'

smruti241 commented 11 months ago

@CwilsonBroad I am getting the same error. Could you please tell me whether you solved this issue?

@tomsercu, @joshim5, @rmrao, @naailkhan28, @liujas000, @nikitos9000, @ebetica, @chloechsu, @YaoYinYing

smruti241 commented 11 months ago

I also tried with the provided data, but it gave an error:

    FileNotFoundError                         Traceback (most recent call last)
    /raid/home/smrutip/smruti_project/supervised_prediction.ipynb Cell 18 line 7
          5     ys.append(float(scaled_effect))
          6     fn = f'{EMB_PATH}/{header[1:]}.pt'
    ----> 7     embs = torch.load(fn)
          8     Xs.append(embs['mean_representations'][EMB_LAYER])
          9 Xs = torch.stack(Xs, dim=0).numpy()

    File ~/anaconda3/envs/genslm/lib/python3.9/site-packages/torch/serialization.py:771, in load(f, map_location, pickle_module, weights_only, **pickle_load_args)
        768 if 'encoding' not in pickle_load_args.keys():
        769     pickle_load_args['encoding'] = 'utf-8'
    --> 771 with _open_file_like(f, 'rb') as opened_file:
        772     if _is_zipfile(opened_file):
        773         # The zipfile reader is going to advance the current file position.
        774         # If we want to actually tail call to torch.jit.load, we need to
        775         # reset back to the original position.
        776         orig_position = opened_file.tell()

    File ~/anaconda3/envs/genslm/lib/python3.9/site-packages/torch/serialization.py:270, in _open_file_like(name_or_buffer, mode)
        268 def _open_file_like(name_or_buffer, mode):
        269     if _is_path(name_or_buffer):
    --> 270         return _open_file(name_or_buffer, mode)
        271     else:
        272         if 'w' in mode:
    ...
    File ~/anaconda3/envs/genslm/lib/python3.9/site-packages/torch/serialization.py:251, in _open_file.__init__(self, name, mode)
        250 def __init__(self, name, mode):
    --> 251     super(_open_file, self).__init__(open(name, mode))

    FileNotFoundError: [Errno 2] No such file or directory: '/raid/home/smrutip/smruti_project/esm-variants/esm1b_analysis/genes/orf1ab/testing/|beta-lactamase_P20P|1.581033423.pt'
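A possible reading of both tracebacks, based only on the path in the error message: the file name begins with `|`, which is what `header[1:]` would produce if the header returned by `esm.data.read_fasta` no longer carries a leading `>` and the slice removes the first character of the label instead. That is an assumption and is not confirmed anywhere in this thread; the embeddings may also simply not have been extracted into `EMB_PATH`. The sketch below tries each plausible file name and, if none exists, reports which `.pt` files are actually present. The helper names `candidate_labels` and `resolve_embedding_path` are hypothetical and are not part of `esm`.

```python
from pathlib import Path


def candidate_labels(header: str) -> list[str]:
    """Return plausible file stems for a FASTA header, most likely first."""
    stripped = header.lstrip(">")  # header without any leading '>'
    # header[1:] is what the tutorial loop uses; keep it as a last resort.
    candidates = [stripped, header, header[1:]]
    # Remove duplicates while preserving order.
    return list(dict.fromkeys(candidates))


def resolve_embedding_path(emb_dir: str, header: str) -> Path:
    """Find the .pt file for `header` in `emb_dir`, or raise a descriptive error."""
    emb_root = Path(emb_dir)
    for label in candidate_labels(header):
        path = emb_root / f"{label}.pt"
        if path.is_file():
            return path
    # Nothing matched: list a few files that do exist to help spot the mismatch.
    existing = sorted(p.name for p in emb_root.glob("*.pt"))[:5]
    raise FileNotFoundError(
        f"No embedding for header {header!r} in {emb_root}; "
        f"first .pt files found: {existing}"
    )
```

In the loop above, `fn = resolve_embedding_path(EMB_PATH, header)` would take the place of the f-string. If the error message lists no `.pt` files at all, the embedding extraction step probably has not written its output to that directory.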