facebookresearch/esm

Evolutionary Scale Modeling (esm): Pretrained language models for proteins
MIT License

Configuring device in Inverse Folding model #374

Closed. adrienchaton closed this issue 1 year ago.

adrienchaton commented 2 years ago

Hello,

This is related to https://github.com/facebookresearch/esm/pull/317: none of the main functionalities of esm_if1_gvp4_t16_142M_UR50 seem to run on GPU. This applies to model.sample, esm.inverse_folding.util.score_sequence, and esm.inverse_folding.util.get_encoder_output.

import torch
import esm  # esm.inverse_folding additionally requires torch_geometric, torch_sparse, and biotite

# the error message below references cuda:10, hence the device index here
device = torch.device("cuda:10" if torch.cuda.is_available() else "cpu")
model, alphabet = esm.pretrained.esm_if1_gvp4_t16_142M_UR50()
model = model.eval()
model.to(device)

pdb_path = "something.pdb"
structure = esm.inverse_folding.util.load_structure(pdb_path)
coords, native_seq = esm.inverse_folding.util.extract_coords_from_structure(structure)

# with the model on GPU, all three entry points fail:
sampled_seq = model.sample(coords, temperature=1.)
ll_fullseq, ll_withcoord = esm.inverse_folding.util.score_sequence(model, alphabet, coords, native_seq)
rep = esm.inverse_folding.util.get_encoder_output(model, alphabet, coords)
# all 3 trigger "RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:10 and cpu! (when checking argument for argument index in method wrapper__index_select)"

# moving the model back to CPU, the same calls succeed:
model.to(torch.device("cpu"))
sampled_seq = model.sample(coords, temperature=1.)
ll_fullseq, ll_withcoord = esm.inverse_folding.util.score_sequence(model, alphabet, coords, native_seq)
rep = esm.inverse_folding.util.get_encoder_output(model, alphabet, coords)
# no problem

This is a bit unfortunate and should be easy to resolve. Do you plan to fix this? Or would you like a PR? If not, is there a specific reason not to run these methods on GPU? We could do an allclose test to verify that CPU and GPU results are consistent.
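For concreteness, the check I have in mind would look roughly like this. This is a hypothetical sketch: it assumes the device fix is in place (so the GPU call succeeds) and reuses model, alphabet, coords, and native_seq from the snippet above.

import math

# score on CPU
model.to(torch.device("cpu"))
ll_cpu, _ = esm.inverse_folding.util.score_sequence(model, alphabet, coords, native_seq)

# score on GPU (only works once the device handling is fixed)
model.to(torch.device("cuda"))
ll_gpu, _ = esm.inverse_folding.util.score_sequence(model, alphabet, coords, native_seq)

# score_sequence returns plain floats, so a scalar comparison is enough;
# loose tolerance because CPU and GPU kernels differ slightly in floating point
assert math.isclose(ll_cpu, ll_gpu, rel_tol=1e-4, abs_tol=1e-5)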

Thanks!

tomsercu commented 1 year ago

Hi @adrienchaton thanks for pointing that out! We would most definitely welcome a PR to improve this. Also the notebooks could easily incorporate it once it's implemented. Thank you!

naailkhan28 commented 1 year ago

I have my own fork of IF1 with device support in model.sample, util.score_sequence, multichain_util.score_sequence_in_complex, and multichain_util.sample_sequence_in_complex. Happy to get a PR drafted for this tomorrow!
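The basic pattern is just to infer the device from the model's parameters and move the featurized batch there before the forward pass. A rough sketch (not the exact fork code; the helper name is made up):

import torch

def get_model_device(model):
    # all inputs must live on the same device as the model weights
    return next(model.parameters()).device

# inside e.g. model.sample, after CoordBatchConverter featurizes the input:
#   batch_coords, confidence, strs, tokens, padding_mask = batch_converter(batch)
#   device = get_model_device(model)
#   batch_coords, confidence, tokens, padding_mask = (
#       x.to(device) for x in (batch_coords, confidence, tokens, padding_mask))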

adrienchaton commented 1 year ago

@naailkhan28 I'll let you do the PR; let me know if I can give a hand. It would also be good to fix get_encoder_output and get_encoder_output_for_complex at the same time, since the device is not configured there either. A device-aware variant could look like the sketch below.
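This is only a sketch under the assumption that the upstream featurization and encoder call stay as they are (written from memory of util.get_encoder_output); the function name here is hypothetical.

import torch
from esm.inverse_folding.util import CoordBatchConverter

def get_encoder_output_device_aware(model, alphabet, coords):
    # same logic as util.get_encoder_output, but the featurized batch is
    # moved to the device the model weights live on before the encoder call
    device = next(model.parameters()).device
    batch_converter = CoordBatchConverter(alphabet)
    batch_coords, confidence, _, _, padding_mask = batch_converter([(coords, None, None)])
    batch_coords = batch_coords.to(device)
    confidence = confidence.to(device)
    padding_mask = padding_mask.to(device)
    encoder_out = model.encoder.forward(batch_coords, padding_mask, confidence,
                                        return_all_hiddens=False)
    # drop the BOS/EOS positions and the batch dimension, as upstream does
    return encoder_out['encoder_out'][0][1:-1, 0]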

naailkhan28 commented 1 year ago

Hi all, sorry for the delays. PR #386 should address this, adding device support to model.sample and using the model's device in get_encoder_output and the multichain sampling utilities too.
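If that's merged, the snippet from the original post should run unchanged with the model on GPU, e.g. (assuming a visible CUDA device):

model.to(torch.device("cuda"))
sampled_seq = model.sample(coords, temperature=1.)
ll_fullseq, ll_withcoord = esm.inverse_folding.util.score_sequence(model, alphabet, coords, native_seq)
rep = esm.inverse_folding.util.get_encoder_output(model, alphabet, coords)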