Genentech / gReLU

gReLU is a Python library to train, interpret, and apply deep learning models to DNA sequences.
https://genentech.github.io/gReLU/
MIT License

Unexpected device for model. embed_on_dataset #19

Closed: dagarfield closed this issue 4 months ago

dagarfield commented 4 months ago

I'm encountering the following issue:

dataset = grelu.data.dataset.SeqDataset(df.seq.to_list())
embeddings = binary_model.embed_on_dataset(dataset, devices = [0], num_workers=7)

This leads to the following error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[35], line 1
----> 1 embeddings = binary_model.embed_on_dataset(dataset, devices = [0], num_workers=7)

File ~/.conda/envs/gRelu_v1/lib/python3.10/site-packages/grelu/lightning/__init__.py:847, in LightningModel.embed_on_dataset(self, dataset, devices, num_workers, batch_size)
    843     device = device[0]
    844     warnings.warn(
    845         f"embed_on_dataset currently only uses a single GPU: {device}"
    846     )
--> 847 self.to(device)
    849 # Get embeddings
    850 preds = []

File ~/.conda/envs/gRelu_v1/lib/python3.10/site-packages/lightning_fabric/utilities/device_dtype_mixin.py:53, in _DeviceDtypeModuleMixin.to(self, *args, **kwargs)
     51 """See :meth:`torch.nn.Module.to`."""
     52 # this converts `str` device to `torch.device`
---> 53 device, dtype = torch._C._nn._parse_to(*args, **kwargs)[:2]
     54 _update_properties(self, device=device, dtype=dtype)
     55 return super().to(*args, **kwargs)

RuntimeError: Expected one of cpu, cuda, ipu, xpu, mkldnn, opengl, opencl, ideep, hip, ve, fpga, ort, xla, lazy, vulkan, mps, meta, hpu, mtia, privateuseone device type at start of device string: gpu

The snippet runs fine (if slowly) if you remove the reference to a specific device and just let it default to the CPU.

avantikalal commented 4 months ago

For now, you can get around this by supplying devices='cuda:0'

dagarfield commented 4 months ago

No dice -- it gives the same error. Letting it default to 'cpu' (or passing that in for devices) works fine for now (just slow).

avantikalal commented 4 months ago

Resolved by #20
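
For readers hitting the same traceback: the error says the string "gpu" reached torch's device parser, which only accepts the device types it lists (cpu, cuda, mps, and so on). The change made in #20 is not shown in this thread, so the following is only a minimal sketch of the kind of normalization that avoids the problem. It assumes device indices refer to CUDA devices. `normalize_device` is a hypothetical helper and not part of gReLU's API.

```python
from typing import Sequence, Union

# Hypothetical type alias: an index, a device string, or a list of indices.
DeviceSpec = Union[int, str, Sequence[int]]


def normalize_device(devices: DeviceSpec = "cpu") -> str:
    """Hypothetical helper (not gReLU API): turn a device spec into a
    string that torch's device parser accepts, e.g. "cuda:0" or "cpu".
    """
    # A list or tuple of indices: keep only the first, since only a
    # single device is used.
    if isinstance(devices, (list, tuple)):
        if not devices:
            raise ValueError("devices must not be empty")
        devices = devices[0]

    # Assumption: a bare integer index refers to a CUDA device.
    if isinstance(devices, int):
        return f"cuda:{devices}"

    if isinstance(devices, str):
        name = devices.strip().lower()
        # "gpu" is not a torch device type; map it to "cuda".
        if name == "gpu":
            return "cuda"
        if name.startswith("gpu:"):
            return "cuda:" + name[len("gpu:"):]
        return name

    raise TypeError(f"Unsupported device spec: {devices!r}")
```

With a helper like this, `[0]`, `0`, and `'cuda:0'` would all resolve to `"cuda:0"` before being passed to `.to()`, and the default would stay on the CPU.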