GT4SD / gt4sd-core

GT4SD, an open-source library to accelerate hypothesis generation in the scientific discovery process.
https://gt4sd.github.io/gt4sd-core/
MIT License

Issue when running the Jupyter notebook visum-2022-handson-generative-models on GPU #225

Closed itaijj closed 1 year ago

itaijj commented 1 year ago

Describe the bug When calculating toxicity with get_toxicity in the demo notebook visum-2022-handson-generative-models.ipynb, with disable_gpu = False set in the first code snippet, I get an error indicating that some internal data structure inside the paccmann module code was not transferred to the GPU device. I could not work around it from the notebook by moving the molecules or the paccmann model to the device myself, so I am guessing it is a bug.

To Reproduce Steps to reproduce the behaviour:

  1. Go to https://github.com/GT4SD/gt4sd-core/blob/ea9c299cea6ea6c63696b4e9ee49ceb16fb58507/notebooks/visum-2022-handson-generative-models.ipynb
  2. In the first code snippet, set disable_gpu = False
  3. Run all code snippets until the one on "Evaluating the generated molecules"
  4. See error

Expected behavior The run does not crash and toxicity is calculated on the GPU device.

Screenshots

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /projects/msieve_dev3/usr/itaiguez/envs_miniconda/envs/gt4sd/lib/python3.8/site-packages/gt4sd/a │
│ lgorithms/core.py:353 in predict                                                                 │
│                                                                                                  │
│   350 │   │   """                                                                                │
│   351 │   │                                                                                      │
│   352 │   │   try:                                                                               │
│ ❱ 353 │   │   │   predicted = self.predictor(input)                                              │
│   354 │   │   except TimeoutError:                                                               │
│   355 │   │   │   detail = (                                                                     │
│   356 │   │   │   │   f"Predicting took longer than maximum ({self.max_runtime} seconds)."       │
│                                                                                                  │
│ /projects/msieve_dev3/usr/itaiguez/envs_miniconda/envs/gt4sd/lib/python3.8/site-packages/gt4sd/p │
│ roperties/molecules/core.py:815 in informative_model                                             │
│                                                                                                  │
│   812 │   │   # Wrapper to get toxicity-endpoint-level predictions                               │
│   813 │   │   def informative_model(x: SmallMolecule) -> List[PropertyValue]:                    │
│   814 │   │   │   x = to_smiles(x)                                                               │
│ ❱ 815 │   │   │   _ = model(x)                                                                   │
│   816 │   │   │   return model.predictions.detach().tolist()                                     │
│   817 │   │                                                                                      │
│   818 │   │   return informative_model                                                           │
│                                                                                                  │
│ /projects/msieve_dev3/usr/itaiguez/envs_miniconda/envs/gt4sd/lib/python3.8/site-packages/paccman │
│ n_generator/drug_evaluators/tox21.py:56 in __call__                                              │
│                                                                                                  │
│   53 │   │   │   raise TypeError(f'Input must be String, not :{type(smiles)}')                   │
│   54 │   │                                                                                       │
│   55 │   │   smiles_tensor = self.preprocess_smiles(smiles)                                      │
│ ❱ 56 │   │   return self.tox21_score(smiles_tensor)                                              │
│   57 │                                                                                           │
│   58 │   def tox21_score(self, smiles_tensor: torch.Tensor) -> float:                            │
│   59 │   │   """                                                                                 │
│                                                                                                  │
│ /projects/msieve_dev3/usr/itaiguez/envs_miniconda/envs/gt4sd/lib/python3.8/site-packages/paccman │
│ n_generator/drug_evaluators/tox21.py:70 in tox21_score                                           │
│                                                                                                  │
│   67 │   │   """                                                                                 │
│   68 │   │                                                                                       │
│   69 │   │   # Test the compound                                                                 │
│ ❱ 70 │   │   predictions, _ = self.model(smiles_tensor)                                          │
│   71 │   │   # To allow accessing the raw predictions from outside                               │
│   72 │   │   self.predictions = predictions[0, :]                                                │
│   73                                                                                             │
│                                                                                                  │
│ /projects/msieve_dev3/usr/itaiguez/envs_miniconda/envs/gt4sd/lib/python3.8/site-packages/torch/n │
│ n/modules/module.py:1130 in _call_impl                                                           │
│                                                                                                  │
│   1127 │   │   # this function, and just call forward.                                           │
│   1128 │   │   if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o  │
│   1129 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                   │
│ ❱ 1130 │   │   │   return forward_call(*input, **kwargs)                                         │
│   1131 │   │   # Do not call functions when jit is used                                          │
│   1132 │   │   full_backward_hooks, non_full_backward_hooks = [], []                             │
│   1133 │   │   if self._backward_hooks or _global_backward_hooks:                                │
│                                                                                                  │
│ /projects/msieve_dev3/usr/itaiguez/envs_miniconda/envs/gt4sd/lib/python3.8/site-packages/toxsmi/ │
│ models/mca.py:268 in forward                                                                     │
│                                                                                                  │
│   265 │   │   │   prediction_dict includes the prediction and attention weights.                 │
│   266 │   │   """                                                                                │
│   267 │   │                                                                                      │
│ ❱ 268 │   │   embedded_smiles = self.smiles_embedding(smiles.to(dtype=torch.int64))              │
│   269 │   │                                                                                      │
│   270 │   │   # SMILES Convolutions. Unsqueeze has shape batch_size x 1 x T x H.                 │
│   271 │   │   encoded_smiles = [embedded_smiles] + [                                             │
│                                                                                                  │
│ /projects/msieve_dev3/usr/itaiguez/envs_miniconda/envs/gt4sd/lib/python3.8/site-packages/torch/n │
│ n/modules/module.py:1130 in _call_impl                                                           │
│                                                                                                  │
│   1127 │   │   # this function, and just call forward.                                           │
│   1128 │   │   if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o  │
│   1129 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                   │
│ ❱ 1130 │   │   │   return forward_call(*input, **kwargs)                                         │
│   1131 │   │   # Do not call functions when jit is used                                          │
│   1132 │   │   full_backward_hooks, non_full_backward_hooks = [], []                             │
│   1133 │   │   if self._backward_hooks or _global_backward_hooks:                                │
│                                                                                                  │
│ /projects/msieve_dev3/usr/itaiguez/envs_miniconda/envs/gt4sd/lib/python3.8/site-packages/torch/n │
│ n/modules/sparse.py:158 in forward                                                               │
│                                                                                                  │
│   155 │   │   │   │   self.weight[self.padding_idx].fill_(0)                                     │
│   156 │                                                                                          │
│   157 │   def forward(self, input: Tensor) -> Tensor:                                            │
│ ❱ 158 │   │   return F.embedding(                                                                │
│   159 │   │   │   input, self.weight, self.padding_idx, self.max_norm,                           │
│   160 │   │   │   self.norm_type, self.scale_grad_by_freq, self.sparse)                          │
│   161                                                                                            │
│                                                                                                  │
│ /projects/msieve_dev3/usr/itaiguez/envs_miniconda/envs/gt4sd/lib/python3.8/site-packages/torch/n │
│ n/functional.py:2199 in embedding                                                                │
│                                                                                                  │
│   2196 │   │   #   torch.embedding_renorm_                                                       │
│   2197 │   │   # remove once script supports set_grad_enabled                                    │
│   2198 │   │   _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)                    │
│ ❱ 2199 │   return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)        │
│   2200                                                                                           │
│   2201                                                                                           │
│   2202 def embedding_bag(                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when 
checking argument for argument index in method wrapper__index_select)

During handling of the above exception, another exception occurred:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <module>:16                                                                                   │
│                                                                                                  │
│   13 │   solubility.append(get_solubility(mol))                                                  │
│   14 │   synthesizability.append(get_synthesizability(mol))                                      │
│   15 │   molecular_weight.append(get_molecular_weight(mol))                                      │
│ ❱ 16 │   toxicity.append(get_toxicity(molecule))                                                 │
│   17                                                                                             │
│   18 affinity = get_affinity(molecules)                                                          │
│   19                                                                                             │
│                                                                                                  │
│ in <lambda>:11                                                                                   │
│                                                                                                  │
│    8 get_synthesizability = PropertyPredictorRegistry.get_property_predictor('scscore')          │
│    9 get_molecular_weight = PropertyPredictorRegistry.get_property_predictor('molecular_weigh    │
│   10 toxicity_fn = PropertyPredictorRegistry.get_property_predictor('tox21', {'algorithm_vers    │
│ ❱ 11 get_toxicity = lambda x: np.mean(toxicity_fn(x))                                            │
│   12                                                                                             │
│   13 def get_affinity(strings):                                                                  │
│   14 │   l = len(strings)                                                                        │
│                                                                                                  │
│ /projects/msieve_dev3/usr/itaiguez/envs_miniconda/envs/gt4sd/lib/python3.8/site-packages/gt4sd/a │
│ lgorithms/core.py:365 in __call__                                                                │
│                                                                                                  │
│   362 │                                                                                          │
│   363 │   def __call__(self, input: Any) -> Any:                                                 │
│   364 │   │   """Alias for `self.predict`."""                                                    │
│ ❱ 365 │   │   return self.predict(input)                                                         │
│   366                                                                                            │
│   367                                                                                            │
│   368 @dataclass                                                                                 │
│                                                                                                  │
│ /projects/msieve_dev3/usr/itaiguez/envs_miniconda/envs/gt4sd/lib/python3.8/site-packages/gt4sd/a │
│ lgorithms/core.py:360 in predict                                                                 │
│                                                                                                  │
│   357 │   │   │   )                                                                              │
│   358 │   │   │   logger.warning(detail + " Exiting now!")                                       │
│   359 │   │   except Exception:                                                                  │
│ ❱ 360 │   │   │   raise Exception(f"{self.__class__.__name__} failed with {input}")              │
│   361 │   │   return predicted                                                                   │
│   362 │                                                                                          │
│   363 │   def __call__(self, input: Any) -> Any:                                                 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
Exception: Tox21 failed with COC(=O)CNC=C(C(c1ccccc1)C(=O)NCc1cccnc1NS(=O))
Out of the 50 generated molecules, 

jannisborn commented 1 year ago

Hey, thanks for reporting this. Can you confirm that it runs fine when you set disable_gpu=True? I want to make sure there is no other source of error.

itaijj commented 1 year ago

yes, it works fine when disable_gpu=True

jannisborn commented 1 year ago

Hi @itaijj,

The issue should be fixed. If you recreate the conda env, it should work. To fix your current env, please run:

pip uninstall toxsmi
pip uninstall paccmann_predictor
pip install -r vcs_requirements.txt

I pushed fixes to both toxsmi and paccmann_predictor. The problem was that the models were only partly, not fully, moved to the correct device.
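
For readers hitting a similar "Expected all tensors to be on the same device" error, here is a minimal sketch of one common way a PyTorch model ends up only partly on the target device. It is an illustration under assumptions, not the actual toxsmi or paccmann_predictor code or the fix that was pushed: ToyModel, its attributes and the tensor_devices helper are hypothetical names. The sketch relies on the fact that nn.Module.to() moves parameters and registered buffers, while a tensor stored as a plain attribute stays where it was created.

```python
import torch
from torch import nn


class ToyModel(nn.Module):
    """Hypothetical toy module; not the toxsmi or paccmann_predictor code."""

    def __init__(self) -> None:
        super().__init__()
        # Parameter: moved by .to(device).
        self.embedding = nn.Embedding(10, 4)
        # Registered buffer: also moved by .to(device).
        self.register_buffer("offset", torch.zeros(4))
        # Plain tensor attribute: not registered, so .to(device) leaves it behind.
        self.scale = torch.ones(4)


def tensor_devices(module: nn.Module) -> dict:
    """Map each tensor held by `module` to the device type it lives on."""
    devices = {name: p.device.type for name, p in module.named_parameters()}
    devices.update({name: b.device.type for name, b in module.named_buffers()})
    # Unregistered tensors sit directly in the instance __dict__.
    devices.update(
        {
            name: value.device.type
            for name, value in vars(module).items()
            if isinstance(value, torch.Tensor)
        }
    )
    return devices


# Use the GPU when present; otherwise the data-free "meta" device stands in for it,
# so the sketch also runs on a machine without CUDA.
target = torch.device("cuda" if torch.cuda.is_available() else "meta")
model = ToyModel().to(target)
report = tensor_devices(model)
print(report)
```

In the report, embedding.weight and offset show the target device while scale is still on cpu. A forward pass that combines them on a GPU would raise a device mismatch like the one in the traceback above. Registering such tensors as buffers (or parameters) is one way to make .to(device) move the whole model.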