DeepGraphLearning / torchdrug

A powerful and flexible machine learning platform for drug discovery
https://torchdrug.ai/
Apache License 2.0

[Bug] `lazy=True` argument does not work with the `USPTO50k` dataset #120

Open bhadreshpsavani opened 2 years ago

bhadreshpsavani commented 2 years ago
from torchdrug import datasets, data

reaction_dataset = datasets.USPTO50k("~/molecule-datasets/",
                                     lazy=True,
                                     node_feature="center_identification",
                                     kekulize=True, )
synthon_dataset = datasets.USPTO50k("~/molecule-datasets/", as_synthon=True,
                                    node_feature="synthon_completion",
                                    lazy=True,
                                    kekulize=True)

This gives the following error:

TypeError                                 Traceback (most recent call last)
<ipython-input-6-5448f605ee48> in <module>()
      4                                      lazy=True,
      5                                      node_feature="center_identification",
----> 6                                      kekulize=True, )
      7 synthon_dataset = datasets.USPTO50k("~/molecule-datasets/", as_synthon=True,
      8                                     node_feature="synthon_completion",

<decorator-gen-225> in __init__(self, path, as_synthon, verbose, **kwargs)

/usr/local/lib/python3.7/dist-packages/torchdrug/core/core.py in wrapper(init, self, *args, **kwargs)
    286                 config.pop(k)
    287             self._config = dict(config)
--> 288             return init(self, *args, **kwargs)
    289 
    290         def get_function(method):

/usr/local/lib/python3.7/dist-packages/torchdrug/datasets/uspto50k.py in __init__(self, path, as_synthon, verbose, **kwargs)
     61 
     62         self.load_csv(file_name, smiles_field="rxn_smiles", target_fields=self.target_fields, verbose=verbose,
---> 63                       **kwargs)
     64 
     65         if as_synthon:

/usr/local/lib/python3.7/dist-packages/torchdrug/data/dataset.py in load_csv(self, csv_file, smiles_field, target_fields, verbose, **kwargs)
    111                         targets[field].append(value)
    112 
--> 113         self.load_smiles(smiles, targets, verbose=verbose, **kwargs)
    114 
    115     def _standarize_index(self, index, count):

/usr/local/lib/python3.7/dist-packages/torchdrug/data/dataset.py in load_smiles(self, smiles_list, targets, transform, verbose, **kwargs)
    250                     logger.debug("Can't construct molecule from SMILES `%s`. Ignore this sample." % _smiles)
    251                     break
--> 252                 mol = data.Molecule.from_molecule(mol, **kwargs)
    253                 mols.append(mol)
    254             else:

/usr/local/lib/python3.7/dist-packages/torchdrug/utils/decorator.py in wrapper(*args, **kwargs)
    113                     kwargs[value] = kwargs.pop(key)
    114 
--> 115             return func(*args, **kwargs)
    116 
    117         return wrapper

TypeError: from_molecule() got an unexpected keyword argument 'lazy'
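
The frames above suggest that `lazy` is never consumed along the way and is simply forwarded through `**kwargs` until it reaches `from_molecule()`, which rejects it. A minimal sketch of that pattern in plain Python follows. The functions are stand-ins named after the frames in the traceback; they are assumptions for illustration, not torchdrug's actual code.

```python
# Stand-in functions only: they mimic the call chain in the traceback,
# they are NOT the real torchdrug implementations.

def from_molecule(mol, atom_feature="default", kekulize=False):
    """Final callee: it declares no `lazy` keyword."""
    return (mol, atom_feature, kekulize)


def load_smiles(smiles_list, **kwargs):
    # Leftover keywords are forwarded unchanged to the final callee.
    return [from_molecule(smiles, **kwargs) for smiles in smiles_list]


def load_csv(rows, **kwargs):
    # Another pass-through layer, as in the traceback.
    return load_smiles(rows, **kwargs)


def run(**kwargs):
    """Call the chain and return either the result or the error text."""
    try:
        return load_csv(["CCO"], **kwargs)
    except TypeError as err:
        return str(err)


ok = run(kekulize=True)
failed = run(kekulize=True, lazy=True)
print(ok)
print(failed)
```

In this sketch the call without `lazy` succeeds, while the call with `lazy=True` fails only at the innermost function, which is why the error message names `from_molecule()` rather than the dataset class.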

DimGorr commented 2 years ago

Where did you find that code? The version I see at https://torchdrug.ai/docs/tutorials/retrosynthesis.html#prepare-the-dataset is different and does not use the `lazy` argument at all. Also, the code from that link worked fine for me :)

DimGorr commented 2 years ago

It seems this has already been solved: https://github.com/DeepGraphLearning/torchdrug/pull/24

bhadreshpsavani commented 2 years ago

Cool! Actually, I was trying it in Colab and thought I would use this argument. When we inspect the arguments like this:

datasets.USPTO50k?

it shows `lazy` as an optional argument, but passing it raised an error.

bhadreshpsavani commented 2 years ago

Hi @DimGorr, it shows this docstring:

Init signature: datasets.USPTO50k(*args, **kwargs)
Docstring:     
USPTO50k(path, as_synthon=False, verbose=1, transform=None, lazy=False, atom_feature='default', bond_feature='default', mol_feature=None, with_hydrogen=False, kekulize=False)

KiddoZhu commented 2 years ago

Hi! The `lazy` operation isn't implemented for `USPTO50k`. The docstring is generated automatically because the class inherits from `data.MoleculeDataset`.
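
To illustrate how a docstring can advertise a parameter that the code does not handle, here is a generic sketch. The classes `BaseDataset` and `ChildDataset` are hypothetical, and copying `__doc__` by hand is only an assumption standing in for whatever mechanism torchdrug uses, which this thread does not show.

```python
import inspect


class BaseDataset:
    def load(self, items, lazy=False, kekulize=False):
        """load(items, lazy=False, kekulize=False)"""
        return list(items)


class ChildDataset(BaseDataset):
    def load(self, items, **kwargs):
        # Hypothetical override that never implements `lazy`.
        if "lazy" in kwargs:
            raise TypeError("lazy is not supported here")
        return list(items)


# Reuse the parent's documentation, as a documentation helper might do.
ChildDataset.load.__doc__ = BaseDataset.load.__doc__

# The docstring mentions `lazy`, but the override's own signature does not.
advertised = "lazy" in ChildDataset.load.__doc__
declared = "lazy" in inspect.signature(ChildDataset.load).parameters
print(advertised, declared)
```

When a class accepts `**kwargs`, the help text shown by `?` is not proof that a keyword is supported; checking the signature of the function that finally receives the keyword is a more reliable guide.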