snap-stanford / ogb

Benchmark datasets, data loaders, and evaluators for graph machine learning
https://ogb.stanford.edu
MIT License

PCQM4M-LSC AttributeError: 'NoneType' object has no attribute 'GetAtoms' #133

Closed: storyandwine closed this issue 3 years ago

storyandwine commented 3 years ago

Hi,

from ogb.utils import smiles2graph

# If you use PyTorch Geometric (requires torch_geometric to be installed).
# ROOT is the directory where the dataset is stored.
from ogb.lsc import PygPCQM4MDataset
pyg_dataset = PygPCQM4MDataset(root=ROOT, smiles2graph=smiles2graph)

The item that triggers the error has item_id 1868736, ('C123C45C61C24C356', 2.5551490561950008). Calling smiles2graph on its SMILES string fails in the same way:

graph_obj = smiles2graph('C123C45C61C24C356')

At first I thought the downloaded file was corrupted, but the error persisted after several retries. Thank you in advance.
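
To narrow down which molecules trigger this kind of error, you can check what the SMILES parser returns before building the graph. RDKit's `Chem.MolFromSmiles` returns `None` when it cannot parse a string, which matches the `'NoneType' object has no attribute 'GetAtoms'` message in the title, assuming `smiles2graph` parses the string with RDKit. The sketch below is only an illustration: `find_unparsable` is a hypothetical helper, not part of ogb or rdkit, and the two-element SMILES list is made up for the example.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def find_unparsable(
    smiles_iter: Iterable[str],
    parse: Callable[[str], Optional[object]],
) -> List[Tuple[int, str]]:
    """Return (index, smiles) pairs for which `parse` returns None.

    `parse` is any callable that returns None on failure, for example
    rdkit.Chem.MolFromSmiles. This helper is hypothetical, not an ogb API.
    """
    failures = []
    for idx, smiles in enumerate(smiles_iter):
        if parse(smiles) is None:
            failures.append((idx, smiles))
    return failures


try:
    from rdkit import Chem

    # Whether the second string parses may depend on the installed rdkit version.
    print(find_unparsable(["CCO", "C123C45C61C24C356"], Chem.MolFromSmiles))
except ImportError:
    print("rdkit is not installed; pass any parser that returns None on failure")
```

An empty list means every string was parsed; any entries it does return are the ones worth testing against a different rdkit version.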

weihua916 commented 3 years ago

Hi!

It works fine in my environment. My rdkit version is 2019.03.1. Can you try the same (or a newer) rdkit version?

>>> from ogb.utils import smiles2graph
>>> graph_obj = smiles2graph('C123C45C61C24C356')
[08:07:59] WARNING: could not find number of expected rings. Switching to an approximate ring finding algorithm.
>>> graph_obj
{'edge_index': array([[0, 1, 1, 2, 2, 3, 3, 4, 2, 0, 3, 0, 4, 0, 3, 1, 4, 1, 4, 2],
       [1, 0, 2, 1, 3, 2, 4, 3, 0, 2, 0, 3, 0, 4, 1, 3, 1, 4, 2, 4]]), 'edge_feat': array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]]), 'node_feat': array([[5, 0, 4, 5, 0, 0, 2, 0, 0],
       [5, 0, 4, 5, 0, 0, 2, 0, 0],
       [5, 0, 4, 5, 0, 0, 2, 0, 0],
       [5, 0, 4, 5, 0, 0, 2, 0, 0],
       [5, 0, 4, 5, 0, 0, 2, 0, 0]]), 'num_nodes': 5}
storyandwine commented 3 years ago

Thank you for your generous help! Updating rdkit works for me.
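
Since the fix here was an rdkit update, a quick check of the installed version against the one reported to work above (2019.03.1) can save a re-download. This is a minimal sketch: `parse_version` and `is_at_least` are hypothetical helpers written for this example, and treating 2019.03.1 as the minimum is an assumption drawn only from this thread, not an official ogb requirement.

```python
import re

MIN_VERSION = "2019.03.1"  # version reported to work in this thread (assumed minimum)


def parse_version(version: str) -> tuple:
    """Turn a version string such as '2019.03.1' into a tuple of ints.

    Only the leading digits of each dot-separated piece are used, so a
    suffix such as 'b1' is ignored.
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def is_at_least(installed: str, minimum: str = MIN_VERSION) -> bool:
    """Return True if `installed` is the same as or newer than `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


try:
    import rdkit

    installed = rdkit.__version__
    status = "OK" if is_at_least(installed) else "older than " + MIN_VERSION
    print("rdkit", installed, status)
except ImportError:
    print("rdkit is not installed")
```

If the reported version is older, upgrading rdkit and retrying the dataset construction is the step that resolved this issue.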