Closed Matin-Macktoobian closed 2 years ago
You need to install torch, but otherwise you can use ogb with tensorflow
Thanks. I installed torch via pip, but now I get the following file-related error:
Using backend: pytorch
Traceback (most recent call last):
File "~/PycharmProjects/RL/GNN_spektral_OGB.py", line 13, in <module>
dataset = NodePropPredDataset("ogbn-proteins")
File "~\AppData\Local\Programs\Python\Python37\lib\site-packages\ogb\nodeproppred\dataset.py", line 63, in __init__
self.pre_process()
File "~\AppData\Local\Programs\Python\Python37\lib\site-packages\ogb\nodeproppred\dataset.py", line 70, in pre_process
loaded_dict = torch.load(pre_processed_file_path)
File "~\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\serialization.py", line 608, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "~\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\serialization.py", line 777, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
EOFError: Ran out of input
I see. You may need to delete the downloaded dataset folder, then download and preprocess it from scratch.
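For context, "EOFError: Ran out of input" is what Python's pickle raises when it is handed an empty stream, which would be consistent with a cached file that was created but never fully written. A minimal sketch using only the standard library; the helper name `try_unpickle` is made up for illustration, and this does not prove that is what happened here:

```python
import io
import pickle


def try_unpickle(raw_bytes):
    """Return (ok, message) for an attempt to unpickle raw_bytes."""
    try:
        pickle.load(io.BytesIO(raw_bytes))
        return True, "loaded"
    except EOFError as exc:
        return False, f"EOFError: {exc}"


# A zero-byte stream stands in for a cache file that was created but never written.
empty_result = try_unpickle(b"")

# A well-formed pickle loads without complaint.
good_result = try_unpickle(pickle.dumps({"num_nodes": 3}))

print(empty_result)
print(good_result)
```

If the preprocessed file on disk is zero bytes, loading it would be expected to fail the same way.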
Could you please clarify which downloaded folder you mean? If you mean the torch one, I deleted it, but I still get the same error as I reported.
I meant dataset/ogbn_proteins.
That is, the dataset folder downloaded by the ogb package.
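A rough sketch of that cleanup, assuming the folder sits at dataset/ogbn_proteins relative to the working directory. The helper names `find_empty_files` and `reset_dataset_cache` are hypothetical and not part of ogb:

```python
import shutil
from pathlib import Path


def find_empty_files(dataset_dir):
    """List zero-byte files under dataset_dir (a hint that a write was interrupted)."""
    root = Path(dataset_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*") if p.is_file() and p.stat().st_size == 0)


def reset_dataset_cache(dataset_dir):
    """Delete the cached dataset directory; return True if something was removed."""
    root = Path(dataset_dir)
    if not root.is_dir():
        return False
    shutil.rmtree(root)
    return True
```

Calling `find_empty_files("dataset/ogbn_proteins")` first shows whether any cached file is empty, and `reset_dataset_cache("dataset/ogbn_proteins")` removes the folder so the next dataset construction downloads and preprocesses again.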
I just did, but the error persists as before. I then tried a different dataset to check whether the problem is specific to ogbn_proteins. This time I used ogbg-molhiv for a graph-level task, and I got the following error:
Traceback (most recent call last):
File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.5\helpers\pydev\pydevd.py", line 1664, in <module>
main()
File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.5\helpers\pydev\pydevd.py", line 1658, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.5\helpers\pydev\pydevd.py", line 1068, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.5\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "~/PycharmProjects/RL/GNN_spektral_OGB.py", line 27, in <module>
model = GeneralGNN(dataset.labels, activation="softmax")
File "~AppData\Local\Programs\Python\Python37\lib\site-packages\spektral\models\general_gnn.py", line 158, in __init__
activation,
File "~\AppData\Local\Programs\Python\Python37\lib\site-packages\spektral\models\general_gnn.py", line 216, in __init__
self.mlp.add(Dense(hidden if i < layers - 1 else output))
File "~\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\keras\layers\core.py", line 1166, in __init__
self.units = int(units) if not isinstance(units, int) else units
TypeError: only size-1 arrays can be converted to Python scalars
It is raised by the code below:
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.losses import CategoricalCrossentropy
from tensorflow.keras.metrics import categorical_accuracy
from tensorflow.keras.optimizers import Adam
from spektral.data import DisjointLoader
from spektral.models import GeneralGNN
from ogb.graphproppred import GraphPropPredDataset
dataset = GraphPropPredDataset(name="ogbg-molhiv")
split_idx = dataset.get_idx_split()
train_idx, valid_idx, test_idx = split_idx["train"], split_idx["valid"], split_idx["test"]
np.random.seed(0)
batch_size = 16
learning_rate = 0.0001
epochs = 100
loader_tr = DisjointLoader(train_idx, batch_size=batch_size, epochs=epochs)
loader_te = DisjointLoader(test_idx, batch_size=batch_size)
model = GeneralGNN(dataset.labels, activation="softmax")
optimizer = Adam(learning_rate)
loss_fn = CategoricalCrossentropy()
model.compile(loss=loss_fn,
optimizer=optimizer,
metrics=categorical_accuracy)
history = model.fit(loader_tr.load(), steps_per_epoch=loader_te.steps_per_epoch, epochs=epochs)
plt.plot(history.history['loss'])
plt.plot(history.history['categorical_accuracy'])
plt.xlabel('epoch')
plt.legend(["Loss", "Categorical Accuracy"])
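Reading the traceback, the TypeError seems to come from `GeneralGNN(dataset.labels, ...)`: the first argument ends up as the `units` of a Dense layer, and `int(units)` cannot convert a whole label array. A minimal sketch with NumPy only; the `labels` array is a made-up stand-in, assuming the real one has shape (num_graphs, 1), and `as_units` only mimics the conversion line shown in the traceback:

```python
import numpy as np

# Hypothetical stand-in for dataset.labels: one label column per graph.
labels = np.zeros((5, 1), dtype=np.int64)


def as_units(value):
    """Mimic the int(units) conversion from the Dense traceback above."""
    return value if isinstance(value, int) else int(value)


try:
    as_units(labels)
    conversion_failed = False
except TypeError:
    # A multi-element array cannot be turned into a single Python int.
    conversion_failed = True

# The number of label columns is a plain int and converts without trouble.
n_out = labels.shape[-1]
print(conversion_failed, as_units(n_out))
```

Under that assumption, passing the output size (for example `dataset.labels.shape[-1]`) rather than the label array itself should avoid this particular error.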
Interesting. I tried those datasets and they worked fine on my end. I initially thought that your dataset file was corrupted. I do not have a clue.
>>> from ogb.nodeproppred import NodePropPredDataset
>>> dataset = NodePropPredDataset("ogbn-proteins")
Downloading http://snap.stanford.edu/ogb/data/nodeproppred/proteins.zip
Downloaded 0.21 GB: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 216/216 [00:02<00:00, 104.00it/s]
Extracting dataset/proteins.zip
Loading necessary files...
This might take a while.
Processing graphs...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:08<00:00, 8.18s/it]
Saving...
>>> dataset
NodePropPredDataset(1)
>>> from ogb.graphproppred import GraphPropPredDataset
>>>
>>> dataset = GraphPropPredDataset(name="ogbg-molhiv")
Downloading http://snap.stanford.edu/ogb/data/graphproppred/csv_mol_download/hiv.zip
Downloaded 0.00 GB: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 87.36it/s]
Extracting dataset/hiv.zip
Loading necessary files...
This might take a while.
Processing graphs...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 41127/41127 [00:00<00:00, 49756.94it/s]
Saving...
>>> dataset
GraphPropPredDataset(41127)
Below is my environment:
torch==1.9.0
ogb==1.3.1
There is also a recent effort to add OGB datasets to TensorFlow Datasets, see here, but it's far from finished.
Closing this for now. Let us know if it still does not work!
I am working on a large machine learning project that uses various TensorFlow features. So, when using an ogb dataset for a new graph-based module, I cannot switch to torch. I thought your library-agnostic loader provided a way to work with TensorFlow, as one reads here
However, I still get the following error
when I run, for example,
Thus, can you please guide me on how to use ogb datasets with TensorFlow instead of torch?
Thanks, Matin
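For what it's worth, the library-agnostic loaders return each graph as a plain dictionary of NumPy arrays (`edge_index`, `edge_feat`, `node_feat`, `num_nodes`), so the data itself is not tied to torch. A minimal sketch of turning such a dictionary into a node-feature matrix and a dense adjacency matrix; the toy graph and the helper name `graph_dict_to_dense` are invented for illustration, and a dense matrix is only sensible for small graphs:

```python
import numpy as np


def graph_dict_to_dense(graph):
    """Convert an OGB-style graph dict into (node_features, dense_adjacency)."""
    n = graph["num_nodes"]
    adjacency = np.zeros((n, n), dtype=np.float32)
    src, dst = graph["edge_index"]  # edge_index has shape (2, num_edges)
    adjacency[src, dst] = 1.0
    return graph["node_feat"], adjacency


# Hypothetical 3-node path graph 0 - 1 - 2, with both edge directions listed.
toy_graph = {
    "num_nodes": 3,
    "edge_index": np.array([[0, 1, 1, 2], [1, 0, 2, 1]]),
    "edge_feat": None,
    "node_feat": np.ones((3, 2), dtype=np.float32),
}

x, a = graph_dict_to_dense(toy_graph)
print(x.shape, a)
```

The resulting arrays can then be handed to TensorFlow, for instance through `tf.convert_to_tensor`. Per the reply above, torch still has to be installed for the loader itself to run.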