Ability to reuse PYG graph loaded from database to create models

Instantiating a model for a graph algorithm downloads the data from the source database every time. This causes an unnecessary delay when you are testing model variations, e.g., different hyper-parameters. For example, this commonly used code fragment reloads the data from the database each time it is run:

model = SAGE(db, arango_graph, metagraph, embedding_size=64) # define graph embedding model
model._train(model, epochs=10) # train

Imagine you wanted to test whether increasing the embedding size improves model performance (a natural case for hyper-parameter optimization). You would want to import the data once from the data source and then keep reusing the local graph object:

model = {}
for i in range(0, 5):
    # reuse_data is the proposed flag; it does not exist yet
    model[i] = SAGE(db, arango_graph, metagraph, embedding_size=pow(2, i+5), reuse_data=True)
    model[i]._train(model[i], epochs=10) # train
...
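To make the request concrete, here is a minimal sketch of the load-once behaviour being asked for. It is independent of fastgraphml: `GraphCache` and `fake_load` are hypothetical names invented for illustration, and `fake_load` stands in for the expensive export from the database. How a cached graph would actually be passed into `SAGE` is left open, since that is the subject of this request.

```python
from typing import Any, Callable, Dict, Hashable


class GraphCache:
    """Hypothetical helper: memoize an expensive graph loader per key."""

    def __init__(self, loader: Callable[[Hashable], Any]) -> None:
        self._loader = loader
        self._graphs: Dict[Hashable, Any] = {}
        self.load_count = 0  # how many times the real loader ran

    def get(self, key: Hashable) -> Any:
        # Only hit the data source the first time a key is requested.
        if key not in self._graphs:
            self._graphs[key] = self._loader(key)
            self.load_count += 1
        return self._graphs[key]


def fake_load(graph_name: str) -> dict:
    # Stand-in for the database export (assumption, not a library call);
    # returns a placeholder object instead of a real PyG graph.
    return {"name": graph_name, "num_nodes": 4}


cache = GraphCache(fake_load)

# Same embedding sizes as the loop above: 2**5 through 2**9.
embedding_sizes = [2 ** (i + 5) for i in range(5)]

# Every model variation receives the same in-memory graph object.
graphs = [cache.get("my_graph") for _ in embedding_sizes]
```

With this pattern, the loader runs once no matter how many hyper-parameter variations are tried, which is the effect the proposed `reuse_data` flag is meant to achieve.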

arangoml / fastgraphml

Ability to reuse PYG graph loaded from database to create models #8