arangoml / fastgraphml

Given an input graph (ArangoDB or PyG), it generates graph embeddings using a low-code framework built on top of PyG.

Ability to reuse PyG graph loaded from database to create models #8

Open ArthurKeen opened 2 years ago

ArthurKeen commented 2 years ago

Defining or instantiating a model for a graph algorithm downloads the data from the source database every time. This causes an unnecessary delay when you are testing model variations, e.g., different hyper-parameters. For example, this commonly used code fragment reloads the data from the database each time it is run:

    model = SAGE(db, arango_graph, metagraph, embedding_size=64)  # define graph embedding model
    model._train(model, epochs=10)  # train

Imagine you wanted to test whether increasing the embedding size improves model performance (for instance, as part of hyper-parameter optimization). You would want to import the data from the data source once and then keep reusing the local graph object:

    model = {}
    for i in range(0, 5):
        model[i] = SAGE(db, arango_graph, metagraph, embedding_size=pow(2, i + 5), reuse_data=True)
        model[i]._train(model[i], epochs=10)  # train
    ...
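Until something like reuse_data exists, the same effect might be approximated on the caller's side by loading the graph once and handing the cached object to each model. The following is only a minimal sketch of that caching pattern, not fastgraphml's API: GraphCache and fake_load_from_db are hypothetical names, the loader is a stand-in for the real database export, and whether SAGE accepts a pre-loaded graph in this form is an assumption that would need checking against the library.

```python
from typing import Any, Callable, Dict, Hashable


class GraphCache:
    """Hypothetical helper: memoize an expensive graph load by name."""

    def __init__(self, loader: Callable[[Hashable], Any]) -> None:
        self._loader = loader
        self._graphs: Dict[Hashable, Any] = {}
        self.load_count = 0  # how many times the loader actually ran

    def get(self, graph_name: Hashable) -> Any:
        # Only call the loader the first time a graph name is requested.
        if graph_name not in self._graphs:
            self._graphs[graph_name] = self._loader(graph_name)
            self.load_count += 1
        return self._graphs[graph_name]


def fake_load_from_db(graph_name: Hashable) -> Dict[str, Any]:
    # Stand-in for the real (slow) export from the database to a local graph object.
    return {"name": graph_name, "num_nodes": 4, "edges": [(0, 1), (1, 2), (2, 3)]}


cache = GraphCache(fake_load_from_db)
embedding_sizes = [2 ** (i + 5) for i in range(5)]

runs = []
for size in embedding_sizes:
    graph = cache.get("arango_graph")  # loaded on the first iteration, reused afterwards
    # A real experiment would build and train a model from `graph` here.
    runs.append((size, graph["num_nodes"]))
```

With this pattern the loader runs once no matter how many hyper-parameter variations are tried, which is the behaviour the proposed flag asks for.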

sachinsharma9780 commented 1 year ago

This depends on the PyG adapter, so it will be added in a future iteration.