The model definition/instantiation process from graph algorithm will download data from the source database every time you instantiate a model. This causes an unnecessary time delay when you are testing model variations, i.e., variations of the hyper-parameters. for example, this commonly used fragment of code will reload the data from the database if you were to run it multiple times:
model = SAGE(db, arango_graph, metagraph, embedding_size=64) # define graph embedding model
model._train(model, epochs=10) # train
Imagine you wanted to test whether increasing embedding sizes improves model performance (we could use hyper-parameter optimization). You would want it to import the data once from the data source and then keep re-using the local graph object
for i in range(0,5):
model[i] = SAGE(db, arango_graph, metagraph, embedding_size=pow(2, i+5), reuse_data=true))
model[i]._train(model[i], epochs=10) # train
...
The model definition/instantiation process from graph algorithm will download data from the source database every time you instantiate a model. This causes an unnecessary time delay when you are testing model variations, i.e., variations of the hyper-parameters. for example, this commonly used fragment of code will reload the data from the database if you were to run it multiple times:
model = SAGE(db, arango_graph, metagraph, embedding_size=64) # define graph embedding model model._train(model, epochs=10) # train
Imagine you wanted to test whether increasing embedding sizes improves model performance (we could use hyper-parameter optimization). You would want it to import the data once from the data source and then keep re-using the local graph object
for i in range(0,5): model[i] = SAGE(db, arango_graph, metagraph, embedding_size=pow(2, i+5), reuse_data=true)) model[i]._train(model[i], epochs=10) # train ...