cj2001 / neo4j-gds-book

Doc2Vec + graphSAGE + Classifier #4

Open tomasonjo opened 3 years ago

tomasonjo commented 3 years ago

The idea is similar to what they did in the original graphSAGE paper. You start with some articles and the relationships between them. They used Reddit posts connected by "same author commented" relationships. In another example, they used scientific articles and a citation network. Maybe a co-authorship network would also work. We could use the arXiv dataset or maybe the https://www.aminer.org/citation dataset.

The idea is to compute doc embeddings from the text, feed those embeddings into graphSAGE as node features, and finally use the resulting graphSAGE embeddings as input to an ML model.
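To make the middle step concrete, here is a minimal sketch of what a single graphSAGE-style mean aggregation does with doc embeddings as node features. It is a simplification and rests on assumptions: real graphSAGE samples neighbors and applies learned weight matrices and a nonlinearity, none of which is done here, and the article ids, vectors, and citation links are made-up toy data.

```python
from statistics import fmean


def mean_aggregate(features, neighbors):
    """One unweighted, untrained aggregation step:
    new vector = own vector concatenated with the mean of the neighbors' vectors.
    (Assumption: no sampling, no learned weights, unlike real graphSAGE.)"""
    dim = len(next(iter(features.values())))
    out = {}
    for node, vec in features.items():
        nbrs = neighbors.get(node, [])
        if nbrs:
            agg = [fmean(features[n][i] for n in nbrs) for i in range(dim)]
        else:
            # Isolated node: nothing to average, so pad with zeros.
            agg = [0.0] * dim
        out[node] = list(vec) + agg
    return out


# Hypothetical 2-d "doc embeddings" for three articles.
doc_embeddings = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [1.0, 1.0],
}
# Hypothetical citation links, treated as undirected.
citations = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}

node_features = mean_aggregate(doc_embeddings, citations)
print(node_features["a"])  # [1.0, 0.0, 0.5, 1.0]
```

Each output vector mixes an article's own text signal with that of its neighborhood, which is the information the downstream ML model would get on top of the plain doc embedding.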

cj2001 commented 3 years ago

I am curious: what are your thoughts on the benefit of using ML on a knowledge graph via graph embeddings? I guess it depends on what problem we are trying to solve. In #1 we were going to assemble the knowledge graph based on NLP. But what would we be trying to achieve by running graphSAGE on that graph and putting those embeddings into an ML model?

tomasonjo commented 3 years ago

The idea is to improve the ML model's accuracy :) If you want to predict an article's category, for example, you could use doc embeddings in combination with graph embeddings -> something in the style of https://towardsdatascience.com/using-graphsage-embeddings-for-downstream-classification-model-4492e01ae54e
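As a rough illustration of "in combination", one simple option is to concatenate the two vectors per article and train a classifier on the result. The sketch below assumes that approach and uses a tiny nearest-centroid classifier purely as a stand-in for a real model; it is not taken from the linked article, and the article ids, vectors, and category labels are all hypothetical.

```python
import math


def concat(doc_vec, graph_vec):
    # Combine text and graph signal into one feature vector.
    return list(doc_vec) + list(graph_vec)


def fit_centroids(X, y):
    """Average the feature vectors of each class (nearest-centroid training)."""
    sums, counts = {}, {}
    for vec, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, vec):
    # Pick the class whose centroid is closest in Euclidean distance.
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))


# Hypothetical embeddings and categories for four labeled articles.
doc_emb = {"p1": [0.9, 0.1], "p2": [0.8, 0.2], "p3": [0.1, 0.9], "p4": [0.2, 0.8]}
graph_emb = {"p1": [1.0, 0.0], "p2": [0.9, 0.1], "p3": [0.0, 1.0], "p4": [0.1, 0.9]}
category = {"p1": "ml", "p2": "ml", "p3": "db", "p4": "db"}

ids = sorted(category)
X = [concat(doc_emb[i], graph_emb[i]) for i in ids]
y = [category[i] for i in ids]
centroids = fit_centroids(X, y)

# Classify an unseen article from its (hypothetical) doc and graph embeddings.
prediction = predict(centroids, concat([0.7, 0.3], [0.8, 0.2]))
print(prediction)  # ml
```

Whether the graph part actually helps is an empirical question: the usual check is to train the same classifier on doc embeddings alone and compare held-out accuracy.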