Question about graph embedding spaces

Knowledge-Graph-Hub / kg-phenio

A graph for accessing and comparing knowledge concerning phenotypes across species and genetic backgrounds.

BSD 3-Clause "New" or "Revised" License

Question about graph embedding spaces #53

Open matentzn opened 2 years ago

matentzn commented 2 years ago

Hey all, I was working on our RPPR aims and stumbled across the following question. Let's say we have this graph here:

I am assuming now that in KG-embedding space (word and graph embeddings), all three species-specific eyes have the same distance from each other - correct me if this assumption is wrong.

If I were now to layer another graph on top of this one:

Am I correct in assuming that this would result in Monkey and Human eyes being closer in embedding space than Human and Bird eyes? Thank you for your help :)

LucaCappelletti94 commented 2 years ago

Yes, generally speaking that should hold true for most topological embedding methods. We can test this experimentally rather easily, I'll try it out later this afternoon.

LucaCappelletti94 commented 2 years ago

Should I close this issue?

matentzn commented 2 years ago

Did you verify this experimentally? Can you share the cosine similarity scores before and after?

LucaCappelletti94 commented 2 years ago

Ah yes, sorry, my brain skipped a beat. On it!

LucaCappelletti94 commented 2 years ago

Ok, so after having run a wide range of embeddings, I can say that there is no statistical difference.

More specifically:

I have run SkipGram, CBOW, GloVe and TransE from Ensmallen
I have run TransE and BoxE from PyKEEN

In the small original graph, the nodes are all equally far apart. In the second graph, the cosine similarities change, but not consistently: in some runs and embeddings, the bird eye ends up closer than the human or monkey eye. I think it boils down to the fact that the three nodes have the same shortest-path distance between each other, and that influences the embedding more than any other connection can.
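
The shortest-path argument can be checked on a toy version of the two graphs. This is only a sketch in plain Python: the original figures are not reproduced here, so the node names and the taxonomy edges (`human`, `monkey`, `bird`, `primate`, `metazoan`) are assumptions for illustration, not the graphs used in the runs above.

```python
from collections import deque
from itertools import combinations


def shortest_path_length(edges, source, target):
    """Breadth-first search on an undirected graph given as (u, v) pairs."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # target is not reachable from source


EYES = ["human eye", "monkey eye", "bird eye"]

# Assumed first graph: every species-specific eye links to "metazoan eye".
eye_edges = [(eye, "metazoan eye") for eye in EYES]

# Assumed second layer: a small taxonomy attached to the eyes (hypothetical nodes).
taxonomy_edges = [
    ("human eye", "human"), ("monkey eye", "monkey"), ("bird eye", "bird"),
    ("human", "primate"), ("monkey", "primate"),
    ("primate", "metazoan"), ("bird", "metazoan"),
]


def eye_distances(edges):
    """Shortest-path length for every pair of eye nodes."""
    return {
        (a, b): shortest_path_length(edges, a, b)
        for a, b in combinations(EYES, 2)
    }


before = eye_distances(eye_edges)
after = eye_distances(eye_edges + taxonomy_edges)
print(before)
print(after)
```

In this toy setup the layered taxonomy adds longer alternative routes, but the shortest route between any two eyes still runs through "metazoan eye", so the pairwise distances do not change.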

matentzn commented 2 years ago

Ahhhh very unfortunate! :/ I was hoping to be able to trick graph embeddings into being able to recognise taxonomic distance.. THANK you for checking.

caufieldjh commented 2 years ago

What if we embed without the edges between "(species) eye" and "metazoan eye"?
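
As a rough illustration of what that would change topologically, here is a minimal Python sketch. The taxonomy nodes and edges (`human`, `monkey`, `bird`, `primate`, `metazoan`) are hypothetical, since the figures from the issue are not reproduced here, and this only looks at path lengths, not at any actual embedding.

```python
from collections import deque


def distances_from(edges, source):
    """Breadth-first search: hop count from source to every reachable node."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


# Hypothetical taxonomy layer only: the edges between each "(species) eye"
# and "metazoan eye" are deliberately left out.
taxonomy_only_edges = [
    ("human eye", "human"), ("monkey eye", "monkey"), ("bird eye", "bird"),
    ("human", "primate"), ("monkey", "primate"),
    ("primate", "metazoan"), ("bird", "metazoan"),
]

from_human_eye = distances_from(taxonomy_only_edges, "human eye")
human_to_monkey = from_human_eye["monkey eye"]
human_to_bird = from_human_eye["bird eye"]
print(human_to_monkey, human_to_bird)
```

Under these assumptions the monkey eye is one hop closer to the human eye than the bird eye is, because the only remaining routes go through the taxonomy.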

matentzn commented 2 years ago

I think this would be a big hack.. The edges do matter when you have a massive ontology like uPheno.. Maybe they would decrease in importance if the example was somewhat larger..