JungbinLim / GGAE-public

11 stars 2 forks source link

Understanding the Process for encoding dataset without Labels #1

Open rahulmoorthy19 opened 3 months ago

rahulmoorthy19 commented 3 months ago

Firstly, thank you for the great work and for sharing the codebase for the same. I wanted to you your approach for encoding an adjacency matrix of a graph. All the implementations provided assume there are labels available but in my scenario there are no labels available at all as I would like to encode the matrix and decode the same again. So was wondering if the current codebase can be used for the same. It would be really helpful, if you could also let me know on a high level on where the changes needs to be done.

Thank you!

JungbinLim commented 2 months ago

In our work, we assume that we are given a graph that connects the data points with edge values representing some (possibly non-Euclidean) distance or similarity measure. The purpose of our algorithm, Graph-Geometry Preserving Autoencoders (GGAE), is to find a low-dimensional latent representation that preserves the distance along that graph. However, when encoding a dataset without any labels, there is no such graph representing data distances. A reasonable choice in this case is to construct k-nearest neighbor graph using Euclidean distances (which can be calculated without any labels) and then apply GGAE to learn a latent representation that preserves the (local) Euclidean distances between data points. In fact, this is what we did with our Swiss Roll example.