Understanding the Process for encoding dataset without Labels

In our work, we assume that we are given a graph that connects the data points with edge values representing some (possibly non-Euclidean) distance or similarity measure. The purpose of our algorithm, Graph-Geometry Preserving Autoencoders (GGAE), is to find a low-dimensional latent representation that preserves the distance along that graph. However, when encoding a dataset without any labels, there is no such graph representing data distances. A reasonable choice in this case is to construct k-nearest neighbor graph using Euclidean distances (which can be calculated without any labels) and then apply GGAE to learn a latent representation that preserves the (local) Euclidean distances between data points. In fact, this is what we did with our Swiss Roll example.

JungbinLim / GGAE-public

Understanding the Process for encoding dataset without Labels #1