aa-samad closed this issue 1 year ago
Shouldn't be too hard. Iterate through the rows (or columns) of the distance matrix, and for each observation, just find the `k` closest neighbors (excluding self), where `k` is defined using the `qdtsne::perplexity_to_k` function. Then you can just construct a `NeighborList` object and feed that to `Tsne::run`.
Note that this will still apply the Barnes-Hut approximations. Technically, if the dataset is small enough to allow creation of a distance matrix, you could do an exact t-SNE, but this is not supported by the qdtsne library.
Thanks a lot for the comment. In case anyone else wonders what I did:
#include <algorithm> // std::partial_sort
#include <utility>
#include <vector>

int N = 100;
auto Y = qdtsne::initialize_random(N);
int K = tsne_perplexity * 3; // same value as qdtsne::perplexity_to_k gives
// ----- construct the KNN list from the distance matrix
// (similarity_matrix holds pairwise distances, so smaller = closer)
std::vector<std::vector<std::pair<int, double>>> knn_list(N);
std::vector<std::pair<int, double>> neighbors;
for (int i = 0; i < N; i++)
{
    neighbors.clear();
    for (int j = 0; j < N; j++)
    {
        if (i == j)
        {
            continue; // exclude self
        }
        neighbors.emplace_back(j, similarity_matrix[i * N + j]);
    }
    // keep only the K closest neighbors, sorted by increasing distance
    std::partial_sort(neighbors.begin(), neighbors.begin() + K, neighbors.end(),
                      [](const auto& a, const auto& b) { return a.second < b.second; });
    knn_list[i].assign(neighbors.begin(), neighbors.begin() + K);
}
// ------------ solve the low-dim embedding from the KNN list
qdtsne::Tsne tsne;
tsne.set_num_threads(16);
tsne.set_perplexity(tsne_perplexity);
tsne.set_max_iter(tsne_num_iter);
tsne.set_eta(tsne_learning_rate);
tsne.set_exaggeration_factor(tsne_early_exaggeration);
auto ref = tsne.run(knn_list, Y.data());
Hi, thank you for creating this fantastic repo. Can we calculate a t-SNE embedding from a pre-computed distance matrix? If yes, please elaborate on how to do it. Thanks in advance.