lmcinnes / umap

Uniform Manifold Approximation and Projection

How to look at the loss value? #100

Open asanakoy opened 6 years ago

asanakoy commented 6 years ago

Is it possible to print the loss value during the optimization process, or at least after the last epoch?

It would make sense to add this feature, as it would help to set an appropriate number of epochs when tuning the algorithm for specific data.

vseledkin commented 6 years ago

Probably this won't be easy because of negative sampling; it is an endless source of gradient whose magnitude is proportional to the current learning rate.

lmcinnes commented 6 years ago

@vseledkin is correct, the negative sampling is a method used to avoid ever actually computing the full loss value, which is remarkably expensive. I did have a function that could compute the loss, but in practice it only scaled to datasets of a few thousand points, so I never included it.
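For readers landing here later: on small datasets the full cross-entropy can be written down directly from the fitted fuzzy graph and the embedding. Below is a minimal sketch, not the function mentioned above; `umap_cross_entropy` is a hypothetical helper, and the `a`/`b` defaults are roughly UMAP's values for `min_dist=0.1`:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def umap_cross_entropy(graph, embedding, a=1.577, b=0.895, eps=1e-12):
    """Full UMAP cross-entropy loss (hypothetical helper, O(n^2)).

    graph     : dense (n, n) array of high-dimensional membership
                strengths, e.g. ``reducer.graph_.toarray()``.
    embedding : (n, d) low-dimensional coordinates (``reducer.embedding_``).
    a, b      : low-dimensional curve parameters; the defaults here are
                roughly UMAP's values for min_dist=0.1 (an assumption).
    """
    # Low-dimensional membership strengths: w_ij = 1 / (1 + a * d_ij^(2b))
    dists = squareform(pdist(embedding))
    w = 1.0 / (1.0 + a * dists ** (2.0 * b))

    v = np.clip(graph, 0.0, 1.0)
    w = np.clip(w, eps, 1.0 - eps)

    # Attractive term for edges, repulsive term for non-edges; the
    # repulsive (1 - v) part is what makes the full sum O(n^2) and
    # explains why negative sampling is used instead.
    ce = -v * np.log(w) - (1.0 - v) * np.log(1.0 - w)
    np.fill_diagonal(ce, 0.0)  # ignore self-pairs
    return ce.sum() / 2.0      # each unordered pair is counted twice
```

This is exactly the quadratic computation negative sampling avoids, so it only stays practical up to a few thousand points.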

asanakoy commented 6 years ago

@lmcinnes but how can we estimate that the method has converged?

lmcinnes commented 6 years ago

Convergence is not checked; instead the optimization is run for a specified number of epochs.
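In practice that just means choosing `n_epochs` up front. A minimal sketch, using scikit-learn's digits data as a stand-in:

```python
import umap
from sklearn.datasets import load_digits

X, _ = load_digits(return_X_y=True)

# No convergence check is performed; the optimizer simply runs for
# n_epochs. With n_epochs=None, UMAP chooses a default based on
# dataset size (roughly 500 for small data, 200 for large data).
embedding = umap.UMAP(n_epochs=500, random_state=42).fit_transform(X)
```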


asanakoy commented 6 years ago

But how can I make sure that 300 epochs is not better than 200 epochs? If we had the loss function, or at least the gradient norms, then we could infer how many epochs are enough.
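For small data, one workaround in the spirit of this thread is to sweep over epoch counts and check whether the explicit loss has plateaued. A sketch reusing `X` from the example above and the hypothetical `umap_cross_entropy` helper from earlier in the thread:

```python
import umap

# Fit the same data at several epoch counts and compare the explicit
# cross-entropy; graph_ is identical across runs, so the losses are
# directly comparable.
for n in (100, 200, 300):
    reducer = umap.UMAP(n_epochs=n, random_state=42).fit(X)
    loss = umap_cross_entropy(reducer.graph_.toarray(), reducer.embedding_)
    print(f"n_epochs={n}: cross-entropy = {loss:.1f}")
# If the loss barely moves between 200 and 300 epochs, the extra
# epochs are probably not buying much.
```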

sleighsoft commented 5 years ago

Contributions welcome :)

AndLen commented 3 years ago

> @vseledkin is correct, the negative sampling is a method used to avoid ever actually computing the full loss value, which is remarkably expensive. I did have a function that could compute the loss, but in practice it only scaled to datasets of a few thousand points, so I never included it.

I don't suppose you would be willing to share it? I'm looking at applications of UMAP to reasonably small data, where having the loss function explicitly would be very useful.

csinva commented 2 years ago

+1

timsainb commented 2 years ago

The Parametric UMAP submodule has a non-parametric mode that records the loss. See the bottom figure in this notebook.

https://github.com/lmcinnes/umap/blob/master/notebooks/Parametric_UMAP/06.0-nonparametric-umap.ipynb
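Roughly, the usage looks like this; a sketch based on that notebook, where the `_history` attribute is an assumption that may differ across umap-learn versions:

```python
from umap.parametric_umap import ParametricUMAP  # requires tensorflow

# parametric_embedding=False optimizes the embedding coordinates
# directly (non-parametric) while still recording the loss.
embedder = ParametricUMAP(parametric_embedding=False)
embedding = embedder.fit_transform(X)

# Per-step loss values recorded during optimization (attribute name
# taken from the linked notebook; treat it as an assumption).
print(embedder._history["loss"])
```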