lmcinnes / umap

Uniform Manifold Approximation and Projection
BSD 3-Clause "New" or "Revised" License

Can the `Johnson–Lindenstrauss lemma` be a replacement for `Spectral embedding`? #901

Closed igaloly closed 1 year ago

igaloly commented 2 years ago

Given the guarantees of the JL lemma, the increased speed of a random projection, and the fact that an objective of the spectral embedding is to make the reduction deterministic, might it be worth using the JL lemma instead of spectral embedding?
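One thing worth noting about the JL lemma: its distance-preservation guarantee only kicks in at a target dimension on the order of `log(n) / eps²`, which is far above the 2 dimensions used for initialization. A quick way to see this is sklearn's `johnson_lindenstrauss_min_dim` helper (a sketch; the sample count is an arbitrary illustration):

```python
from sklearn.random_projection import johnson_lindenstrauss_min_dim

# Minimum target dimension for which the JL lemma guarantees that
# pairwise distances among 10,000 points are preserved within eps = 0.1.
n_dims = johnson_lindenstrauss_min_dim(n_samples=10_000, eps=0.1)
print(n_dims)  # thousands of dimensions, nowhere near 2
```

So a 2-D random projection gives you the speed of the method but none of the lemma's guarantees.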

jlmelville commented 2 years ago

The purpose of the spectral initialization is a bit different from a random projection. The spectral initialization has the advantage of:

Some possible downsides of using a random projection:

At any rate, it's easy to try this with sklearn:

    import sklearn.random_projection

    # assume your data is in the variable x
    embedder = sklearn.random_projection.SparseRandomProjection(
        n_components=2,
        random_state=42,  # set for reproducibility
    )
    init_coords = embedder.fit_transform(x)

You can then run UMAP with `init=init_coords` to use the random projection output as the initialization. In my experience, it didn't give results that seemed particularly great. Unless speed really was an issue, I would prefer truncated SVD for this kind of initialization: it gave results I preferred the look of, although I did no quantitative evaluation.
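For comparison, the truncated SVD alternative mentioned above might look like this (a sketch; `x` is a synthetic stand-in for your data, and the UMAP call is shown only in a comment):

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

rng = np.random.default_rng(42)
x = rng.normal(size=(500, 50))  # stand-in for your data

# Deterministic-ish 2-D initialization via truncated SVD
svd = TruncatedSVD(n_components=2, random_state=42)
init_coords = svd.fit_transform(x)

# Then, e.g.:
# import umap
# embedding = umap.UMAP(init=init_coords).fit_transform(x)
```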

One way you could use random projections would be if you had very high dimensional data and wanted to reduce the dimensionality initially before applying UMAP. Shah and Silwal looked at this (in the context of t-SNE), but found that PCA did a better job, so truncated SVD would again be preferred for that. Oh well.
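The pre-reduction idea above can be sketched as follows (dimensions are arbitrary illustrations; both reducers are from sklearn):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.default_rng(0)
x_high = rng.normal(size=(200, 5000))  # very high-dimensional data

# Option A: random projection down to an intermediate dimension,
# fast but with no data-dependent structure
rp = GaussianRandomProjection(n_components=100, random_state=0)
x_rp = rp.fit_transform(x_high)

# Option B: PCA (or truncated SVD), which Shah and Silwal found
# did a better job in the t-SNE setting
pca = PCA(n_components=100, random_state=0)
x_pca = pca.fit_transform(x_high)

# Either reduced matrix could then be passed to UMAP in place of x_high.
```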