Looking for most efficient way to run transform()

lmcinnes / umap

Uniform Manifold Approximation and Projection

BSD 3-Clause "New" or "Revised" License

Looking for most efficient way to run transform() #1049

Open Alex-EEE opened 1 year ago

Alex-EEE commented 1 year ago

We're storing the results of umap.fit by pickling the fitted UMAP object. As far as I can see, this saves all of the training ("raw") data, which produces files close to 1 GB for our training set. Is there a way or mode to save something much smaller, such as just the embedding or just the reduced training data, that would still let us call transform() on new samples?
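
To show what dominates the pickle, here is a minimal sketch that pickles each attribute of an object separately and compares the sizes. It uses only the standard library and a hypothetical stand-in class (`StandInModel`, with made-up attribute names `raw_data`, `embedding`, and `n_neighbors`), not the real UMAP object, so the numbers are illustrative only. It says nothing about which attributes transform() actually needs.

```python
import pickle
import random


class StandInModel:
    """Hypothetical stand-in for a fitted reducer; NOT the real UMAP class."""

    def __init__(self, n_samples, n_features, n_components=2):
        rng = random.Random(0)
        # Raw training data kept on the object (assumed attribute name).
        self.raw_data = [[rng.random() for _ in range(n_features)]
                         for _ in range(n_samples)]
        # Low-dimensional embedding: far fewer values per sample.
        self.embedding = [[rng.random() for _ in range(n_components)]
                          for _ in range(n_samples)]
        # A small scalar setting, included for contrast.
        self.n_neighbors = 15


def pickled_sizes(obj):
    """Return {attribute name: pickled size in bytes}, largest first."""
    sizes = {name: len(pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL))
             for name, value in vars(obj).items()}
    return dict(sorted(sizes.items(), key=lambda kv: kv[1], reverse=True))


model = StandInModel(n_samples=1000, n_features=50)
sizes = pickled_sizes(model)
total = len(pickle.dumps(model, protocol=pickle.HIGHEST_PROTOCOL))

for name, size in sizes.items():
    print(f"{name:12s} {size:>10d} bytes")
print(f"{'whole object':12s} {total:>10d} bytes")
```

Running the same `pickled_sizes` helper on the real fitted object should show which attributes account for most of the ~1 GB, assuming its state lives in ordinary instance attributes.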