pavlin-policar / openTSNE

Extensible, parallel implementations of t-SNE
https://opentsne.rtfd.io
BSD 3-Clause "New" or "Revised" License

negative_gradient_method='auto' #179

Closed dkobak closed 3 years ago

dkobak commented 3 years ago

I think we discussed this a few times already without reaching a decision, but people keep asking me why openTSNE is so much slower than sklearn's TSNE, and it usually turns out they have a very small sample size, maybe n ~ 100 or so.

I suggest defaulting to negative_gradient_method='auto' with a very simple heuristic, e.g. n < 1000 uses 'bh' and n >= 1000 uses 'fft'. The optimal threshold depends on the hardware, the number of threads, etc., so 1000 is not going to be optimal, but I think it's still better than always using FFT. What do you think?
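In code, the proposed dispatch would amount to something like this (a sketch only: resolve_negative_gradient_method is a hypothetical helper, not part of openTSNE, and the 1000 cutoff is just the placeholder value from above):

```python
# Hypothetical sketch of the proposed 'auto' dispatch; not openTSNE's
# actual implementation. The 1000 cutoff is a placeholder, since the
# optimal value depends on hardware and thread count.
def resolve_negative_gradient_method(method: str, n_samples: int) -> str:
    if method == "auto":
        return "bh" if n_samples < 1000 else "fft"
    return method  # an explicit 'bh' / 'fft' passes through unchanged
```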

This post is motivated by a message I got just now:

The same code needs ~25s with opentsne but only ~3s with sklearn.

Setting it to BH solved it.
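For anyone hitting this before an 'auto' option lands, the workaround is to request Barnes-Hut explicitly. A minimal sketch (the random data is only for illustration; negative_gradient_method is the existing openTSNE parameter):

```python
import numpy as np
from openTSNE import TSNE

X = np.random.randn(100, 50)  # tiny data set, n ~ 100

# At this sample size Barnes-Hut is much faster than the FFT-based
# interpolation scheme, whose fixed overhead dominates for small n.
embedding = TSNE(negative_gradient_method="bh").fit(X)
```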

pavlin-policar commented 3 years ago

Yeah, I've been thinking of adding this as well in this new round of updates. But I was thinking more along the lines of the 10k mark for the threshold. In my very limited experiments (I just ran the Macosko data set a few times), 10k seemed to be roughly the point where bh takes about as long as fft.

dkobak commented 3 years ago

FWIW here is what I get on my laptop:

[figure: benchmark of bh vs. fft runtime as a function of sample size, for 1 and 8 threads]

So for me the crossing point is around 5000 for 1 thread and around 15000 for 8 threads.

pavlin-policar commented 3 years ago

Heh, 10k seems like a good middle ground then.
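If the option ships as negative_gradient_method='auto' with a 10k cutoff, usage would look like this (a sketch of the proposed, not-yet-merged behaviour):

```python
import numpy as np
from openTSNE import TSNE

X = np.random.randn(5_000, 50)  # illustrative data below the cutoff

# Proposed behaviour: 'auto' picks 'bh' below ~10k samples and
# 'fft' above, so small inputs no longer pay the FFT overhead.
embedding = TSNE(negative_gradient_method="auto").fit(X)
```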

dkobak commented 3 years ago

Sounds good.