scverse / scanpy

Single-cell analysis in Python. Scales to >1M cells.
https://scanpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

turn on the existing `n_iter=` parameter in multicore tsne #1150

Open jfx319 opened 4 years ago

jfx319 commented 4 years ago

t-SNE is an iterative algorithm and takes many iterations to converge, particularly on larger datasets. For example, if MulticoreTSNE is installed, it accepts n_iter=30000 in place of the default n_iter=1000. It would be nice to have this parameter exposed. For larger datasets of, say, 200K cells, 1000 iterations aren't enough for the embedding to converge to its final compact cluster shapes.

Alternatively, is it possible to pass in a kwargs to scanpy tools that wrap other algorithms, so that the advanced user can flexibly look up additional MulticoreTSNE parameters to modify, without needing to exhaustively enumerate all parameters in the scanpy wrapper?
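To make the kwargs idea concrete, here is a minimal sketch of the forwarding pattern. It is not scanpy's actual code: `tsne_wrapper` and `fake_tsne_backend` are hypothetical names, and the backend is a stand-in that only echoes the settings it receives.

```python
def fake_tsne_backend(data, n_components=2, perplexity=30.0, n_iter=1000, **extra):
    """Stand-in for a real t-SNE backend (hypothetical); echoes its settings."""
    return {
        "n_components": n_components,
        "perplexity": perplexity,
        "n_iter": n_iter,
        **extra,
    }


def tsne_wrapper(data, perplexity=30.0, backend=fake_tsne_backend, **backend_kwargs):
    """Hypothetical wrapper: names a few parameters, forwards the rest untouched."""
    params = {"n_components": 2, "perplexity": perplexity}
    # Keywords supplied by the caller override the wrapper's own defaults.
    params.update(backend_kwargs)
    return backend(data, **params)


# n_iter is not a named parameter of the wrapper, yet it reaches the backend.
settings = tsne_wrapper([[0.0, 1.0], [1.0, 0.0]], n_iter=30000)
print(settings["n_iter"])  # 30000
```

With this shape the wrapper documents only the common parameters, and anything else the backend understands can still be passed by an advanced user.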

Finally, it would be even better to support the faster FFT-based t-SNE, which generalizes to millions of cells; the most recent re-implementation is https://github.com/pavlin-policar/openTSNE

In the meantime, one has to run these other tools separately and then overwrite the .X_tsne attribute with their output.

ivirshup commented 4 years ago

Either adding more parameters or passing `**kwargs` through to the underlying t-SNE library sounds reasonable. A PR would be welcome 😄

You also don't have to overwrite the "X_tsne" key: you can write the coordinates to whatever key in `obsm` you want, then call `sc.pl.embedding(adata, basis=key)` with that key.
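As a rough sketch of that workflow, the helper below stores externally computed coordinates under a new key instead of replacing an existing one. `store_embedding` and the key name are hypothetical, the coordinates are made up, and a `SimpleNamespace` stands in for a real AnnData so the snippet runs without scanpy installed; it only assumes the object exposes `.n_obs` and a mapping-like `.obsm`.

```python
from types import SimpleNamespace


def store_embedding(adata, coords, key):
    """Store coords under adata.obsm[key] without touching other embeddings."""
    # One row of coordinates is expected per cell.
    if len(coords) != adata.n_obs:
        raise ValueError(f"expected {adata.n_obs} rows, got {len(coords)}")
    # Refuse to clobber an embedding that is already there.
    if key in adata.obsm:
        raise KeyError(f"{key!r} already exists in obsm; pick another key")
    adata.obsm[key] = coords
    return key


# Stand-in for an AnnData with three cells and an existing default embedding.
adata = SimpleNamespace(n_obs=3, obsm={"X_tsne": [[0, 0], [1, 1], [2, 2]]})
long_run = [[0.1, 0.2], [1.3, 0.9], [2.2, 2.1]]  # made-up coordinates
key = store_embedding(adata, long_run, "X_tsne_n_iter30000")
print(sorted(adata.obsm))  # ['X_tsne', 'X_tsne_n_iter30000']
```

With a real AnnData the coordinates would typically be a NumPy array of shape (n_obs, 2), and the returned key is what gets passed as `basis` to `sc.pl.embedding`.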