This is the repository for the Python part of the conda Python package that runs the results of automated spike sorting algorithms through the t-SNE algorithm to obtain a 2D or 3D embedding of the spikes.
Although this package offers some functionality dedicated to spike sorting, the t-SNE part is kept separate and can be run on any samples x features matrix.
The package is split into two parts. The python part (in this repo) has the following functionality:
The C++ part (which generates the Barnes_Hut.exe executable) can be found here.
First, the main function calls the GPU part, which is implemented in Python with numba. It uses the GPU to do two things: it calculates the distance from every sample (spike) to every other sample, and it then sorts each sample's distances in ascending order and keeps the 3 * perplexity nearest neighbours. These results, together with the t-SNE parameters specified by the user, are then saved to disk (in the data.dat file).
Second, the algorithm calls Barnes_Hut.exe, which operates on this file and runs the t-SNE loop (as described by van der Maaten). The results are finally saved to disk.
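The neighbour search in the first step can be illustrated on the CPU with plain NumPy. This is only a minimal sketch of the idea, not the numba GPU code in this repo: the function name `nearest_neighbours` is hypothetical, and the use of Euclidean distance and the exclusion of each sample from its own neighbour list are assumptions.

```python
import numpy as np


def nearest_neighbours(samples, perplexity):
    """Return indices and distances of the 3 * perplexity nearest neighbours of every sample.

    Hypothetical helper for illustration; the metric (Euclidean) is an assumption.
    """
    samples = np.asarray(samples, dtype=np.float64)
    n_samples = samples.shape[0]
    k = min(int(3 * perplexity), n_samples - 1)

    # Squared Euclidean distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    squared_norms = np.sum(samples ** 2, axis=1)
    distances = squared_norms[:, None] + squared_norms[None, :] - 2.0 * samples @ samples.T
    np.maximum(distances, 0.0, out=distances)  # clip tiny negatives caused by round-off

    # Assumption: a sample is not counted as its own neighbour
    np.fill_diagonal(distances, np.inf)

    # Sort each row in ascending order and keep the first k columns
    indices = np.argsort(distances, axis=1)[:, :k]
    nearest = np.take_along_axis(distances, indices, axis=1)
    return indices, np.sqrt(nearest)


# Usage on a random samples x features matrix (200 samples, 10 features)
rng = np.random.default_rng(0)
spikes = rng.normal(size=(200, 10))
indices, distances = nearest_neighbours(spikes, perplexity=10)
print(indices.shape)  # (200, 30)
```

This sketch builds the full samples x samples distance matrix in memory, so it is only practical for small inputs; it is meant to show what is computed, not how the GPU part scales to large datasets.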
For a description of the parameters of the t-SNE algorithm, see van der Maaten's repository and the scikit-learn implementation.
This t-SNE implementation is significantly faster, and handles larger inputs, than the standard C++ code or the scikit-learn implementation. It can operate on datasets of more than 1 million samples. On a Titan X GPU and an i7 CPU, it solves a 1M-sample dataset for 2000 iterations (with theta = 0.4) in about 3 days.
The required RAM imposes no limitation, but increasing the sample size increases the solution time linearly for the GPU part and superlinearly (though not exponentially) for the CPU / Barnes-Hut part.
More detailed documentation can be found in the GitHub Pages of this repo.