SWAN (Sequential Waveform Analyzer) is an open-source graphical tool being developed at the Institute of Neuroscience and Medicine - 6 (INM-6) of the Forschungszentrum Jülich, for tracking single units across multiple sessions of electrophysiological data that was recorded using chronically implanted microelectrode arrays. The tool is written in Python 3 and PyQt5, and makes use of efficient libraries for plotting (pyqtgraph), number crunching (numpy), data I/O and storage (pickle, neo). The Graphical User Interface (GUI) is modularly organized into multiple views, each showing a different aspect of the loaded datasets.
The tool is currently used to analyze reach-to-grasp datasets recorded using the Utah Microelectrode Array. Thus, in its current state, it works with datasets stored in the proprietary .nev and .ns6 file formats provided by Blackrock Microsystems. However, the tool makes use of neo's file I/O capabilities and data storage structures, making it easy to extend its functionality to datasets recorded using other systems.
In extracellular electrophysiological experiments, the population activity of neurons is obtained in the form of time-ordered voltage measurements recorded using probes inserted into brain tissue. From this population activity, the activity of putative single neurons (or, single units) can be isolated by the process of spike sorting.
In chronic electrophysiological experiments recorded using microelectrode arrays, the implanted electrodes of the array remain in a fixed position over long periods of time, enabling the long-term measurement of the population of neurons surrounding the electrodes in the brain tissue. Typically, each experimental session is spike sorted independently, resulting in a large number of single units over the course of the experiment. However, given that the electrodes are recording chronically from the same population of neurons, one might expect many of these seemingly independent single units to arise from the activity of the same neuron in different experimental sessions.
A single unit consists of the time stamps when the unit fired, termed the spike times, and the corresponding snippets of the voltage signal showing the characteristic response of the neuron, termed the waveforms. Together, these can be used to compare and identify single units which might correspond to the same neuron. This is also where SWAN gets its name from - it helps with the sequential analysis of spike waveforms.
SWAN is designed to help you identify such single units across multiple spike sorted datasets. It extracts different features of neurons across datasets -- namely, the mean waveform of the single unit, the inter-spike interval histogram of the corresponding spike train, and the event-triggered rate profiles -- and displays them in an intuitive fashion. It then allows you to assign global unit IDs (GIDs) to each single unit so that all units that putatively arise from the same neuron can be identified by one GID.
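As a rough illustration of two of the features mentioned above, the following sketch computes a mean waveform and an inter-spike interval histogram using only the standard library. The function names, the bin settings and the toy data are assumptions made for this example; this is not SWAN's own implementation, which relies on numpy and neo.

```python
from statistics import mean


def mean_waveform(waveforms):
    """Average equally long waveform snippets sample by sample."""
    return [mean(samples) for samples in zip(*waveforms)]


def isi_histogram(spike_times, bin_width, max_isi):
    """Count inter-spike intervals in fixed-width bins up to max_isi."""
    n_bins = int(round(max_isi / bin_width))
    counts = [0] * n_bins
    ordered = sorted(spike_times)
    for earlier, later in zip(ordered, ordered[1:]):
        index = int((later - earlier) // bin_width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical toy data: two 3-sample waveforms and five spike times (seconds).
waveforms = [[0.0, -2.0, 1.0], [0.0, -4.0, 3.0]]
spike_times = [0.0, 0.5, 1.5, 2.0, 4.0]

unit_mean = mean_waveform(waveforms)
unit_isi = isi_histogram(spike_times, bin_width=1.0, max_isi=3.0)
```

Units whose mean waveforms and histograms look alike across sessions are candidates for sharing a GID.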
SWAN currently works with .nev files recorded using equipment from Blackrock Microsystems. The spike sorting results of all desired sessions must be stored in separate .nev files, with each file containing the time stamps and waveforms of each single unit. Each data file is read in using the BlackrockIO class of Neo, thereby preserving any existing names or descriptions given to units during the process of spike sorting.
In the current implementation, all units with the words "noise" or "unclassified" in the description are not loaded and are excluded from the analysis. Depending on the use case, this behavior could be modified in future releases.
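The exclusion rule described above can be sketched as a simple keyword filter. The helper name, the case-insensitive matching and the example descriptions are assumptions made for illustration; SWAN's actual check may differ in detail.

```python
# Keywords that mark a unit as not worth loading (spelling assumed here).
EXCLUDED_WORDS = ("noise", "unclassified")


def is_loadable(description):
    """Return True if the unit description contains none of the keywords."""
    text = (description or "").lower()
    return not any(word in text for word in EXCLUDED_WORDS)


# Hypothetical unit descriptions as they might come out of spike sorting.
descriptions = {
    "unit 0": "unclassified events",
    "unit 1": "single unit, good isolation",
    "unit 2": "Noise cluster",
    "unit 3": None,
}

kept = [name for name, text in descriptions.items() if is_loadable(text)]
```

Here only "unit 1" and "unit 3" would be passed on to the analysis.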
SWAN is currently not available on either conda or PyPI, and must be installed from the source. A PyPI package is in the works.
We recommend installing SWAN in a conda or virtualenv environment. The installation itself is carried out from the source directory using pip.
conda create -n swan python=3
conda activate swan
pip install python-swan
virtualenv venv
source venv/bin/activate
pip install python-swan
Once installed, start SWAN using
swan /path/to/temp/
where /path/to/temp/ is where you want to store the cache files generated while using SWAN. When no argument is provided, it defaults to the output of tempfile.gettempdir() from Python's tempfile module. Note: on a Windows machine, the path argument should look something like C:\path\to\temp.
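The fallback to the system temporary directory can be pictured with a small argument parser. This is a minimal sketch and not SWAN's actual command-line handling; the function name parse_cache_dir and the argument name are hypothetical.

```python
import argparse
import tempfile


def parse_cache_dir(argv):
    """Return the cache directory given on the command line, or the system default."""
    parser = argparse.ArgumentParser(prog="swan")
    parser.add_argument(
        "path",
        nargs="?",  # the argument is optional
        default=tempfile.gettempdir(),
        help="directory for cache files",
    )
    return parser.parse_args(argv).path


print(parse_cache_dir([]))                  # system temporary directory
print(parse_cache_dir(["/path/to/temp/"]))  # explicit directory
```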
To begin using SWAN, click on File -> New Project... or click on the "New Project" icon in the toolbar. This opens a dialog for selecting the data files to load.
Use the "Browse..." button to choose the path of the folder where your data files are stored. All loadable files will be shown in the list on the left by their filenames (without the extension). Select the ones you wish to load and click "Add" to move them to the list on the right. Data files from multiple locations can be loaded using this method. Once all required files have been selected, click on "OK" to load all datasets. By default, the first channel of all selected datasets will be loaded. A temporary project file will be created in the temporary directory specified when launching SWAN.
At any point in time, you can save any changes you have made to a project by clicking File -> Save Project As... or clicking the icon in the toolbar, and then choosing a location and name for the project file. If you want to save over the current project file, click on File -> Save Project or the icon in the toolbar.
If you wish to continue working on a previously saved project, click on File -> Load Project... or the icon in the toolbar, and choose the .txt file which corresponds to your desired project. Keep in mind that the corresponding .vum file must also exist in the same location as the .txt file.
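Before loading a project, it can be handy to check that the companion file is in place. The sketch below assumes that the .vum file shares its base name with the .txt file, which the text above does not state explicitly; the helper name is hypothetical.

```python
from pathlib import Path


def missing_companion(project_txt):
    """Return the expected .vum path if it is missing, else None.

    Assumption: the .vum file has the same base name as the .txt file.
    """
    vum = Path(project_txt).with_suffix(".vum")
    return None if vum.exists() else vum


# Example with a hypothetical project path.
problem = missing_companion("/path/to/project.txt")
if problem is not None:
    print(f"Expected companion file not found: {problem}")
```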
The interface of SWAN is organized into five dockable widgets, called views, each showing a specific aspect of the loaded data.
To determine which curves or points in the different views correspond to which unit in the Plot Grid, click either on the cell of the unit, on any of its curves (in the Mean Waveforms View, ISI Histograms View or Rate Profiles View), or on the point corresponding to its mean waveform in the PCA View. This highlights the curves and points corresponding to that unit across all views. Clicking again removes the highlight.
The most fundamental functionality of SWAN lies in its ability to reassign GIDs to the single units shown in the Plot Grid. This is achieved by swapping the contents of two cells in the grid. After selecting one cell in a certain column, a second cell in the same column can be selected. Once two cells are selected, their contents can be swapped by clicking on Edit -> Swap or the icon in the toolbar. If both cells contain a unit, then the GIDs of the units are swapped (along with their colours). If one of the cells is empty, then the unit contained in the other cell is assigned to the empty cell, thereby assigning it a new GID.
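The two swap cases can be modelled with a plain list per session, where the list index plays the role of the GID and None marks an empty cell. This data layout and the function name are assumptions chosen for illustration, not SWAN's internal representation.

```python
def swap_cells(column, row_a, row_b):
    """Swap two cells of one session column in place; None marks an empty cell."""
    column[row_a], column[row_b] = column[row_b], column[row_a]


# One session (column) of the grid; the index of each entry is its GID.
session = ["unit A", "unit B", None]

swap_cells(session, 0, 1)  # both cells occupied: the two units exchange GIDs
swap_cells(session, 1, 2)  # one cell empty: "unit A" moves to the free GID 2
```

Because both cells always belong to the same column, a swap can never place two units of one session under the same GID.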
This process can be repeated until the desired mapping of the original units to the GIDs is achieved. The goal is to assign one GID to all units which putatively represent the same neuron.
At some point while using SWAN, you might encounter the need to "turn off" certain global unit IDs (rows), or certain sessions (columns), so as to be able to visualize specific units better. This is possible by clicking the first cell in the row or column which needs to be deactivated. This will grey out that row/column, and all single units on that row or column will no longer be visible in any of the other views. To re-activate the row/column, click the first cell in the row/column again.
Note: the PCA view uses the session with the largest number of units to calculate the principal components. By deactivating certain units, you might change the session chosen to calculate these principal components and the PCA view may not remain the same.
Although manually swapping units is a simple and intuitive way of arriving at the final mapping of units to GIDs, it can get tedious and time-consuming when a large number of sessions are loaded. SWAN can automate this process with the help of algorithms. Two algorithms have been implemented, accessible via Edit -> Recalculate Mapping..., which opens a dialog offering a choice between the following two algorithms.
Euclidean-distance based waveform comparison (Old Implementation) - the Euclidean distance between the mean waveforms of all units of each pair of consecutive sessions is calculated, and those units with highly similar waveforms are assigned the same GID. This process is repeated for each pair of consecutive sessions. This method is provided as a legacy option; it is no longer maintained.
K-means++ clustering in high-dimensional feature space (SWAN Implementation) - a customizable set of features is used to build a feature vector for each unit. These units are then clustered in a high-dimensional feature space using the K-means++ algorithm. The expected number of clusters must be provided as input before clustering. Since it is not possible to assign the same GID to two units from the same session, the algorithm resolves such conflicts after the clustering step by assigning fresh GIDs to conflicting units. This is the recommended algorithm for SWAN.
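The conflict-resolution step that follows the clustering can be sketched as below. The clustering itself is left out: the cluster labels are simply given as input. Which of two conflicting units keeps the original label is not specified above, so this sketch assumes that the first unit encountered keeps it; the function name and data layout are hypothetical.

```python
def resolve_conflicts(units, labels):
    """Turn cluster labels into GIDs that are unique within each session.

    units  -- list of (session, unit_name) pairs
    labels -- cluster label of each unit, in the same order
    Assumption: the first unit seen keeps the label; later conflicting
    units of the same session receive a fresh, previously unused GID.
    """
    next_gid = max(labels) + 1
    taken = set()  # (session, gid) pairs that are already occupied
    mapping = {}
    for (session, unit), label in zip(units, labels):
        gid = label
        if (session, gid) in taken:
            gid = next_gid
            next_gid += 1
        taken.add((session, gid))
        mapping[(session, unit)] = gid
    return mapping


# Hypothetical clustering result: both units of session "s1" landed in cluster 0.
units = [("s1", "a"), ("s1", "b"), ("s2", "a"), ("s2", "b")]
labels = [0, 0, 0, 1]
mapping = resolve_conflicts(units, labels)
```

In this toy case the second unit of session "s1" is moved to the fresh GID 2, while all other units keep their cluster label as GID.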
Once the final mapping of units to GIDs is obtained, it can be exported to either CSV or odML format. Both options can be found under the File menu.
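To give an idea of what a unit-to-GID mapping looks like as CSV, here is a minimal sketch using Python's csv module. The column layout (session, unit, gid) and the function name are assumptions; the file written by SWAN's own export may be structured differently.

```python
import csv
import io


def export_mapping(mapping, stream):
    """Write a {(session, unit): gid} mapping as CSV rows to a text stream."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["session", "unit", "gid"])  # hypothetical header
    for (session, unit), gid in sorted(mapping.items()):
        writer.writerow([session, unit, gid])


# Write to an in-memory buffer; a file opened with newline="" works the same way.
buffer = io.StringIO()
export_mapping({("s1", "a"): 0, ("s2", "a"): 0, ("s1", "b"): 1}, buffer)
csv_text = buffer.getvalue()
```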
Please feel free to file any bug reports or submit pull requests for SWAN. We're also happy to hear about suggestions and feature requests.
SWAN is currently being maintained by Shashwat Sridhar. The original version of SWAN was developed by Christoph Gollan.