Info / requests for phy-like features

jonahpearl commented 1 month ago

Hi Jeremy — this is a great project, and I really appreciate the close integration with spikeinterface, which I've been trying out for a little bit now. I do most of my spike sorting on a computing cluster, so using sortingview instead of phy would be a big time saver! I played with it a bit this morning, and have a few questions regarding some phy-like features that I was looking for but didn't find. I'm wondering if 1) I've missed them, 2) there's a way to get them but the default SI implementation doesn't have them, or 3) they aren't in the app yet. If 3), then consider this a feature request to make sortingview a fully functioning replacement for phy :)

(In order of subjectively determined importance...)

1. When I click on a unit in the unit table, the waveform view doesn't update. All the other views do seem to update.
2. Are there keyboard shortcuts? E.g. for labeling a unit during curation, moving to the next unit on the list, etc.
3. Is there a way to split units? So far, all I see is the option to merge. [Edit: found #232. Why are there no plans for this? Needing to split units happens all the time, e.g. see screenshot]
   - Ideally, this would be possible in the amplitude view, as well as in a PC view (side note: is there a PC view?)
   - In the amplitude view, I can select spikes across time (if I shift-click, I can draw a box on the axes), but I can't seem to select specific amplitude ranges / draw arbitrary boxes as you can in Phy. This feature would probably be important for splitting as in Phy.
4. Some of the colors used in the waveform plotting are very close in luminance to the gray of the std shading, which makes it unnecessarily hard to see the waveforms (see screenshot 2 at bottom).
5. [This is partly a spikeinterface question] I'd like to be able to have all the typical quality metrics in the unit table (snr, amplitude, isi violation ratio, etc), and at least be able to sort the rows by those, if not filter like in Phy. The default SI implementation doesn't seem to add any quality metrics, and neither of its args unit_table_properties or label_choices seems to do the job. I think I see how to add these in SI, and will work on a PR, but filtering by those like in Phy seems like a sortingview question.

So to summarize, if I could 1) quickly view the waveform templates for each unit I selected, 2) work faster with keyboard shortcuts, and 3) split units via amplitude / PC view, I would consider this a working replacement for Phy. What with copilot being so good these days, I'm tempted to try building some of these myself, but I'd like to know if they're in progress / have been deemed impossible due to technical quirks / etc.

Thank you!

magland commented 1 month ago

Hi @jonahpearl thanks for reaching out with the detailed feature requests! I had no plans of splitting clusters, but maybe you'll convince me, who knows.

I'm mostly unavailable for the rest of the week, but maybe we can have a zoom call next week. Feel free to reach out by email if you can find it. :)

alejoe91 commented 1 month ago

I'll copy my comment here:

@jonahpearl let me chime in here, since we have discussed this several times.

Sortingview works by pushing data to the cloud. To do so and make things efficient (and cheap), we have to minimize the amount of data used for visualization. As an example, the amplitude scatterplots use a decimated version of the amplitudes and of the spike times (e.g., 1 out of 10). In order to split, you need the full array of all the data you're plotting, because the "split" will need the indices of every spike that belongs to each split cluster. Another example is that you don't have a Waveform view, but just a templates view!

So I think it could be an option to add, but in that case we should expect a clear slowdown in performance and increase in storage costs.

In addition, we recently added a curation format that we will need to extend for splits. I think we should do this anyways because we will implement splitting in the SpikeInterface-GUI, which works directly off the sorting analyzer.
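
To make the decimation point above concrete, here is a minimal pure-Python sketch. It is not sortingview or SpikeInterface code; the function names (`decimate`, `split_by_threshold`), the synthetic amplitudes, and the threshold are all hypothetical and chosen only for illustration. It shows that a split defined on a 1-in-10 decimated view can only label the spikes that were uploaded, whereas a real split has to assign every spike index of the unit.

```python
import random


def decimate(values, factor=10):
    """Keep every `factor`-th value, along with its index in the full array."""
    kept_indices = list(range(0, len(values), factor))
    return kept_indices, [values[i] for i in kept_indices]


def split_by_threshold(indices, amplitudes, threshold):
    """Assign each spike index to one of two sub-clusters by amplitude."""
    low = [i for i, a in zip(indices, amplitudes) if a < threshold]
    high = [i for i, a in zip(indices, amplitudes) if a >= threshold]
    return low, high


# Hypothetical unit whose spikes come from two amplitude populations.
rng = random.Random(0)
n_spikes = 1000
amplitudes = [
    rng.gauss(-80, 5) if rng.random() < 0.5 else rng.gauss(-40, 5)
    for _ in range(n_spikes)
]

# What the viewer would receive: 1 out of 10 spikes.
dec_idx, dec_amps = decimate(amplitudes, factor=10)

# Split on the decimated view vs. on the full data.
low_dec, high_dec = split_by_threshold(dec_idx, dec_amps, threshold=-60)
low_full, high_full = split_by_threshold(
    list(range(n_spikes)), amplitudes, threshold=-60
)

n_labeled_dec = len(low_dec) + len(high_dec)
n_labeled_full = len(low_full) + len(high_full)
print(f"spikes labeled from decimated view: {n_labeled_dec} of {n_spikes}")
print(f"spikes labeled from full data:      {n_labeled_full} of {n_spikes}")
```

The decimated split is consistent with the full one, but it leaves most spikes unassigned, which is why the full arrays would have to be available wherever the split is actually applied.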

jonahpearl commented 1 month ago

Thanks AB! Those limitations make sense. I'm also sympathetic to not wanting to develop two different curation GUIs in parallel (ie SpikeInterface-GUI and sortingview), and to the fact that probably more people are doing things on a local desktop and will be fine with a local app — though I think there's nothing keeping local users from using a well-designed web-app.

With the caveat that I know very little about web development, those limitations don't seem fatal. I see that sortingview is powered by kachery / a "figure sharing" aesthetic, wherein you can immortalize views of your data to be shared with people, and that makes sense for uploading small, permanent datasets. For spike curation, though, why not either 1) run a local server more akin to Jupyter, 2) upload more data to the cloud, with the caveat that it might take a few minutes, and impose a deletion timer, e.g. 7 days, after which the user would need to re-upload the data, or 3) ask users to set up their own storage instead of using whatever the current default is. Given the bugs people are willing to wade through with Phy, I suspect they'd be willing to do some setup for a bug-free version :)

Re performance, maybe there could be an option for what fraction of the data to show, and the trade-off would be performance vs. completeness. Then you could, e.g. have an idea for a split, double-check it by looking at the full dataset (or at least more of it), do it, and then revert back.
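
As a rough illustration of that workflow, the sketch below previews a box selection (time range by amplitude range) on a fraction of the spikes and then re-applies the same box to the full data. Everything here is hypothetical: `subsample`, `box_select`, the `fraction` parameter, and the synthetic spike data are assumptions made for the example, not existing sortingview or SpikeInterface features.

```python
def subsample(n_spikes, fraction):
    """Return evenly spaced spike indices covering roughly `fraction` of the data."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    step = max(1, round(1 / fraction))
    return list(range(0, n_spikes, step))


def box_select(indices, times, amplitudes, t_range, a_range):
    """Return the spike indices that fall inside a time x amplitude box."""
    t0, t1 = t_range
    a0, a1 = a_range
    return [
        i for i in indices
        if t0 <= times[i] <= t1 and a0 <= amplitudes[i] <= a1
    ]


# Hypothetical unit: 1000 spikes, one every 0.1 s, with two amplitude levels.
n_spikes = 1000
times = [i * 0.1 for i in range(n_spikes)]
amplitudes = [-80.0 if (i % 7) < 3 else -40.0 for i in range(n_spikes)]

# The box the user would draw in the amplitude view.
t_range = (20.0, 60.0)
a_range = (-100.0, -60.0)

# Fast preview on 10% of the spikes, then the same box on all of them.
preview_idx = subsample(n_spikes, fraction=0.1)
preview_sel = box_select(preview_idx, times, amplitudes, t_range, a_range)
full_sel = box_select(range(n_spikes), times, amplitudes, t_range, a_range)

print(f"preview selects {len(preview_sel)} spikes, full data selects {len(full_sel)}")
```

Because the box is stored as ranges rather than as a list of selected points, the same selection can be drawn on a cheap preview and applied to the complete spike train only when the split is committed.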

magland / sortingview

Info / requests for phy-like features #233