catalystneuro / neuroconv

Create NWB files by converting and combining neural data in proprietary formats and adding essential metadata.
https://neuroconv.readthedocs.io
BSD 3-Clause "New" or "Revised" License
49 stars 22 forks source link

recasting unit properties as floats #960

Closed bendichter closed 1 week ago

bendichter commented 1 month ago

I am trying to add integer-valued properties to a sortingextractor and convert to NWB, but they are converted to float in the process. I'd really rather write them as integers. It looks like this line is responsible:

https://github.com/catalystneuro/neuroconv/blame/5986a72a2d3fa5a0cfc20b4ad5fa6d3954d096a9/src/neuroconv/tools/spikeinterface/spikeinterface.py#L1169-L1170

I don't see why we would want to automatically convert all integers to floats. @h-mayorquin it looks like you wrote these lines, do you remember why we did this? It's from 2 years ago so fine if not, but I would rather us be able to maintain integer data types where possible.

I am pretty sure I can find another way around this, so it's not blocking me atm, but I think it should be addressed as it can lead to sub-optimal NWB files

bendichter commented 1 month ago

related to #961

CodyCBakerPhD commented 1 month ago

It's to allow appending of several sorting outputs to the same (primary) Units table in the NWB file

If the first, or any other extractors do not share the same unit properties, you need to be able to fill the table with placeholders, and there are no default integers that universally make sense for that role across all cases, so they are upcast to floats to allow NaN to act as the indicator of missing values

bendichter commented 1 month ago

OK, I see how upcasting the float would be useful in that case, but is it common that we need to combine Units tables in this way, and wouldn't want to just create an entirely new table? When has this come up? If we do indeed need this feature, do we really want it to be the default behavior? It would be useful to at least have a flag that allows us to turn this off.

CodyCBakerPhD commented 1 month ago

There can only be one designated 'true' Units table in an NWB file at the canonical position (nwbfile.units); in order to have several, they would all be written to a processing module, which is a capability that we have exposed as a conversion option

This also used to be the norm largely because of NWBWidgets and the PSTH viewer, which only acted on one units+trials table (with no selection control) at a time, so it made things look nicer to have all content combined under the one view

Now that things have shifted to Neurosift, and since Neurosift has the multi-view capability as well as selection control, multiple tables might make more sense now

It would also be more in line with the spirit of design for ndx-extracellular-channels (which separates electrode tables instead of enforcing appending to a single global table, similar to the discussion here)

bendichter commented 1 month ago

OK, great. @h-mayorquin , how do you feel about this shift?

h-mayorquin commented 1 month ago

I think that having the option to not cast is a good one but I am unsure about the default behavior.

It seems that expandable by default is the right choice but on the other hand I am not sure how common the scenario of agregating sorting with different properties really is. Maybe one case where this is used is consensus algorithms but I am not sure if the combined sortings keep the properties.

In the face of uncertainty I suggest a slow approach where we make this option available but keep the current default. This enables users to choose and gives us time to collect more knowledge.

h-mayorquin commented 1 week ago

This is solved by #989