Open weiglszonja opened 6 months ago
Why should we keep the order?
@h-mayorquin Because it doesn't look as good as it could in Neurosift
In Szonja's example of : ['electrode_type', 'acx_x', 'chamber_type', 'chamber_z', 'channel_ids', 'acx_z', 'chamber_x', 'acx_y', 'chamber_y', 'offset_to_uV', 'gain_to_uV']
, wouldn't it look much better if the acx_
were kept together?
(BTW @weiglszonja I'd expand the acx
to something more readable - I'm assuming this is acceleration
?)
If that's the case maybe an easier solution is to have a predetermined order for the special ones (like location) and then write the rest of them with natsort.
I Just thinking that bounding ourselves to followg spikeinterface order will bring us problems. Plus, it is even more problematic when you consider more than one recording object.
@h-mayorquin See discussion on related PR for proposed approach: https://github.com/catalystneuro/neuroconv/pull/793
(BTW @weiglszonja I'd expand the
acx
to something more readable - I'm assuming this isacceleration
?)
I'm following the lab's notation here, acx_
are the coordinates relative to the crossing of anterior commissure (I added a description so it's clear when looking at the table).
Plus, it is even more problematic when you consider more than one recording object.
Yes, I'm not sure how to solve that one in https://github.com/catalystneuro/neuroconv/pull/793 yet...
Yes, I'm not sure how to solve that one in https://github.com/catalystneuro/neuroconv/pull/793 yet...
In principle the approach outlined in https://github.com/catalystneuro/neuroconv/pull/793#issuecomment-2025843301 should handle it because the metadata columns are already global across interfaces
My first intuition is that the complexity is not worth the effort. For producing better looking output I think a hard-coded rule for some properties and then sorting the rest should ease most of the ugliness.
I am strongly against:
df[an_order_of_the_columns]
and df[another_order_of_the_columns]
gives users of the electrode table all the flexibility of presentation they need.Did you read the PR conversation?
Using the already existing metadata["Ecephys"]["Electrodes"]
list isn't adding any complexity - all those columns will already be written - and it's already a List[dict]
so why not just rely on that order as a loose mechanism for granting priority when constructing the table? It's also already global. It also abstracts from the concept of what was automatically added vs. added in a custom fashion
This is also a feature that could play a role in the GUIDE, which allows even easier customization of electrode table content than we've ever had
Yes, I did, I had this proposal in mind:
Using the already existing metadata["Ecephys"]["Electrodes"] list isn't adding any complexity - all those columns will already be written - and it's already a List[dict] so why not just rely on that order as a loose mechanism for granting priority when constructing the table? It's also already global. It also abstracts from the concept of what was automatically added vs. added in a custom fashion
The complexity added would be the modifcation to make the rest of the function to actually work with that. Then documenting and testing the different scenarios (multiple recordings, recordings that match only in some channels, etc) and the error handling. You already mentioned that all the properties should be there.
Probably our difference here is that I Just don't see much value on ordering columns. The example you gave seems like a minor aesthethical thing that can be fixed with two lines (one for defining the order of the special properties) and one for sorting the rest. What would further customization be used for?
natsort
is not magical, and does not grant latent precedence to items that might be considered 'more important' or more 'semantically related'
This can be documented with a single line statement: "The order of entries in the ["Ecephys"]["Electrodes"]
list provides a precedence for column order in the final electrodes table."
More tests is a good thing, might even catch other issues with tangentially related things
Which is why this feature wasn't in the code base before now, but is requested now, so we're adding it now
natsort is not magical, and does not grant latent precedence to items that might be considered 'more important' or more 'semantically related'
I think it would be more fruitful if we had some concrete use cases on mind. The example that you mentioned above would be work with natsort.
I am coming from the prior that determining the order of columns is something that does not add too much. But if the code, tests and documentations are that simple I guess the implementation PR should showcase it.
How would you grant configurability to other tables? Units for example?
Just to be clear, is @weiglszonja use case the same as yours? To make the electrode_table look better?
I noticed the order of properties that are being added to the
Electrodes
table is not in the same order as in the recording_extractor._properties, for instance:The order of properties to add to the Electrodes table is changed at this line: https://github.com/catalystneuro/neuroconv/blob/fa6edd532f6bb6ead42ae40aa9029c7c78eead5f/src/neuroconv/tools/spikeinterface/spikeinterface.py#L322 which does not preserve the order of the elements:
I think we should preserve the order of properties to be the same the
recording_extractor._properties
.Opened https://github.com/catalystneuro/neuroconv/pull/793 as draft to discuss