nsteinme / steinmetz-et-al-2019

Code accompanying Steinmetz et al., 2019
GNU General Public License v3.0

Incorrect units in some metadata fields #8

Closed AngCamp closed 3 months ago

AngCamp commented 1 year ago

Some of the units in the dataset are reported incorrectly or need scaling. In particular, the waveformDurations are listed as being in seconds, but this clearly cannot be the case, since some waveform durations would then last 50 or 3000 seconds. Tatum_2017-12-06 has clusters with waveform durations in the thousands of seconds. In Muller_2017-10-29, by contrast, the waveform durations are more reasonable, but again it seems they have been stored in samples, and based on the LFP data here one sample should be 0.0004 s (1 s / 2500 Hz). One other issue is that the LFP needs rescaling, but this is not noted anywhere; the Allen Brain Atlas dataset, for instance, recommends a scaling factor of 0.195 and subtracting the median for each channel. This post on the community forum has useful information for using the raw LFP; I'm not sure whether its advice would apply to this dataset as well.

Overall, the metadata and information in this repo are amazing, but small bits of important information are missing, which makes reuse challenging unless one is already deeply familiar with Neuropixels data.

nsteinme commented 1 year ago

Thanks for these clarifications. The data were acquired with SpikeGLX, so the LFP files contain the raw bits. That means the scaling factor is 4.69 uV/bit, which corresponds to a gain setting of 250 (the default) during acquisition. https://github.com/cortex-lab/neuropixels/wiki/Gain_settings
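For readers who want to apply this, here is a minimal sketch of the conversion, assuming the raw LFP has already been loaded as an integer array of shape (n_samples, n_channels). The function name `lfp_to_microvolts` and the synthetic input are illustrative only. The per-channel median subtraction follows the suggestion in the original post, not something confirmed for this dataset, so it can be turned off.

```python
import numpy as np

# Scaling factor quoted in the reply above (gain setting of 250).
UV_PER_BIT = 4.69


def lfp_to_microvolts(raw, uv_per_bit=UV_PER_BIT, subtract_median=True):
    """Convert raw LFP bits to microvolts.

    raw: integer array of shape (n_samples, n_channels).
    subtract_median: optionally remove each channel's median offset
    (an assumption taken from the original post, not from the authors).
    """
    scaled = np.asarray(raw, dtype=np.float64) * uv_per_bit
    if subtract_median:
        scaled = scaled - np.median(scaled, axis=0, keepdims=True)
    return scaled


# Tiny synthetic example: 3 samples x 2 channels of made-up raw values.
raw = np.array([[10, -4], [12, -2], [14, 0]], dtype=np.int16)
lfp_uv = lfp_to_microvolts(raw)
print(lfp_uv)
```

Casting to float before scaling avoids integer overflow and truncation on the int16 values.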

I'm not immediately sure about the waveform durations, but you can always calculate these yourself from the waveforms. Note that LFP is sampled at 2500 Hz but AP-band data is sampled at 30 kHz.
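As a rough illustration of calculating a duration from a waveform, the sketch below uses a trough-to-peak definition: the time from the most negative sample to the largest sample that follows it. This definition is an assumption, since the thread does not say how waveformDurations was computed in the dataset, and the function name and synthetic waveform are hypothetical.

```python
import numpy as np

# AP-band sampling rate mentioned in the reply above.
AP_SAMPLE_RATE_HZ = 30_000.0


def trough_to_peak_duration(waveform, sample_rate_hz=AP_SAMPLE_RATE_HZ):
    """Return the trough-to-peak duration of a 1-D waveform, in seconds.

    Assumed definition: time between the global minimum and the
    maximum that occurs at or after it.
    """
    waveform = np.asarray(waveform, dtype=float)
    trough = int(np.argmin(waveform))
    peak = trough + int(np.argmax(waveform[trough:]))
    return (peak - trough) / sample_rate_hz


# Synthetic waveform: trough at sample 30, later peak at sample 45.
wf = np.zeros(82)
wf[30] = -1.0
wf[45] = 0.4
duration_s = trough_to_peak_duration(wf)
print(f"{duration_s * 1000:.3f} ms")
```

In practice one would pass the mean or template waveform of each cluster on its largest-amplitude channel.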

AngCamp commented 1 year ago

Thank you for the clarification.

AngCamp commented 1 year ago

I'm a bit unclear about the waveform templates as well, actually. When I looked at data from 'Tatum_2017-12-06', the range of the waveform durations was 23 to 3336 "units", even after filtering out waveform durations from low-quality clusters. By contrast, Muller_2017-01-09 and Cori_2016-12-17 have waveform durations that go from ~4 to 50 "units", with a median around 20. It makes me wonder whether in Tatum_2017-12-06 the halfwidths (waveform durations) were reported at the AP sampling frequency and in the others at the LFP sampling frequency. Is that possible? Or is Tatum_2017-12-06 just a bad recording?
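One way to sanity-check this question is to convert the reported "units" to milliseconds under each sampling-rate assumption. The sketch below only does that arithmetic with the numbers quoted in this thread. Treating the values as sample counts is itself an assumption, and the helper name `samples_to_ms` is illustrative.

```python
AP_RATE_HZ = 30_000.0   # AP-band sampling rate from the thread
LFP_RATE_HZ = 2_500.0   # LFP sampling rate from the thread


def samples_to_ms(n_samples, sample_rate_hz):
    """Convert a count of samples to milliseconds at the given rate."""
    return 1000.0 * n_samples / sample_rate_hz


# Values quoted in the comment above, interpreted as sample counts.
reported = {
    "Tatum_2017-12-06 min": 23,
    "Tatum_2017-12-06 max": 3336,
    "other sessions median (approx.)": 20,
    "other sessions max (approx.)": 50,
}

for label, n in reported.items():
    ap_ms = samples_to_ms(n, AP_RATE_HZ)
    lfp_ms = samples_to_ms(n, LFP_RATE_HZ)
    print(f"{label}: {n} samples = {ap_ms:.2f} ms at 30 kHz, {lfp_ms:.2f} ms at 2.5 kHz")

# The two sampling rates differ by this fixed factor.
rate_ratio = AP_RATE_HZ / LFP_RATE_HZ
print("AP/LFP rate ratio:", rate_ratio)
```

Comparing the spread between sessions with the fixed ratio between the two rates gives a quick indication of whether a sampling-rate mix-up alone could account for the difference.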