AllenInstitute / AllenSDK

code for reading and processing Allen Institute for Brain Science data
https://allensdk.readthedocs.io/en/latest/

3 VBN sessions with bad channel mask values #2437

Closed corbennett closed 2 years ago

corbennett commented 2 years ago

Describe the bug

The following sessions appear to have complete sorting data in LIMS (judging by the metrics.csv file), but the NWB files only show units in cortex (the top third of the probe). Looking at the associated probe JSON files, the problem appears to be the channel mask, which is invalidating the bottom of the probe. It would be good to confirm that these JSON files reflect what's in the database, though, since Wayne updated many of these values before he left.

1044016459: probes A, B, C, D, E, F
1044385384: probes B, C
1086410738: probe E

To Reproduce

Note that below I'm using the vbn_2022_dev branch of the SDK to open the NWB file for session 1086410738, which has missing channels for probe E.

from pynwb import NWBHDF5IO
from allensdk.brain_observatory.ecephys.behavior_ecephys_session import (
    BehaviorEcephysSession)
import numpy as np
import pandas as pd

nwb_path = r'\\allen\programs\mindscope\workgroups\np-behavior\vbn_data_release\nwbs_220429\ecephys_session_1086410738.nwb'
with NWBHDF5IO(nwb_path, 'r', load_namespaces=True) as nwb_io:
    session = BehaviorEcephysSession.from_nwb(nwbfile=nwb_io.read())

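# Channel and probe tables for the session
channels = session.get_channels()
probes = session.probes

# Print each probe's description and the number of channel rows assigned to it
for probe in probes.index.values:
    print(probes.loc[probe].description, len(channels.loc[channels['probe_id']==probe]))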

which outputs:

probeA 384
probeB 384
probeC 384
probeD 384
probeE 115
probeF 384

Expected behavior

I expect all probes to have 384 channels.

danielsf commented 2 years ago

@corbennett

session.get_channels takes a kwarg filter_by_validity that defaults to True, in which case, only channels marked as valid_data are returned. If you call get_channels(filter_by_validity=False), then you see 384 channels for every probe in session 1086410738.

We can change the default behavior, if you would like.

Or is it alarming to you that so many channels are marked as valid_data=False?

Maybe I am just telling you what you already know? Is your question really "why do so many channels have valid_data=False?"
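To make the effect of that default concrete, here is a minimal pandas sketch of validity filtering on a toy table. It does not call the AllenSDK: the `get_channels` function below is a hypothetical stand-in, the column names are assumptions, and the table only mirrors the counts reported for probeE (384 channels, 115 flagged valid).

```python
import pandas as pd

# Toy channels table mirroring the probeE counts for session 1086410738.
# Which channels are flagged valid is arbitrary here; only the counts matter.
channels = pd.DataFrame({
    "probe_name": ["probeE"] * 384,
    "local_index": range(384),
    "valid_data": [i < 115 for i in range(384)],
})


def get_channels(table: pd.DataFrame, filter_by_validity: bool = True) -> pd.DataFrame:
    """Hypothetical stand-in for the filtering behavior described above."""
    if filter_by_validity:
        # Keep only rows whose valid_data flag is True
        return table[table["valid_data"]]
    return table


print(len(get_channels(channels)))                            # 115
print(len(get_channels(channels, filter_by_validity=False)))  # 384
```

With the filter on, the channel count per probe is really a count of valid channels, which is why probeE showed 115 instead of 384.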

danielsf commented 2 years ago

Querying LIMS with queries like

    SELECT
      ecephys_probes.id as probe_id,
      ecephys_probes.name as probe_name,
      ecephys_channels.id as channel_id,
      ecephys_channels.valid_data as channel_validity
    FROM
      ecephys_channels
    JOIN
      ecephys_probes ON
          ecephys_channels.ecephys_probe_id=ecephys_probes.id
    JOIN
      ecephys_sessions ON
          ecephys_probes.ecephys_session_id=ecephys_sessions.id
    WHERE
      ecephys_sessions.id=1086410738

and then counting the total number of channels and the number of channels with valid_data=True for each probe, I find

session 1086410738
probeA: 384 total; 384 valid
probeB: 384 total; 384 valid
probeC: 384 total; 384 valid
probeD: 384 total; 384 valid
probeE: 384 total; 115 valid
probeF: 384 total; 384 valid
session 1044016459
probeA: 384 total; 78 valid
probeB: 384 total; 78 valid
probeC: 384 total; 78 valid
probeD: 384 total; 77 valid
probeE: 384 total; 78 valid
probeF: 384 total; 78 valid
session 1044385384
probeA: 384 total; 78 valid
probeB: 384 total; 78 valid
probeC: 384 total; 78 valid
probeD: 384 total; 373 valid
probeE: 384 total; 356 valid
probeF: 384 total; 373 valid

So: it looks like, however those probe JSON files got set, they do reflect what is in the database.
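The counting step could be sketched roughly as follows, assuming the query results have been loaded into a pandas DataFrame whose columns match the aliases in the SELECT. The rows here are toy data that only reproduce the counts for two probes of session 1086410738; how the results were actually tallied is not stated in this thread.

```python
import pandas as pd

# Toy stand-in for the rows returned by the LIMS query, using the
# probe_name and channel_validity aliases from the SELECT.
rows = pd.DataFrame({
    "probe_name": ["probeD"] * 384 + ["probeE"] * 384,
    "channel_validity": [True] * 384 + [True] * 115 + [False] * 269,
})

grouped = rows.groupby("probe_name")["channel_validity"]
summary = pd.DataFrame({
    "total": grouped.size(),              # all channels on the probe
    "valid": grouped.sum().astype(int),   # channels with valid_data=True
})

for probe_name, row in summary.iterrows():
    print(f"{probe_name}: {row['total']} total; {row['valid']} valid")
```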

corbennett commented 2 years ago

@danielsf Yep, it looks like the original probe JSON files were uploaded with incorrect masks. For these sessions, we can just set the valid data flag to true for all channels EXCEPT index 191 (the 192nd channel).
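As a small illustration of the intended mask, assuming 384 channels per probe and a zero-based local index (so index 191 is the 192nd channel), the result would look like this. The variable names are made up for the sketch.

```python
import numpy as np

N_CHANNELS = 384        # channels per probe, per the counts in this thread
BAD_LOCAL_INDEX = 191   # zero-based, i.e. the 192nd channel

# Start with every channel valid, then invalidate the single bad channel
valid_data = np.ones(N_CHANNELS, dtype=bool)
valid_data[BAD_LOCAL_INDEX] = False

print(int(valid_data.sum()))  # 383
```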

danielsf commented 2 years ago

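Zeb has run the SQL update script I gave him. Re-running the check for session 1044385384 with the query below now gives 383 valid channels on every probe: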

    SELECT
      ecephys_probes.id as probe_id,
      ecephys_probes.name as probe_name,
      ecephys_channels.id as channel_id,
      ecephys_channels.valid_data as channel_validity,
      ecephys_channels.local_index as local_index
    FROM
      ecephys_channels
    JOIN
      ecephys_probes ON
          ecephys_channels.ecephys_probe_id=ecephys_probes.id
    JOIN
      ecephys_sessions ON
          ecephys_probes.ecephys_session_id=ecephys_sessions.id
    WHERE
      ecephys_sessions.id=1044385384

session 1044385384
probeA: 384 total; 383 valid
probeB: 384 total; 383 valid
probeC: 384 total; 383 valid
probeD: 384 total; 383 valid
probeE: 384 total; 383 valid
probeF: 384 total; 383 valid

So: this ticket should be ready to close.

danielsf commented 2 years ago

The data fix that was run is

UPDATE ecephys_channels
SET valid_data=True
WHERE id IN
(
  SELECT ecephys_channels.id
  FROM ecephys_channels
  JOIN ecephys_probes ON
    ecephys_channels.ecephys_probe_id=ecephys_probes.id
  JOIN ecephys_sessions ON
    ecephys_probes.ecephys_session_id=ecephys_sessions.id
  WHERE
    ecephys_sessions.id IN (1043752325, 1044016459, 1044385384, 1044389060, 1044594870, 1044597824, 1046166369, 1046581736, 1047969464, 1047977240, 1048189115, 1048196054, 1049273528, 1049514117, 1051155866, 1052331749, 1052342277, 1052530003, 1052533639, 1053709239, 1053718935, 1053925378, 1053941483, 1055221968, 1055240613, 1055403683, 1055415082, 1056495334, 1056720277, 1059678195, 1059908979, 1061238668, 1061463555, 1062755416, 1062755779, 1063010385, 1063010496, 1064400234, 1064415305, 1064639378, 1064644573, 1065437523, 1065449881, 1065905010, 1065908084, 1067588044, 1067781390, 1067790400, 1069192277, 1069193611, 1069458330, 1069461581, 1070961372, 1071300149, 1072341440, 1072345110, 1072567062, 1072572100, 1076265417, 1076487758, 1077711823, 1077712208, 1077897245, 1079018622, 1079018673, 1079275221, 1079278078, 1081079981, 1081090969, 1081429294, 1081431006, 1084427055, 1084428217, 1084939136, 1086198651, 1086200042, 1086410738, 1086433081, 1087720624, 1087723305, 1087992708, 1087993643, 1089296550, 1090800639, 1090803859, 1091039376, 1091039902, 1092283837, 1092466205, 1093638203, 1093642839, 1093864136, 1093867806, 1095138995, 1095340643, 1096620314, 1096935816, 1098119201, 1098350754, 1099596266, 1099598937, 1099869737, 1099872628, 1101263832, 1101268690, 1101473342, 1104052767, 1104058216, 1104289498, 1104297538, 1105543760, 1105798776, 1106985031, 1107172157, 1108334384, 1108335514, 1108528422, 1108531612, 1109680280, 1109889304, 1111013640, 1111216934, 1112302803, 1112515874, 1113751921, 1113957627, 1115077618, 1115086689, 1115356973, 1115368723, 1116941914, 1117148442, 1118324999, 1118327332, 1118508667, 1118512505, 1119946360, 1120251466, 1121406444, 1121607504, 1122903357, 1123100019, 1124285719, 1124507277, 1125713722, 1125937457, 1128520325, 1128719842, 1130113579, 1130349290, 1139846596, 1140102579, 1152632711, 1152811536)
);

UPDATE ecephys_channels
SET valid_data=False
WHERE id IN
(
  SELECT ecephys_channels.id
  FROM ecephys_channels
  JOIN ecephys_probes ON
    ecephys_channels.ecephys_probe_id=ecephys_probes.id
  JOIN ecephys_sessions ON
    ecephys_probes.ecephys_session_id=ecephys_sessions.id
  WHERE
    ecephys_sessions.id IN (1043752325, 1044016459, 1044385384, 1044389060, 1044594870, 1044597824, 1046166369, 1046581736, 1047969464, 1047977240, 1048189115, 1048196054, 1049273528, 1049514117, 1051155866, 1052331749, 1052342277, 1052530003, 1052533639, 1053709239, 1053718935, 1053925378, 1053941483, 1055221968, 1055240613, 1055403683, 1055415082, 1056495334, 1056720277, 1059678195, 1059908979, 1061238668, 1061463555, 1062755416, 1062755779, 1063010385, 1063010496, 1064400234, 1064415305, 1064639378, 1064644573, 1065437523, 1065449881, 1065905010, 1065908084, 1067588044, 1067781390, 1067790400, 1069192277, 1069193611, 1069458330, 1069461581, 1070961372, 1071300149, 1072341440, 1072345110, 1072567062, 1072572100, 1076265417, 1076487758, 1077711823, 1077712208, 1077897245, 1079018622, 1079018673, 1079275221, 1079278078, 1081079981, 1081090969, 1081429294, 1081431006, 1084427055, 1084428217, 1084939136, 1086198651, 1086200042, 1086410738, 1086433081, 1087720624, 1087723305, 1087992708, 1087993643, 1089296550, 1090800639, 1090803859, 1091039376, 1091039902, 1092283837, 1092466205, 1093638203, 1093642839, 1093864136, 1093867806, 1095138995, 1095340643, 1096620314, 1096935816, 1098119201, 1098350754, 1099596266, 1099598937, 1099869737, 1099872628, 1101263832, 1101268690, 1101473342, 1104052767, 1104058216, 1104289498, 1104297538, 1105543760, 1105798776, 1106985031, 1107172157, 1108334384, 1108335514, 1108528422, 1108531612, 1109680280, 1109889304, 1111013640, 1111216934, 1112302803, 1112515874, 1113751921, 1113957627, 1115077618, 1115086689, 1115356973, 1115368723, 1116941914, 1117148442, 1118324999, 1118327332, 1118508667, 1118512505, 1119946360, 1120251466, 1121406444, 1121607504, 1122903357, 1123100019, 1124285719, 1124507277, 1125713722, 1125937457, 1128520325, 1128719842, 1130113579, 1130349290, 1139846596, 1140102579, 1152632711, 1152811536)
  AND
    ecephys_channels.local_index=191
);
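For anyone who wants to try the two-step pattern without touching LIMS, here is a rough sketch against an in-memory SQLite database. Everything about the setup is an assumption: the toy schema only includes the columns used in the statements above, a single probe of a single session is populated, SQLite stands in for the real database, and 1/0 is used instead of True/False.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ecephys_sessions (id INTEGER PRIMARY KEY);
    CREATE TABLE ecephys_probes (
        id INTEGER PRIMARY KEY, name TEXT, ecephys_session_id INTEGER);
    CREATE TABLE ecephys_channels (
        id INTEGER PRIMARY KEY, ecephys_probe_id INTEGER,
        local_index INTEGER, valid_data INTEGER);
""")

SESSION_ID = 1086410738
conn.execute("INSERT INTO ecephys_sessions (id) VALUES (?)", (SESSION_ID,))
conn.execute(
    "INSERT INTO ecephys_probes (id, name, ecephys_session_id) "
    "VALUES (1, 'probeE', ?)",
    (SESSION_ID,),
)
# Toy starting state: 384 channels, only 115 flagged valid (choice is arbitrary)
conn.executemany(
    "INSERT INTO ecephys_channels (ecephys_probe_id, local_index, valid_data) "
    "VALUES (1, ?, ?)",
    [(i, 1 if i < 115 else 0) for i in range(384)],
)

# Subquery selecting every channel that belongs to the target session
SESSION_CHANNELS = """
    SELECT ecephys_channels.id
    FROM ecephys_channels
    JOIN ecephys_probes ON
      ecephys_channels.ecephys_probe_id = ecephys_probes.id
    JOIN ecephys_sessions ON
      ecephys_probes.ecephys_session_id = ecephys_sessions.id
    WHERE ecephys_sessions.id IN (?)
"""

with conn:  # commit both updates together
    # Step 1: mark every channel in the session as valid
    conn.execute(
        f"UPDATE ecephys_channels SET valid_data = 1 "
        f"WHERE id IN ({SESSION_CHANNELS})",
        (SESSION_ID,),
    )
    # Step 2: re-invalidate the channel at local_index 191
    conn.execute(
        f"UPDATE ecephys_channels SET valid_data = 0 "
        f"WHERE id IN ({SESSION_CHANNELS} AND ecephys_channels.local_index = 191)",
        (SESSION_ID,),
    )

total, valid = conn.execute(
    "SELECT COUNT(*), SUM(valid_data) FROM ecephys_channels"
).fetchone()
print(f"probeE: {total} total; {valid} valid")  # probeE: 384 total; 383 valid
```

The order matters: running the second statement first would have its effect undone by the blanket update to valid.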