AllenInstitute / npc_sessions

Tools for accessing and packaging data from behavior and ephys sessions from the Mindscope Neuropixels team, in the cloud.

Delete raw assets for Same Probe Different Record Node #107

Open · arjunsridhar12345 opened this issue 2 months ago

arjunsridhar12345 commented 2 months ago

Sessions:

bjhardcastle commented 2 months ago

Here's a script to find sessions with duplicate probes, based on the current data on-site:

import pathlib
import npc_session

ROOT = pathlib.Path('//allen/programs/mindscope/workgroups/dynamicrouting/PilotEphys/Task 2 pilot')

for folder in ROOT.iterdir():
    if '_366122_' in folder.name:
        continue
    # if '702136_20240306' not in folder.name: # for testing - has duplicate probe F
    #     continue
    if not folder.is_dir():
        continue
    dat_files = list(folder.rglob('**/continuous/**/*Probe*-AP/continuous.dat'))
    if not dat_files:
        continue
    # assumption: 2 record nodes per session, with each probe's data saved under only one of them
    num_record_nodes = sum(1 for f in folder.rglob('Record Node *') if f.is_dir())
    if num_record_nodes % 2 != 0:
        print(f"Odd number of record nodes: {num_record_nodes} - {folder}")
        continue
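    # illustrative layout under the assumptions above (processor/stream names are examples and may differ on-disk):
    #   <session folder>/
    #     Record Node 101/experiment1/recording1/continuous/Neuropix-PXI-100.ProbeA-AP/continuous.dat
    #     Record Node 102/experiment1/recording1/continuous/Neuropix-PXI-100.ProbeD-AP/continuous.dat
    # a probe duplicated across record nodes would yield more continuous.dat files than expected below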
    # there may be multiple experiment folders per record node; with 2 record nodes, each experiment is counted twice relative to a single probe
    # likewise, there may be multiple recording folders per experiment, each counted twice relative to a single probe
    num_experiments = sum(1 for f in folder.rglob('experiment[0-9]*') if f.is_dir())
    num_recordings = sum(1 for f in folder.rglob('recording[0-9]*') if f.is_dir())
    probes = [npc_session.ProbeRecord(dat.as_posix()) for dat in dat_files]
    for p in set(probes):
        if probes.count(p) > (num_recordings / 2) * (num_experiments / 2):
            print(f"Duplicate {p.name} - {folder}")