NeurodataWithoutBorders / pynwb

A Python API for working with Neurodata stored in the NWB Format
https://pynwb.readthedocs.io
Other
174 stars 85 forks source link

[Bug]: `epoch_tags` not written to disk #1901

Closed bjhardcastle closed 1 week ago

bjhardcastle commented 2 months ago

What happened?

epoch_tags can be assigned to an object, but aren't written to disk correctly.

Steps to Reproduce

import pynwb
import datetime

nwb = pynwb.NWBFile(
    session_id='test',
    session_description='test',
    identifier='12345',
    session_start_time=(
        datetime.datetime.now()
    ),
    epoch_tags={'tag1', 'tag2'},
)
print(f'memory: {nwb.epoch_tags = }')

path = 'test1.nwb'

with pynwb.NWBHDF5IO(path, 'w') as io:
    io.write(
        nwb, link_data=False
    )

with pynwb.NWBHDF5IO(path, 'r') as io:
    print(f'disk: {io.read().epoch_tags}')
memory: nwb.epoch_tags = {'tag1', 'tag2'}
disk: set()

Traceback

No response

Operating System

Windows

Python Executable

Python

Python Version

3.11

Package Versions

environment_for_issue.txt

Code of Conduct

stephprince commented 2 months ago

Thanks for reporting this bug! I was able to replicate that epoch_tags is not getting written to disk. It looks like this might be because the epoch_tags nwbfile attribute is not specified in the schema. We will take a look at this.

As a potential workaround, if you want to access tags that were added to the epochs table, you should be able to access them like so:

import pynwb
import datetime
from pynwb import NWBHDF5IO

nwb = pynwb.NWBFile(
    session_id='test',
    session_description='test',
    identifier='12345',
    session_start_time=(
        datetime.datetime.now()
    ),
)
nwb.add_epoch(start_time=1.0, stop_time=10.0, tags=["tag1", "tag2"])
nwb.add_epoch(start_time=10.0, stop_time=20.0, tags=["tag1", "tag3"])
print(f'memory: {nwb.epoch_tags}')

path = 'test1.nwb'
with pynwb.NWBHDF5IO(path, 'w') as io:
    io.write(nwb, link_data=False)

with pynwb.NWBHDF5IO(path, 'r') as io:
    nwb_read = io.read()
    epoch_tags = set(tag for epoch in nwb_read.epochs[:, "tags"] for tag in epoch)
    print(f'disk: {epoch_tags}')

Which returns:

memory: {'tag1', 'tag3', 'tag2'}
disk: {'tag1', 'tag3', 'tag2'}

Would something like that work for your purposes?

bjhardcastle commented 2 months ago

Thanks, yes that's how I'd create my sequence of tags for adding to the NWBFile instance anyway. I mainly wanted them to appear correctly in the html repr displayed in a notebook.

If they're not in the schema already, how about just removing the ability to assign them altogether? It does seem redundant...

NWBFile.epoch_tags could just be a property getter that returns the tags on-demand from the epochs table. Then they're always in sync and duplicate information won't be stored to disk.