Closed rly closed 1 year ago
Right now, the authors of the extension are listed as Ryan Ly, Oliver Ruebel, and Kay Robbins, and uses our institutional email addresses. I'm happy to change/add to the names, email addresses, order, etc. The extension uses the BSD-3 license. Happy to change that to another permissive license also.
HedNWBFile
- a subtype ofNWBFile
that adds an optional attribute that stores the HED version schema
@rly could you clarify why hed_schema_version
should sit globally at the NWBFile level, rather than being an attribute on the HedTags
column?
If there is indeed additional global metadata that we need to store for HED
(e.g., the version), then I would suggest extending LabMetaData
to add a group /general/hed
to the metada
We want only one HED version per data file.... I think it should be global to that file.
On Mon, Oct 23, 2023 at 12:30 PM Oliver Ruebel @.***> wrote:
- HedNWBFile - a subtype of NWBFile that adds an optional attribute that stores the HED version schema
@rly https://github.com/rly could you clarify why hed_schema_version should sit globally at the NWBFile level, rather than being an attribute on the HedTags column?
If there is indeed additional global metadata that we need to store for HED (e.g., the version), then I would suggest extending LabMetaData to add a group /general/hed to the metada
— Reply to this email directly, view it on GitHub https://github.com/hed-standard/ndx-hed/pull/3#issuecomment-1775680857, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJCJOR5O4Q4GPXXSNLRUOLYA2SUXAVCNFSM6AAAAAA6LKSM2OVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTONZVGY4DAOBVG4 . You are receiving this because your review was requested.Message ID: @.***>
Actually, we want only one version of HED for the entire dataset (Dandi set?)
On Mon, Oct 23, 2023 at 12:39 PM Kay Robbins @.***> wrote:
We want only one HED version per data file.... I think it should be global to that file.
On Mon, Oct 23, 2023 at 12:30 PM Oliver Ruebel @.***> wrote:
- HedNWBFile - a subtype of NWBFile that adds an optional attribute that stores the HED version schema
@rly https://github.com/rly could you clarify why hed_schema_version should sit globally at the NWBFile level, rather than being an attribute on the HedTags column?
If there is indeed additional global metadata that we need to store for HED (e.g., the version), then I would suggest extending LabMetaData to add a group /general/hed to the metada
— Reply to this email directly, view it on GitHub https://github.com/hed-standard/ndx-hed/pull/3#issuecomment-1775680857, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJCJOR5O4Q4GPXXSNLRUOLYA2SUXAVCNFSM6AAAAAA6LKSM2OVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTONZVGY4DAOBVG4 . You are receiving this because your review was requested.Message ID: @.***>
Actually, we want only one version of HED for the entire dataset (Dandi set?)
I don't think that is something we can enforce in the extension. We only have control of the schema of individual files but not of the collection.
Thanks for the feedback!
Yes, I am currently modeling the HED column as an array of freeform string tags. This could include strings like "(Intended-effect, Cue)"
but I think I understand now that "Intended-effect" and "Cue" are HED tags and this is a nested tag.
Would it be preferred to store a single string for a HED annotation that is a comma-separated list of HED tags? To be explicit, we would change the usage to:
nwbfile.add_trial_column("HED", "HED annotations for each trial", col_cls=HedAnnotations)
nwbfile.add_trial(start_time=0.0, stop_time=1.0, HED="animal_target, correct_response, (Intended-effect, Cue)")
# ...
nwbfile.trials["HED"][0] # returns "animal_target, correct_response, (Intended-effect, Cue)"
I will move the hed_schema_version
to a new group HedMetadata
with the name hed
that extends LabMetaData
so that it can be placed at /general/hed
. This group will have only one attribute, hed_schema_version
.
I will add Ian Callanan to the authors list.
It looks like you are modeling the HED column as an array of string tags. Usually the HED annotation is a single string. We could deal with an array by just joining, but the items will need to represent not just single tags to allow parentheses. Yes that is correct.
Can we enforce that if a HED column will be called "HED"?
Yes, you can set name: HED
in the schema to enforce a fixed name. However, if the name is fixed this means that you will only be able to have one HED annotation dataset in a particular table if the name is fixed. Alternatively, you could also set default_name: HED
, which would allow the user to change the name but by default the name would be HED
. I think in the case of HED it is probably Ok to limit to one of these columns per table, so fixing the name
may be Ok.
Even when the name is fixed, however, I would still recommend to check for the neurodata_type
(i.e., here HedAnnotation
) rather than for the name
of a dataset to identify the type (i.e., if a column stores HED tags). A user can add custom columns to tables and as such a user could name a column HED, even it is not a HedAnnotation
column with HED tags (i.e., we can enforce only that the column from this schema must be named HED
but we cannot enforce that other types of columns cannot be named HED
).
Even when the name is fixed, however, I would still recommend to check for the
neurodata_type
(i.e., hereHedAnnotation
) rather than for thename
of a dataset to identify the type (i.e., if a column stores HED tags). A user can add custom columns to tables and as such a user could name a column HED, even it is not aHedAnnotation
column with HED tags (i.e., we can enforce only that the column from this schema must be namedHED
but we cannot enforce that other types of columns cannot be namedHED
).
Yes good idea...
Resolves #1. This PR adds 2 new neurodata types:
HedTags
- a subtype ofVectorData
that can be used to represent a column of string HED tags in anyDynamicTable
objectHedNWBFile
- a subtype ofNWBFile
that adds an optional attribute that stores the HED version schema (string) at the file level.Example usage:
This does not yet use the
Events
tables in the ndx-events extension. We are in the process of updating those neurodata types. I'll add and demonstrate support for those types in a separate PR.I've commented out two integration tests that do not pass because, in writing them, I found a bug in one of the testing utilities of pynwb. I have a bugfix for that in pynwb here: https://github.com/NeurodataWithoutBorders/pynwb/pull/1782 . After that is merged and PyNWB 2.6.0 is released (hopefully this week), we can uncomment those tests.