hed-standard / ndx-hed

HED extensions for Neurodata Without Borders (nwb)
BSD 3-Clause "New" or "Revised" License
0 stars 2 forks source link

Did some renaming of classes #4

Closed VisLab closed 11 months ago

VisLab commented 1 year ago

@rly

  1. I changed HedMetaData to HedVersion

  2. The HedVersion type could be a list of schema versions if multiple schema libraries are used. How do you change the specification to a list of any number of strings? If this is not possible, we could specify the list as a string and parse the list after.

  3. We would like to specify that the column name is HED. I was not clear whether the name field was sufficient or whether it needed the dim attribute? The test is failing so it needs to be changed.

Also:

In the template: create_extension_spec.py, it refers to the link: https://pynwb.readthedocs.io/en/latest/extensions.html#extending-nwb

which is broken.

VisLab commented 1 year ago

@rly this is further along but I have many questions --- which are marked to TODO in the code. In particular,

Thx!

rly commented 12 months ago

@rly

  1. I changed HedMetaData to HedVersion

👍

  1. The HedVersion type could be a list of schema versions if multiple schema libraries are used. How do you change the specification to a list of any number of strings? If this is not possible, we could specify the list as a string and parse the list after.

In new line 65 of create_extension_spec.py, in the definition of the AttributeSpec with name: hed_schema_version, add

shape=(None, ),
dims=("n_hed_schema_versions",)
  1. We would like to specify that the column name is HED. I was not clear whether the name field was sufficient or whether it needed the dim attribute? The test is failing so it needs to be changed.

dims is not required.

Also:

In the template: create_extension_spec.py, it refers to the link: pynwb.readthedocs.io/en/latest/extensions.html#extending-nwb

which is broken.

Thanks. I am updating the template here https://github.com/nwb-extensions/ndx-template/pull/78

@rly this is further along but I have many questions --- which are marked to TODO in the code. In particular,

  • How does the HedAnnotations class get at the HedVersion in the nwbfile without passing it or must it be passed in the constructor?

You can call .parent on the HedAnnotations object, and .parent on that object, and so on, until you get to the NWBFile object. From there, you can call .get_lab_meta_data("HedVersion"). We can add a convenience function within the HedAnnotations class to make that easy.

  • How can we assure that adding a trial column specifies that it is of HedAnnotations class -- I couldn't get the col_cls argument to work.

I'll look into that.

  • The HED version should probably be allowed to be a list of strings since usually more than one vocabulary is used. I don't think this is specified quite correctly in the class definition.

See above.

  • The HedVersion class should create a hedtools.HedSchema object in the constructor and make sure that it is valid. When is the HedVersion created?

I do not understand. I see that you added a custom HedVersion class. We can create a hedtools.HedSchema in the constructor there.

  • When should the HED strings in the HedAnnotations class be validated?

I think we should override the add_row method in the HedAnnotations class to validate HED strings when they are added to the column. This method is called when users add new rows to the parent table.

Thx!

rly commented 12 months ago

How can we assure that adding a trial column specifies that it is of HedAnnotations class -- I couldn't get the col_cls argument to work.

This is a current limitation of PyNWB/HDMF. Interestingly, we have not yet had a situation where a table column has a fixed name. DynamicTable.add_column requires a name argument. add_column then passes its arguments to HedAnnotations.__init__ which does not allow a name argument because it is fixed and should not be an argument.

I created a ticket on HDMF to fix this https://github.com/hdmf-dev/hdmf/issues/1002

VisLab commented 12 months ago

@rly I worked on this some more --- filling out the HedVersion and HedAnnotations class. However, the test to create a HedVersion class fails. I have attached an example of the error log. I'm finding the syntax a little obscure and am going to need some assistance. Thanks! sample_error_log.txt

oruebel commented 12 months ago

However, the test to create a HedVersion class fails.

Looking at the error, I think the issue is that HedVersion expects an array of stings and the test provides a single string object. I.e., I think if you replace HedVersion("8.2.0") with `HedVersion(["8.2.0", ]) should fix the problem.

rly commented 12 months ago

Yes, I believe that is the issue. To be more clear, this line:

@docval({'name': 'hed_version', 'type': (str, list), 'doc': 'HED strings of type str', 'shape': (None,)})

on HedVersion.__init__ checks the shape of the hed_version input argument to ensure it is 1-dimensional. If you pass a string in, the shape checker raises an error. I agree, that error is a bit cryptic and could be improved. So if you want HedVersion.__init__ to accept either a string or a list, then you cannot use 'shape': (None,) in the docval call. You can add code to validate the shape in the __init__ method itself though.

VisLab commented 12 months ago

I made the changes, but I'm getting the same errors about the container name. Ideas? sample_error_log.txt

rly commented 12 months ago

The main error now is LabMetaData.__init__: Expected at most 1 arguments ['name'], got 2: 0 positional and 2 keyword ['description', 'name']

This is because on line 98, kwargs is set to a dictionary with keys "name" and "description", and on the next line, LabMetaData.__init__(**kwargs) is called but LabMetaData.__init__ allows one argument, name.

To add that description to the HedVersion object, I would add a description attribute to the HedVersion spec. You can then set it to that fixed value in the spec, or allow it to be user-specified in __init__. I'm not sure a description is needed though.

VisLab commented 12 months ago

@rly @oruebel Thanks for your help --- I've made some progress -- but am going to need a lot more help. I am considering going ahead and merging so that you can see what the test errors. Here are some questions:

  1. Is it true that this builder infrastructure is new and only used for extensions (I'm looking for more examples)?
  2. Is there a difference between __fields__ and __nwbfields__ ? The events extension uses both.
  3. The basic test of HedVersion now works -- I made it just take an array of versions rather than a string, but the simple round-trip io fails --- I think because I haven't specified the types right so it doesn't know how to serialize (log1.txt).
  4. On the HedAnnotations which extends VectorData, there appears to be a data field in VectorData. Should HedAnnotations be using that rather than keeping a separate hed_strings field? (log2.txt)

log1.txt log2.txt

VisLab commented 11 months ago

@rly I rewrote the classes and now the tests run. There are numerous issues to be addressed. Part of the problem I think is my inability to force a particular name and class of a column. I have bypassed this issue temporarily by not forcing the issue.

None of the io tests work -- I think because I don't know how to write serialization code for the classes. I have looked through the code base, but I am going to need some help on this. Thx.

VisLab commented 11 months ago

I'm going to merge this even though it isn't nearly done so we can more easily discuss.