rly / ndx-pose

NWB extension to store pose estimation data
BSD 3-Clause "New" or "Revised" License
13 stars 11 forks source link

naming convention for default names #31

Open oruebel opened 4 months ago

oruebel commented 4 months ago

Could you clarify the convention for assigning the default_name values in this extension. Some of the types seem to use camel-case, e.g., default_name: PoseTraining , while other types use all-lowercase , e.g., default_name: skeleton_instance.

rly commented 4 months ago

The NWB Best Practices recommend that a data type be named the same as the name of the data type, e.g., PoseTraining objects should be named "PoseTraining". However, when a data type is contained within another type in a 1:1 relationship, e.g., a TrainingFrame contains a single SkeletonInstances, in the APIs and schema, NWB uses the convention that child objects are named with all lower case, e.g., skeleton_instances. There is therefore a conflict between conventions.

An analogous case is the Units table. The default name is Units. The field in NWBFile is named units. Therefore, the Units table within an NWBFile must be named units. I also explained this in https://github.com/NeurodataWithoutBorders/pynwb/issues/1882

This conflict between conventions causes inconsistencies and confusion, especially when users create extensions, and is a pain to deal with. I don't have a good solution except to use lower case when an object is intended to be a named field in another data type and camel case otherwise. It's not ideal.

Some other options:

  1. Change the best practice to recommend the use of all lower case naming for all types, but this change may confuse people temporarily
  2. Change the convention for the name of a data type when it is contained within another data type to be camel case naming. This will create inconsistencies
  3. Do not use a fixed name for a data type when it is contained within another data type, but then programmatic access to the child object is harder - one needs to look at all groups/datasets and evaluate the data type and ignore the name
oruebel commented 4 months ago

Thanks for the clarification.

  1. Change the best practice to recommend the use of all lower case naming for all types, but this change may confuse people temporarily

I think we usually use lowercase names for all fixed names in NWB (e.g., processing etc.) so this option seems consistent with naming in the core NWB schema.

bendichter commented 4 months ago

Another aspect of this is the name assigned in the default arguments of some PyNWB classes, e.g. the default name for the Position class is "Position"

bendichter commented 4 months ago

snake_case reflects the Python convention of having classes be UpperCase and instances be snake_case, however I think that convention is pretty Python-specific, and this differentiation is less important here, where there isn't such a danger of confusing class objects with instances of the class, so it makes more sense to just stick with the same name. It does create a conflict of convention when the class is a field of another class.

oruebel commented 4 months ago

Makes sense. In the Python analogy, every object in a file is an instance, the equivalence of the class is the specification of the neurodata type in the schema. In that sense, using snake case for names seems most consistent.

pauladkisson commented 3 months ago

So it sounds like if we change the best practice to snake_case naming for all objects (option 1) we would avoid the conflict between stand-alone objects and child objects AND follow python naming convention (making it more intuitive for python users), at the cost of temporary confusion for people already accustomed to CamelCase.

Seems like a reasonable option.