NeurodataWithoutBorders / matnwb

A Matlab interface for reading and writing NWB files
BSD 2-Clause "Simplified" License
49 stars 32 forks source link

matnwb does not support spaces in names #35

Open bendichter opened 6 years ago

bendichter commented 6 years ago

Since names are used as field names of objects, matnwb does not allow them to have spaces. pynwb does allow spaces. Maybe a compromise would be to automatically replace all spaces with underscores in matnwb.

lawrence-mbf commented 6 years ago

I believe spaces work with open Sets (basically, any Group structure that can have 0 or more items in it) as they are Map-backed. The only other place that could include spaces would be at the YAML level (core schema and extensions) which will definitely require some work.

Is there a bug where this doesn't work, or are you experiencing a use case that breaks this?

bendichter commented 6 years ago

Yes, the problem I was having with specifically with the names in an extension YAML, but I think the best solution would be to enforce no spaces on the pynwb side

lawrence-mbf commented 6 years ago

So I see two thing needing work to get extensions with spaces working:

  1. Add a check when generating files to replace spaces in property values. This should be fairly simple.
  2. Do a check before reading a compound data type, recreate the compound type using new member names, then reading it as normal.

This will "allow" spaces for class properties and makes compound data types readable if they have spaces.

I'll look into this.

ehennestad commented 2 weeks ago

@bendichter Is this still relevant?

If yes, can you clarify what is meant here:

Since names are used as field names of objects Does the schema allow for spaces in object property names?

Could you also point me to an example scenario which illustrates this problem?

bendichter commented 2 weeks ago

@ehennestad Yes, this issue is still relevant. Let me explain with a concrete example:

When creating a trials table in MatNWB, you use dot notation like this:

nwbfile.trials = DynamicTable(...)

This syntax pattern appears whenever you're working with Datasets or Containers in the schema that have set names. While this works fine for the core schema (which doesn't have this issue), we run into a potential problem when set names contain spaces.

The issue doesn't occur in the core schema, but it's not currently enforced as a rule, meaning it could appear in extensions. If an extension defines a Container or Dataset that has a sub-Container or Dataset with a space in its set name, we hit a compatibility problem.

There are two potential solutions we could consider:

  1. Enforce a rule that prohibits spaces in set names in the hdmf schema language
  2. Implement a workaround in MatNWB (for example, automatically replacing spaces with underscores)

While I can't recall the specific instance where we encountered this, it remains a potential issue that we should address.