hdmf-dev / hdmf

The Hierarchical Data Modeling Framework
http://hdmf.readthedocs.io

Support multiple DynamicTableRegion when converting to a hierarchical dataframe #649

Open oruebel opened 3 years ago

oruebel commented 3 years ago

Problem: The function hdmf.common.hierarchicaltable.to_hierarchical_dataframe currently resolves only one DynamicTableRegion column per DynamicTable in the table hierarchy. That is, the function follows and resolves the first DynamicTableRegion it finds in a table; if a table contains additional DynamicTableRegion columns, those are left as nested pandas.DataFrame objects rather than being resolved.

Possible Solutions:

  1. hdmf.common.hierarchicaltable.to_hierarchical_dataframe should support resolution of multiple DynamicTableRegion columns for each given table.
  2. Add a new function that works on pandas.DataFrame objects instead of DynamicTable and allows resolution of an arbitrary number of user-defined columns. It should also offer an option to automatically find the columns that need resolution, so that all of them can be resolved at once.
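
A rough sketch of what option 2 could look like, using only pandas. It assumes that unresolved DynamicTableRegion columns show up as cells holding nested pandas.DataFrame objects, as described in the problem statement. The names `find_nested_columns` and `resolve_nested_columns` are hypothetical and not part of hdmf, and the cross-join and flat, prefixed column names are just one possible choice (to_hierarchical_dataframe itself may lay out its result differently).

```python
import pandas as pd


def find_nested_columns(df):
    """Hypothetical helper: names of columns whose cells all hold DataFrames."""
    return [
        col for col in df.columns
        if len(df) > 0 and all(isinstance(v, pd.DataFrame) for v in df[col])
    ]


def resolve_nested_columns(df, columns=None):
    """Hypothetical helper: flatten the given nested-DataFrame columns.

    With columns=None, every column detected by find_nested_columns is
    resolved. Assumption: a row whose nested frame is empty is dropped.
    """
    if columns is None:
        columns = find_nested_columns(df)
    plain = [c for c in df.columns if c not in columns]
    base = df[plain]
    pieces = []
    for i in range(len(df)):
        # One-row frame holding the values that need no resolution.
        expanded = base.iloc[[i]].reset_index(drop=True)
        for col in columns:
            # Prefix nested column names to avoid clashes. The nested index
            # is discarded here; use reset_index() instead if row ids matter.
            nested = df[col].iloc[i].reset_index(drop=True).add_prefix(f"{col}.")
            # Cross-join so each combination of referenced rows is one flat row.
            expanded = expanded.merge(nested, how="cross")
        pieces.append(expanded)
    return pd.concat(pieces, ignore_index=True)


# Toy input: two region-like columns, each cell holding a nested DataFrame.
electrodes = [pd.DataFrame({"x": [1, 2]}), pd.DataFrame({"x": [3]})]
units = [pd.DataFrame({"label": ["p", "q"]}), pd.DataFrame({"label": ["r"]})]
table = pd.DataFrame({"id": [10, 20], "electrodes": electrodes, "units": units})

flat = resolve_nested_columns(table)                            # resolve all
partial = resolve_nested_columns(table, columns=["electrodes"])  # resolve one
```

Here `flat` has the columns `id`, `electrodes.x`, and `units.label`, while `partial` keeps `units` as a nested column. Whether a cross-join is the right semantics when a table has several region columns is one of the edge cases that would need to be decided.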

Possible Challenges

Checklist

oruebel commented 3 years ago

I added the "good first issue" label mainly because it's an issue that is focused on a particular area (i.e., DynamicTable) and remains mostly on the surface (i.e., requires using the API and building new features on top). However, this is not necessarily a trivial issue, as it requires some tricky data wrangling with lots of edge cases. That being said, it's a good issue for someone who wants to dive deeper into DynamicTable logic.

mavaylon1 commented 5 months ago

@oruebel Could you take this? If not, could you put together an example for me to run, and I will take it.

oruebel commented 5 months ago

Let's see if I can find time to work on this during the Dev Days. Otherwise, I'd leave it for Future for now, until we actually need this.