pyro-ppl / brmp

Bayesian Regression Models in Pyro
Apache License 2.0

Multidimensional datasets with xarray #46

Open eb8680 opened 4 years ago

eb8680 commented 4 years ago

Currently brmp expects data as pandas DataFrames, but for larger, higher-dimensional datasets it might be more convenient to specify formulas in terms of multidimensional xarray Dataset objects.

This would probably require substantial changes to brmp's code generation infrastructure, however, and is not a high priority.

neerajprad commented 4 years ago

Looking at http://xarray.pydata.org/en/stable/pandas.html, I think we should be able to swap pandas for xarray internally in the future.

null-a commented 4 years ago

> This would probably require substantial changes to brmp's code generation infrastructure,

The design matrix coding module is the only place we work with data frames, and I don't think we do much more than fetch columns by name, check column types (categorical, numeric, etc.), and fetch a list of levels present in a categorical column. So switching to working with xarray or some generic interface will hopefully be relatively painless.
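To make the "generic interface" idea concrete, here is a rough sketch of what such an interface might look like, covering only the three operations mentioned above (fetch a column by name, check whether it is categorical, list the levels present). Everything here is hypothetical: `ColumnSource`, `DictColumns` and `code_column` are names made up for illustration and are not part of brmp, and the indicator coding is a simplified stand-in, not brmp's actual coding scheme. A pandas or xarray backend would presumably just implement the same three methods.

```python
from typing import Any, Dict, List, Protocol, Sequence, Set


class ColumnSource(Protocol):
    """Hypothetical minimal interface the coding step would depend on."""

    def column(self, name: str) -> Sequence[Any]: ...

    def is_categorical(self, name: str) -> bool: ...

    def levels(self, name: str) -> List[Any]: ...


class DictColumns:
    """Toy backend (assumption for illustration): columns held in a dict of lists."""

    def __init__(self, columns: Dict[str, list], categorical: Set[str]):
        self._columns = columns
        self._categorical = categorical

    def column(self, name: str) -> Sequence[Any]:
        return self._columns[name]

    def is_categorical(self, name: str) -> bool:
        return name in self._categorical

    def levels(self, name: str) -> List[Any]:
        # Distinct values present in the column, in sorted order.
        return sorted(set(self._columns[name]))


def code_column(src: ColumnSource, name: str) -> List[List[float]]:
    """Simplified coding: numeric columns pass through as a single column,
    categorical columns get one indicator column per level."""
    values = src.column(name)
    if not src.is_categorical(name):
        return [[float(v)] for v in values]
    present = src.levels(name)
    return [[1.0 if v == level else 0.0 for level in present] for v in values]


# Usage: the coding function only ever sees the interface, never the backend.
data = DictColumns({"x": [1, 2, 3], "g": ["b", "a", "b"]}, categorical={"g"})
numeric_block = code_column(data, "x")
indicator_block = code_column(data, "g")
```

Since `code_column` only touches the three interface methods, swapping the backend would not require changes to the coding logic itself, which is the sense in which the switch could be relatively painless.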