lincc-frameworks / nested-pandas

Efficient Pandas representation for nested associated datasets.
https://nested-pandas.readthedocs.io
MIT License

nested-pandas

An extension of pandas for efficient representation of nested associated datasets.

Nested-Pandas extends pandas with tooling and support for nested dataframes packed into the values of top-level dataframe columns. PyArrow is used internally to improve scalability and performance.

Nested-Pandas is motivated by time-domain astronomy use cases, where data typically comes in two levels: information about astronomical objects, and an associated set of N measurements for each of those objects. Nested-Pandas offers a performant and memory-efficient package for working with this type of dataset.
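To make the two-level layout concrete, here is a minimal sketch written in plain pandas. It does not use the Nested-Pandas API, and it stores measurements as Python lists rather than PyArrow-backed nested dataframes. The table names, column names (`ra`, `dec`, `time`, `flux`, `mean_flux`), and values are all hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical top-level table: one row per astronomical object.
objects = pd.DataFrame(
    {"ra": [10.0, 20.0, 30.0], "dec": [-5.0, 0.0, 5.0]},
    index=pd.Index([0, 1, 2], name="object_id"),
)

# Hypothetical flat table: N measurements per object, keyed by object_id.
measurements = pd.DataFrame(
    {
        "object_id": [0, 0, 1, 1, 1, 2],
        "time": [1.0, 2.0, 1.0, 2.0, 3.0, 1.0],
        "flux": [5.0, 7.0, 3.0, 4.0, 5.0, 9.0],
    }
)

# Pack each object's measurements into single cells (one list per column),
# so the result keeps exactly one row per object.
packed = measurements.groupby("object_id")[["time", "flux"]].agg(list)
nested = objects.join(packed)

# Per-object computation over the packed measurements.
nested["mean_flux"] = nested["flux"].apply(lambda values: sum(values) / len(values))

print(nested)
```

Each row of `nested` now carries both the object-level columns and all of that object's measurements, which is the shape of data the package is designed to handle. The list-based packing above is only a conceptual stand-in for the nested representation the package provides.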

Core advantages being:

This is a LINCC Frameworks project - find more information about LINCC Frameworks here.

Acknowledgements

This project is supported by Schmidt Sciences.