johari / minicell

(wip) A rich visicalc dialect with new datatypes inside cells. Recalc or die. 🏴‍☠️

Dataframes #86

Closed johari closed 3 years ago

johari commented 4 years ago

CSV files and tables in the relational model are examples of dataframes.

Spreadsheets

Dataframes in spreadsheets facilitate

Dataframes and spreadsheets

Is there any paper or article that hints at first-class support for dataframes inside cells?

Related work

Streaming dataframes

Some data never stops. It arrives continuously in a constant, never-ending stream. This happens in financial time series, web server logs, scientific instruments, IoT telemetry, and more. Algorithms to handle this data are slightly different from what you find in libraries like NumPy and Pandas, which assume that they know all of the data up-front. It’s still possible to use NumPy and Pandas, but you need to combine them with some cleverness and keep enough intermediate data around to compute marginal updates when new data comes in. (https://matthewrocklin.com/blog/work/2017/10/16/streaming-dataframes-1)
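
To make the "intermediate data" idea concrete, here is a minimal sketch in plain Python (no NumPy or Pandas), assuming each batch arrives as a list of row dicts with numeric values. The class name RunningMean and its methods are made up for illustration and come from neither the linked post nor minicell. It keeps only a count and a sum per column, which is enough to update the mean whenever a new batch comes in.

```python
class RunningMean:
    """Hypothetical helper: per-column mean over a stream of batches.

    Only a running count and a running sum are stored per column,
    so earlier rows never need to be kept or revisited.
    """

    def __init__(self):
        self.counts = {}  # column name -> number of values seen so far
        self.totals = {}  # column name -> sum of values seen so far

    def update(self, batch):
        """Fold one batch (a list of row dicts) into the running state."""
        for row in batch:
            for column, value in row.items():
                self.counts[column] = self.counts.get(column, 0) + 1
                self.totals[column] = self.totals.get(column, 0.0) + value

    def mean(self):
        """Return the mean of every column over all batches seen so far."""
        return {c: self.totals[c] / self.counts[c] for c in self.totals}


stream = RunningMean()
stream.update([{"latency_ms": 10.0}, {"latency_ms": 20.0}])  # first batch
stream.update([{"latency_ms": 30.0}])                        # arrives later
print(stream.mean())
```

The same pattern (small state plus an update step) would presumably apply to a cell that holds a streaming dataframe: a recalc would only need to fold in the new rows.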

Apache Spark

DataSpread

Dataspread combines the intuitiveness and flexibility of spreadsheets and the scalability and power of databases (http://dataspread.github.io/)

Apache Arrow

cuDF

External links

johari commented 3 years ago

Closing this issue, as dataframes are not the main focus for now. I'm still curious about what a good set of core primitives for manipulating dataframes would be. Perhaps it'll be a set of formulas that mimic basic SQL queries and/or the pandas API.
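
As a rough sketch of what such primitives might look like, assuming a dataframe is just a list of row dicts: the function names select, where and group_by below are hypothetical and are not existing minicell formulas. They mirror SELECT, WHERE and GROUP BY with a single aggregate.

```python
from collections import defaultdict


def select(frame, columns):
    """Keep only the named columns, like SQL SELECT col1, col2."""
    return [{c: row[c] for c in columns} for row in frame]


def where(frame, predicate):
    """Keep only rows for which predicate(row) is true, like SQL WHERE."""
    return [row for row in frame if predicate(row)]


def group_by(frame, key, column, aggregate):
    """Group rows by `key` and reduce `column` with `aggregate`,
    like SQL SELECT key, AGG(column) ... GROUP BY key."""
    groups = defaultdict(list)
    for row in frame:
        groups[row[key]].append(row[column])
    return [{key: k, column: aggregate(values)} for k, values in groups.items()]


# Hypothetical data, for illustration only.
frame = [
    {"city": "Oslo", "temp": 4},
    {"city": "Oslo", "temp": 6},
    {"city": "Cairo", "temp": 30},
]

warm = where(frame, lambda row: row["temp"] > 5)
print(group_by(warm, "city", "temp", max))
print(select(frame, ["city"]))
```

Each primitive takes a frame and returns a new frame, so they compose the way nested formulas do in a spreadsheet.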