Closed westurner closed 1 year ago
A comparison of Ibis with Blaze + Dask would also be useful: https://blaze.pydata.org/
The Blaze ecosystem is a set of libraries that help users store, describe, query and process data. It is composed of the following core projects:
- Blaze: An interface to query data on different storage systems
- Dask: Parallel computing through task scheduling and blocked algorithms
- Datashape: A data description language
- DyND: A C++ library for dynamic, multidimensional arrays
- Odo: Data migration between different storage systems
@westurner blaze, DyND, odo, and datashape have long been abandonware :-<
A few interesting tools to compare to in 2023:
DataFrame APIs:
Libraries that may draw comparisons but are not DataFrame APIs:
Dataclasses are or could be like DataFrames, though they don't do columnar storage and so:
FWIW, dataclasses -> arrow/pandas takes less RAM than list(map(tuple, dataclasses_list))
https://pypi.org/project/pandas-dataclasses/
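To illustrate the row-oriented vs. columnar point above, here is a minimal stdlib-only sketch that pivots a list of dataclass instances into a dict of column lists. The `Point` dataclass and the `to_columns` helper are hypothetical names made up for this example; they are not part of Ibis, pandas, or pandas-dataclasses.

```python
from dataclasses import dataclass, fields


@dataclass
class Point:  # hypothetical record type, for illustration only
    x: int
    y: float


def to_columns(rows):
    """Pivot a list of dataclass instances into a dict of column lists."""
    if not rows:
        return {}
    # Field names are taken from the first row; all rows are assumed
    # to be instances of the same dataclass.
    names = [f.name for f in fields(rows[0])]
    return {name: [getattr(row, name) for row in rows] for name in names}


rows = [Point(1, 2.0), Point(3, 4.0)]
columns = to_columns(rows)
print(columns)
```

A dict of this shape is the kind of input that `pandas.DataFrame(columns)` or `pyarrow.table(columns)` accepts, which is where the actual columnar storage would happen.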
Since Ibis wraps computational engines, it doesn't really make sense to compare it to a bunch of different engines. We've added a "Why Ibis" page in #5958 that covers the points we think should be included in the docs. Closing.
Suggestions for updates to the homepage:
[ ] Add blaze / dask
SQLAlchemy is one backend that Ibis can compile expressions to.
https://github.com/ibis-project/ibis/blob/master/ibis/sql/alchemy.py
Current: http://ibis-project.org/
Suggested:
A different final paragraph on the first page of the docs might be more welcoming.