-
**TODOs:**
- ~~Have scripts organised in folders according to the language used to implement them~~
- Correctly organise the configuration files
- [POSTPONED] _Use Dataframe for now, then search for a m…
-
For each unique vertex and edge alias in a motif, GraphFrames currently wraps the entire vertex and edge dataframes, respectively, in aliased structs.
Spark's query planner operates on the entire c…
-
* https://levelup.gitconnected.com/100x-faster-data-processing-in-javascript-923215f34c00
* https://codesandbox.io/examples/package/nodejs-polars
* https://duckdb.org/docs/guides/python/polars.html
…
-
Not sure how valid this is (is this truly within the scope of this library?) or in what form this will rear its ugly head, but it would be neat to add some complementary functions for Spark. This is a…
-
Hi, I've just found this library and it seems great, but wanted to quickly double-check if it's applicable to my use case. Namely, I have a large amount of tabular data stored in Spark DataFrames (so …
-
Do you know about anyone working on the support for [Spark DataFrames](https://spark.apache.org/docs/latest/sql-programming-guide.html#datasets-and-dataframes)? Are there any public plans on doing so?…
-
This is a request about using CellPy on a cloud platform, and specifically for using Unity Catalog for data governance, which is useful for example if you want to use Databricks. Unity Catalog uses Ap…
-
Maybe I'm the outlier, but I consider the more intuitive check -- especially for testing purposes -- to ignore order. If some function produces a DataFrame that I want to check, I care about the conte…
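
To illustrate the order-insensitive check described above, here is a minimal sketch in plain Python. It assumes the rows have already been collected into hashable tuples; `rows_equal_ignoring_order` is a hypothetical helper name used only for this example, not a function of any library.

```python
from collections import Counter


def rows_equal_ignoring_order(actual, expected):
    """Return True if both iterables contain the same rows with the same
    multiplicities, regardless of order. Rows must be hashable (e.g. tuples)."""
    # Counter treats each iterable as a multiset, so duplicates still count.
    return Counter(actual) == Counter(expected)


# Same rows in a different order, including one duplicated row.
actual = [(2, "b"), (1, "a"), (1, "a")]
expected = [(1, "a"), (2, "b"), (1, "a")]
same = rows_equal_ignoring_order(actual, expected)
```

Comparing multisets rather than sets keeps duplicate rows significant, so a result that is missing one copy of a repeated row would still fail the check.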
-
It would be good to support some other aggregations beyond the ones that ship with default Spark DataFrames. Maybe look into a streaming median (or non-streaming median / quartiles?) as a place to sta…
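
As a starting point for the streaming median idea, here is a minimal single-process sketch using the two-heap technique. It is plain Python, not a Spark aggregation; the class name `StreamingMedian` is hypothetical, and the approach is exact but keeps every value in memory, which is an assumption that may not suit large datasets.

```python
import heapq


class StreamingMedian:
    """Exact running median using two heaps (keeps all values in memory)."""

    def __init__(self):
        self._low = []   # max-heap of the smaller half, stored as negated values
        self._high = []  # min-heap of the larger half

    def add(self, x):
        # Push onto the lower half, then move its largest element up,
        # so every value in _low is <= every value in _high.
        heapq.heappush(self._low, -x)
        heapq.heappush(self._high, -heapq.heappop(self._low))
        # Rebalance so _low has the same size as _high, or one more.
        if len(self._high) > len(self._low):
            heapq.heappush(self._low, -heapq.heappop(self._high))

    def median(self):
        if not self._low:
            raise ValueError("no values seen yet")
        if len(self._low) > len(self._high):
            return float(-self._low[0])
        return (-self._low[0] + self._high[0]) / 2.0


sm = StreamingMedian()
for value in [5, 1, 3]:
    sm.add(value)
current = sm.median()
```

Each `add` costs O(log n) and `median` is O(1). A distributed version would more likely need an approximate quantile sketch, since the heaps here cannot be merged cheaply across partitions.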
-
See https://stanford.edu/~rezab/papers/linalg.pdf
Wouldn't be much work to create a `sparkmatrix` extension with support for converting DataFrames into [IndexedRowMatrix](http://spark.apache.org/do…