-
### Describe the feature
Support Dask just as Spark is supported.
### Who will this benefit?
This will benefit realtime / web-request use cases where milliseconds matter. The same isomorphic Mac…
-
The fractional split feature of `Splitter` returns an undesired result when one tries to split a `pandas` dataframe with duplicated indices without passing any argument for `id_column`.
The following …
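
For context, here is a minimal sketch of one way duplicated index labels can skew a split in plain `pandas`. It is not taken from the `Splitter` source. The toy frame, the variable names, and the assumption that the remainder is computed by dropping the sampled rows' labels are all hypothetical.

```python
import pandas as pd

# Hypothetical toy frame: label 0 and label 1 each appear twice in the index.
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=[0, 0, 1, 1])

# Stand-in for a sampled part: take the first row by position.
train = df.iloc[[0]]

# Assumed mechanism: removing the sampled rows by label drops every row
# that shares the label, so the second row (also labelled 0) is lost too.
rest_by_label = df.drop(train.index)

# With a unique index, the same label-based removal keeps the split exact.
unique = df.reset_index(drop=True)
train_u = unique.iloc[[0]]
rest_u = unique.drop(train_u.index)

print(len(train), len(rest_by_label))  # the two parts no longer cover all rows
print(len(train_u), len(rest_u))       # the two parts cover all rows
```

If something like this is the cause, resetting the index before splitting (or passing an explicit `id_column`) would be one way around it.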
-
I'm using spark-bigquery-with-dependencies_2.11-0.21.1.jar and having trouble reading BigQuery data from Spark on a YARN cluster.
Pipeline:
BigQuery -> Spark 2.3.2 with HDP 3.1.5, Python 3.6 …
-
## Description
I think there's scope to create a series of data connectors that would allow Kedro users to connect to Snowflake in different ways. This usage pattern was identified in the kedro-org/k…
-
Hello,
Posting this from github (master @wesm asked for it :) )
```python
import pandas as pd
import numpy as np
import pyarrow.parquet as pq
import pyarrow as pa
idx = pd.date_…
-
## Description
Kedro-viz supports Plotly.
Plotly has cool tables: https://plotly.com/python/table/
The idea is simply to show the first 5 or 10 rows of the dataset in Kedro-viz.
### Implementa…
-
**Describe the bug**
When using a SchemaModel on a pyspark dataframe with the config option `strict = "filter"` set, a `TypeError: drop() got an unexpected keyword argument 'inplace'` is raised.
-…
-
This is a feature request / discussion issue to outline some problems we are having on RAPIDS cuML and, hopefully, converge on a good solution.
**Problem:** Dask Arrays & Dataframes are assumed to …
-
## Expected behavior
I want to run a Distance Join Query between two dataframes, so I followed the [documentation](https://datasystemslab.github.io/GeoSpark/tutorial/geospark-core-python/#write-a-distance-j…
-
xref https://github.com/pandas-dev/pandas/pull/28135#issuecomment-524659775
Do we want to make pandas PEP 561 compatible?
https://mypy.readthedocs.io/en/latest/installed_packages.html#making-pep…