-
I was working on a datasource plugin recently and realized that while we have some [really great documentation on dataframes](https://grafana.com/docs/grafana/latest/developers/plugins/data-frames/) t…
-
**Describe the problem you faced**
When trying to use the [observe](https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.DataFrame.observe.html) function on datafra…
-
The main advantages of using this over Python would be speed and the possibility of making it completely type-safe. If the user knows the rows/columns ahead of time (which is most likely the case) I woul…
-
I found a few cases where datacompy reports mismatches in Spark dataframe comparisons when the data is not sorted (using v0.11.3).
Cases where it reports mismatches when it shouldn't:
1. Column h…
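
To make the expectation concrete, here is a minimal, library-agnostic sketch of an order-insensitive comparison. It is not datacompy's implementation and does not use Spark. The function name `rows_match_unordered` and the sample rows are hypothetical, and it assumes every row is a hashable tuple of values.

```python
from collections import Counter


def rows_match_unordered(rows_a, rows_b):
    """Return True when both tables hold the same rows, ignoring row order."""
    # Counter treats each table as a multiset, so duplicate rows still count.
    return Counter(map(tuple, rows_a)) == Counter(map(tuple, rows_b))


# Hypothetical sample data: same rows, different order.
left = [(1, "a"), (2, "b"), (2, "b")]
right = [(2, "b"), (1, "a"), (2, "b")]

print(rows_match_unordered(left, right))
```

A comparison along these lines should report no mismatch for `left` and `right`, because only the row order differs.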
-
### Checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the [latest version](https://pypi.org/project/polars/) of Polars.
### Reprodu…
-
- Preprocessing
- Parameter extraction
- Quality assessment
- Output tables (Parameters to dataframe, quality to dataframes, ...)
- Check output table integrity (number of runs/setting, ...)
- S…
-
**Is your feature request related to a problem? Please describe.**
When the data size is quite large, we often need to work with larger-than-RAM data. Also, using an engine like Polars will spee…
-
### System Info
OS version: Windows 11 Pro
Python version: 3.9
The current version of pandasai being used: 2.2
### 🐛 Describe the bug
Hi @gventuri, hope this message finds you well.
I am here …
-
Currently, data frames can be split randomly.
I would like the following types of splits:
1. Row-based data frame split/reweighting
a. By column (i.e. split the data frame by where column m…
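
As a rough illustration of a column-based row split, the sketch below partitions rows by a condition on one column. It is plain Python rather than any particular data frame library, and everything in it is an assumption made for the example: the helper name `split_rows_by_column`, the `group` and `label` columns, and the threshold predicate.

```python
def split_rows_by_column(rows, column, predicate):
    """Split rows into (matching, rest) according to predicate(row[column])."""
    matching, rest = [], []
    for row in rows:
        # Route each row by testing only the chosen column's value.
        (matching if predicate(row[column]) else rest).append(row)
    return matching, rest


# Hypothetical rows; a real data frame would supply these.
rows = [
    {"group": 1, "label": "a"},
    {"group": 5, "label": "b"},
    {"group": 10, "label": "c"},
    {"group": 3, "label": "d"},
]

high, low = split_rows_by_column(rows, "group", lambda value: value >= 5)
print([row["label"] for row in high], [row["label"] for row in low])
```

Passing the predicate as an argument keeps the split rule separate from the splitting mechanics, so other column-based rules could reuse the same helper.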