-
### Am I using the newest version of the library?
- [X] I have made sure that I'm using the latest version of the library.
### Is there an existing issue for this?
- [X] I have searched the existin…
-
### Apache Iceberg version
1.4.3
### Query engine
Spark
### Please describe the bug 🐞
Spark config and code:
```
iceberg_rest = {
"spark.sql.extensions": "org.apache.iceberg.spa…
```
-
## Expected behavior
I want to use Apache Sedona with PySpark in an AWS Glue environment.
## Actual behavior
The Sedona library does not work when following the steps described in the docs: https:…
-
Hi! When I use PySpark to read and save a query with a limit of 1 million rows, everything works fine. However, when I raise the limit to, for example, 10 million rows, I get this error on the same quer…
-
[seniorityMapping&seniorityLevelDefinitions_V8.xlsx](https://github.com/abhishekrk/test/files/1237287/seniorityMapping.seniorityLevelDefinitions_V8.xlsx)
[seniorityMapping&seniorityLevelDefinitions…
-
I am trying to run CatBoost on PySpark, but the machine I am running the code on has no internet access, so I cannot use the ```spark.jars.packages``` config. Instead, I downloaded the JAR file (catboost-spark_2.12-…
-
Hi, I have gone through the tutorial and would like to try PySpark on HDFS. I notice that PySpark is pre-installed (2.0.x), but it doesn't support the pre-installed Python (version 2.6.6). To make it work,…
-
### Problem Description
### Expected behavior
### Additional context
-
My team wants to use PySpark in Kubeflow pipeline nodes.
This PySpark pipeline node communicates with a completely independent MinIO instance and runs ANSI SQL commands against it.
When we create…
-
### Problem Description
SDV is AWESOME! It is also one of the very few players in this space able to handle multi-table data.
However, it is quite limited with sklearn as a backend. What would it tak…