-
Hello guys,
Before I begin, I want to thank you for this amazing tool. It's truly awesome and enables true distributed deep learning out of the box. Now for some questions.
1. From what I've seen I …
-
**Is your feature request related to a problem? Please describe.**
Many sources for spatial data are formatted and provided as file geodatabases (.gdb). Currently, there are no spark-native utilitie…
-
TL;DR:
**How does one zero-copy convert a PySpark dataframe to a Modin dataframe?**
I am currently searching for a way to manipulate PySpark dataframes without materializing them as a Pandas dataf…
-
### Apache Iceberg version
1.0.0
### Query engine
Spark
### Please describe the bug 🐞
Hi team,
I'm currently using Apache Spark 3.3 with apache-iceberg 1.0.0, AWS S3, and GlueCatalog-integra…
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Current Behavior
When running the v2 excel PySpark code below in Databricks 11.3 LTS Runtime:
df = spark.…
-
**Query/Question**
I am looking to perform operations across databases and containers to process a large data dump.
Here is the situation:
1. I receive a data dump (large, with millions of records)…
-
The function should append the data if the `append_df` has a schema that matches the `df` exactly.
If the schema doesn't match exactly, then it should error out.
This "schema safe append" could …
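
The behavior described above could be sketched roughly as follows. This is only an illustration, not a proposed final implementation: the function name `schema_safe_append` and the choice of `ValueError` are assumptions. It relies only on the `schema` attribute and `union` method that PySpark DataFrames expose.

```python
def schema_safe_append(df, append_df):
    """Append append_df to df only when both schemas match exactly.

    Hypothetical helper: the name and the exception type are assumptions.
    For PySpark DataFrames, `schema` is a StructType, so equality also
    covers column order, data types, and nullability.
    """
    if df.schema != append_df.schema:
        # Error out instead of appending mismatched data.
        raise ValueError(
            "Schema mismatch: expected "
            f"{df.schema!r}, got {append_df.schema!r}"
        )
    # union() matches columns by position, which is safe here because
    # the schemas were just verified to be identical.
    return df.union(append_df)
```

With two PySpark DataFrames, `schema_safe_append(df, append_df)` would return the combined DataFrame, or raise before any rows are appended if a column name, type, or nullability flag differs.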
-
Runtime storage layer currently has an abstraction similar to the platform data service component. However, Apache Arrow already has its own file system abstraction with built-in support for AWS, …
-
**Is your feature request related to a problem? Please describe.**
A common question we get is whether Hamilton supports Spark dataframes. The answer is yes, but it's not ideal at the moment, and we don…
-
### Describe the feature
Add a fourth connection option to the dbt-adapter. This fourth connection would create a pyspark context and utilize the `spark.sql()` function to execute sql statements.
##…