spark-dataframes Search Results

1000+ results
for spark-dataframes

maxpumperla/elephas #15

Massive dataset + data_generator

Hello guys, before I begin, I want to thank you for this amazing tool. It's truly awesome and enables true distributed deep learning out of the box. Now for some questions. 1. From what I've seen I …

AntreasAntoniou updated 1 year ago
21
databrickslabs/mosaic #141

New Feature - Add Support for File Geodatabase IO

**Is your feature request related to a problem? Please describe.** Many sources for spatial data are formatted and provided as file geodatabases (.gdb). Currently, there are no spark-native utilitie…

armckinney updated 1 year ago
1
oap-project/raydp #86

raydp.modin module to integrate Modin?

TL;DR: **How does one zero-copy convert a PySpark dataframe to a Modin dataframe?** I am currently searching for a way to manipulate PySpark dataframes without materializing them as a Pandas dataf…

Hoeze updated 1 year ago
7
apache/iceberg #6388

Spark Structured Streaming - Cannot invoke "org.apache.icebe…

### Apache Iceberg version 1.0.0 ### Query engine Spark ### Please describe the bug 🐞 Hi team, I'm currently using Apache Spark 3.3 with apache-iceberg 1.0.0 and AWS S3 and GlueCatalog-integra…

ottensjors updated 1 year ago
1
nightscape/spark-excel #682

[BUG] Cannot read files into dataframe in Databricks 11.3 LT…

### Is there an existing issue for this? - [X] I have searched the existing issues ### Current Behavior When running v2 excel pySpark code below in Databricks 11.3 LTS Runtime: df = spark.…

james-miles-ccy updated 1 year ago
12
Azure/azure-sdk-for-java #30403

Is there a way to perform batch operations across databases …

**Query/Question** I am looking to perform operations across databases and containers to process a large data dump. Here is the situation: 1. I receive a data dump (large, with millions of records)…

bhattacharyyasom updated 1 year ago
6
mrpowers-io/quinn #50

Create some "schema safe append" functionality

The function should append the data if the `append_df` has a schema that matches the `df` exactly. If the schema doesn't match exactly, then it should error out. This "schema safe append" could …

MrPowers updated 1 year ago
17
finos/tracdap #230

Use Apache Arrow for runtime storage layer

The runtime storage layer currently has an abstraction similar to the platform data service component. However, Apache Arrow already has its own file system abstraction with built-in support for AWS, …

martin-traverse updated 1 year ago
1
stitchfix/hamilton #84

Show pyspark dataframe support

**Is your feature request related to a problem? Please describe.** A common question we get is: does Hamilton support Spark dataframes? The answer is yes, but it's not ideal at the moment, and we don…

skrawcz updated 1 year ago
1
dbt-labs/dbt-spark #305

[CT-431] Support for Pyspark driver

### Describe the feature Add a fourth connection option to the dbt-adapter. This fourth connection would create a PySpark context and use the `spark.sql()` function to execute SQL statements. ##…

cccs-jc updated 1 year ago
22
