-
Several Databricks fields are effectively maps, such as `custom_tags`, `default_tags`, `spark_conf`, `env_vars`, etc. These key-value pairs first land in the bronze layer as a struct where each key is …
-
#### What language are you using?
Python
#### What version of polars are you using?
0.13.21
#### What operating system are you using polars on?
Ubuntu 20.04.1 LTS
#### What language …
-
## Background [Optional]
We are trying to visualize the lineage (metadata) of dataframes produced by Spark. For this we have created a Spark job (the code is below).
## Question
We managed to read …
-
The solution the team provided for the issue of not being able to execute other cells after cancelling a cell (which stays "frozen" for a long time) is not a good enough solution,…
-
I am using Databricks with a Delta Lake in the background. The Databricks runtime is 10.2, with sparklyr 1.7.2.
```r
sc %>% … %>% filter(yearMonth >= 201801) %>% filter(yearMonth < 202202)
```
is much faster…
-
Spark 2.3 introduced a `repartitionByRange` option on dataframes. This could be used to improve the efficiency of `SortFullGroup` in the Parquet store (possibly avoiding the need to use RDDs, which co…
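The range-partitioning idea behind `repartitionByRange` can be sketched in plain Python (a toy illustration only, not Spark's implementation: Spark samples the data to choose boundaries, whereas the even quantile split below is an assumption for brevity):

```python
# Sketch of range partitioning: rows are assigned to partitions by
# comparing the partition key against sorted boundary keys, so each
# partition holds a contiguous, roughly equal-sized key range. This
# is what lets a later per-partition sort produce globally ordered
# output without a full shuffle sort.
import bisect

def range_partition(rows, key, num_partitions):
    """Assign each row to a partition based on sorted key boundaries."""
    keys = sorted(key(r) for r in rows)
    step = len(keys) / num_partitions
    # num_partitions - 1 boundary keys at evenly spaced ranks
    # (Spark instead derives these from a sample of the data).
    boundaries = [keys[int(step * i)] for i in range(1, num_partitions)]
    partitions = [[] for _ in range(num_partitions)]
    for r in rows:
        partitions[bisect.bisect_right(boundaries, key(r))].append(r)
    return partitions

rows = [{"id": i} for i in (5, 1, 9, 3, 7, 2, 8, 4, 6, 0)]
parts = range_partition(rows, key=lambda r: r["id"], num_partitions=2)
# parts[0] holds ids below the boundary, parts[1] the rest.
```

Because each partition covers a disjoint key range, sorting within partitions yields a total order, which is the property a grouped sort like `SortFullGroup` would rely on.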
-
Using Spark version 3.2.1
Using Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 11.0.14.1)
I load the XML files below. The first establishes the schema and the second contains the actual insta…
-
## Background
I have all of the pieces set up: the Spline web UI and ArangoDB.
I am trying to run Spline locally with SBT/IntelliJ through a unit test and am getting the error below:
20/07/13 20:20:11 ERROR Spark…
-
**What is your question?**
This is a question for the Spark team of RAPIDS. As part of the cuIO refactor, we (the RAPIDS cuDF team) are currently working on adding fuzz-testing coverage for our Avro reader…
-
Hi everyone,
I wrote a data-processing job in a Jupyter Notebook (SageMaker) with the awswrangler library. This code works perfectly in that environment, but when I try to run it on Glue, the code fini…