-
Parsing a column containing invalid JSON into StructType with a schema throws an exception.
# Repro:
```
:~$ $SPARK_HOME/bin/spark-shell --master local[*] --jars ${SPARK_RAPIDS_PLUGIN_JAR}
--c…
```
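For context, Spark's `from_json` is generally expected to yield `null` for rows that fail to parse (in the default `PERMISSIVE` mode) rather than abort the stage. A minimal stdlib sketch of that contract, where `safe_parse` is a hypothetical helper and not part of any Spark API:

```python
import json

def safe_parse(s):
    """Hypothetical helper: mimic a PERMISSIVE-style parse that yields
    None for unparseable input instead of raising."""
    try:
        return json.loads(s)
    except (json.JSONDecodeError, TypeError):
        return None

rows = ['{"a": 1}', '{not valid json', None]
print([safe_parse(r) for r in rows])  # → [{'a': 1}, None, None]
```

The reported bug is effectively the opposite behavior: the invalid row raises instead of degrading to a null result.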
-
**Describe the problem you faced**
I'm observing failed Spark stages
`"Doing partition and writing data" in the "SparkUpsertCommitActionExecutor" job group.`
Effectively it can no longer insert…
kazdy updated 2 months ago
-
When I run `ParquetWriter.getDataSize()`, it works normally, but after I call `ParquetWriter.close()`, subsequent calls to `ParquetWriter.getDataSize()` throw a `NullPointerException`.
```
java.lang.Nu…
```
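One common way libraries avoid this class of bug is to cache the final size at `close()` instead of reading it from internal state that `close()` tears down. A toy Python sketch of that pattern (`SizedWriter` is hypothetical, not the Parquet API):

```python
class SizedWriter:
    """Toy illustration (not the Parquet API): cache the final size at
    close() so size queries remain valid after internal state is released."""

    def __init__(self):
        self._buf = bytearray()
        self._final_size = None  # set once close() runs

    def write(self, data: bytes) -> None:
        if self._final_size is not None:
            raise ValueError("writer is closed")
        self._buf.extend(data)

    def get_data_size(self) -> int:
        # After close(), answer from the cached value instead of the
        # released buffer -- the missing guard behind the reported NPE.
        if self._final_size is not None:
            return self._final_size
        return len(self._buf)

    def close(self) -> None:
        self._final_size = len(self._buf)
        self._buf = bytearray()  # state released, as after ParquetWriter.close()

w = SizedWriter()
w.write(b"abcd")
w.close()
print(w.get_data_size())  # → 4
```

Until such a guard exists upstream, the practical workaround is to read the size before calling `close()` and keep it yourself.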
-
### Describe the enhancement requested
`pyarrow.dataset.write_dataset(compression='lz4_raw')` currently fails with:
```
Traceback (most recent call last):
  File "/work/projects/lisa/testpyarrow…
```
-
```
testDoubleNaNInfinity(io.prestosql.plugin.hive.parquet.TestFullParquetReader) Time elapsed: 0.424 s
```
-
### **Background:**
We are leveraging AWS Security Lake to ingest various log sources into OCSF, make this data queryable via AWS Athena, and also ingest it into AWS OpenSearch. We are …
-
**Describe the problem you faced**
We have a Flink job writing to Hudi; HDFS jitter causes Flink tasks to fail over, and we see this error:
**To Reproduce**
Steps to reproduce the behavior:
*have ch…
-
If you see the following error in Anypoint Studio after you add the dependency to your pom.xml file, you need to edit the pom.xml entry for the audience-annotations artifact.
You can find that here: `/…
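A hedged illustration of the kind of pom.xml edit meant here, assuming the fix is to exclude the conflicting `audience-annotations` artifact; the `org.apache.yetus` coordinates and the enclosing dependency are assumptions for illustration, not taken from the original text:

```xml
<!-- Hypothetical: exclude audience-annotations from whichever dependency drags it in -->
<dependency>
  <groupId>org.apache.parquet</groupId>
  <artifactId>parquet-hadoop</artifactId>
  <version>1.13.1</version>
  <exclusions>
    <exclusion>
      <groupId>org.apache.yetus</groupId>
      <artifactId>audience-annotations</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```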
-
We are unable to read a Parquet file that was created via DuckDB; Polars, ClickHouse, DataFusion, PySpark, and PyArrow _are_ able to read this file. Upon further investigation, it appears to be a …
-
### Query engine
I am using [Parquet4S](https://github.com/mjakubowski84/parquet4s) + Hadoop-AWS
Parquet4S is a wrapper around the Apache Parquet library and allows reading/writing of Parquet data to ob…