-
### Describe the bug, including details regarding any error messages, version, and platform.
So far I've only been able to reproduce this case with `pyspark` but I think the bug is probably on the …
-
This is available already at the file level:
https://github.com/apache/parquet-cpp/blob/master/src/parquet/file/metadata.h#L177
but not at the ColumnChunk level
**Reporter**: [Wes McKinney](https:/…
-
I'm using parquet on Windows 10 and I have two different parquet files for testing: one is snappy-compressed, the other is uncompressed.
Simple test code for reading:
with open(filename,'r') as …
-
**Describe the problem you faced**
Original issue: https://github.com/trinodb/trino/issues/15368
> Our team is testing the same on COPY ON WRITE HUDI (0.10.1) tables with metadata enabled at vers…
-
I have a high level design question concerning using text as a serialized representation of array metadata. In my opinion, it is not the best choice as a primary representation. Let me explain why
…
-
### Spark-Bench version (version number, tag, or git commit hash)
spark-bench_2.3.0_0.4.0-RELEASE
### Details of your cluster setup (Spark version, Standalone/Yarn/Local/Etc)
Centos 7.4
HDP-2…
-
Is it not possible to use standard dplyr operations within a spark_apply function?
Please let me know if below example works for anyone else or if there's an obvious mistake in my code.
Thanks!
…
-
### Backend
VL (Velox)
### Bug description
Jar used: **gluten-velox-bundle-spark3.2_2.12-centos_7-1.0.0.jar**
The HDFS cluster I am using has Kerberos authentication enabled, and I configured it as the documentation describes
--conf spark.executorEnv.LIBHDFS3_CONF="hdfs-…
-
### What happened + What you expected to happen
Using PyArrow fs with HDFS works fine outside a ray session:
```
file_sys, file_path = pyarrow.fs.FileSystem.from_uri(hdfs_folder)
file_infos = fi…
-
I created a policy that pulls messages from a Kafka topic, extracts one field out of each message, and pushes it to a parquet file. The policy was created without problems, but it does not run successfully. Below is the error and a…