-
`TransCompressionCommand` in parquet-tools is intended to allow translating the compression codec of Parquet files. We intend to use this functionality to debug a corrupted file, but this comma…
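For reference, the same trans-compression idea can be sketched outside parquet-tools with pyarrow; this is not the `TransCompressionCommand` implementation, and the file names and codec are made up:

```python
# Minimal stand-in for trans-compression: read a Parquet file (decompressing
# with its original codec) and rewrite it under a different codec.
import pyarrow.parquet as pq

def transcode(src: str, dst: str, codec: str = "snappy") -> None:
    table = pq.read_table(src)                     # decode with existing codec
    pq.write_table(table, dst, compression=codec)  # re-encode with the new one

transcode("corrupted.parquet", "recoded.parquet", codec="zstd")
```

Note that reading a genuinely corrupted file may fail at the `read_table` step, which is presumably what such a debugging exercise is meant to surface.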
-
Snowflake tables with date and timestamp columns (with or without timezone), when synced back to Dataiku via `sync_snowflake_to_hdfs`, are imported as `int` and `bigint` respectively. This appears to be …
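This is consistent with Parquet's physical encodings: DATE is stored as an int32 count of days since 1970-01-01 and TIMESTAMP as an int64 tick count, so a reader that drops the logical-type annotation surfaces raw integers. A hedged sketch of decoding such values by hand (the microsecond unit is an assumption; it could equally be millis or nanos depending on the writer):

```python
# Decode raw Parquet DATE/TIMESTAMP integers once the logical-type
# annotation has been lost. Sample values are illustrative.
import datetime as dt

def decode_date(days: int) -> dt.date:
    # Parquet DATE: int32 days since the Unix epoch.
    return dt.date(1970, 1, 1) + dt.timedelta(days=days)

def decode_timestamp_micros(micros: int) -> dt.datetime:
    # Assumes TIMESTAMP_MICROS; adjust the unit if the writer used millis/nanos.
    return dt.datetime(1970, 1, 1) + dt.timedelta(microseconds=micros)

print(decode_date(19723))                        # 2024-01-01
print(decode_timestamp_micros(1704067200000000)) # 2024-01-01 00:00:00
```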
-
This might sound crazy, but I still wanted to propose a feature request about parquet files.
You might ask why? Parquet files are becoming more widespread and might even be considered "the new …
-
Current formats:
* Numpy matrix + parquet for IDs (ordered collections)
* Parquet with embeddings + id

Numpy + parquet:
Benefits:
* Fast to read the numpy matrix alone
* Fast to read the parquet alone
Drawbac…
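For concreteness, here is a minimal sketch of the two layouts under comparison; the paths, shapes, and column names are made up:

```python
# Layout 1: numpy matrix + a parquet file holding IDs in matching row order.
# Layout 2: a single parquet file with id + embedding as a list<float> column.
import numpy as np
import pyarrow as pa
import pyarrow.parquet as pq

ids = ["a", "b", "c"]
emb = np.random.rand(3, 4).astype(np.float32)

# Layout 1: two files, coupled only by row order.
np.save("embeddings.npy", emb)
pq.write_table(pa.table({"id": ids}), "ids.parquet")

# Layout 2: one self-contained file, but embeddings must be list-encoded.
table = pa.table({"id": ids, "embedding": [row.tolist() for row in emb]})
pq.write_table(table, "embeddings.parquet")
```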
-
### Bug description
Expected behavior:
Able to read a parquet file whose array-typed column contains 30000 empty arrays. Both parquet-tools and the Presto parquet reader are able to read the file
```
…
```
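As a hedged repro sketch (the column name and element type are assumptions, since the original schema is truncated above), a file of this shape can be generated with pyarrow:

```python
# Build a single list<int32> column holding 30000 empty lists, write it
# to Parquet, and read it back — the step that reportedly fails.
import pyarrow as pa
import pyarrow.parquet as pq

col = pa.array([[]] * 30000, type=pa.list_(pa.int32()))
pq.write_table(pa.table({"values": col}), "empty_arrays.parquet")

table = pq.read_table("empty_arrays.parquet")
assert table.num_rows == 30000
```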
-
Hi
I'm trying to write Avro messages to Parquet on GCS. These Parquet files should be queryable by the BigQuery engine, which now supports Parquet.
To do this I'm using Secor, a Kafka log persister tool from Pinter…
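Independent of Secor's internals, the core conversion step looks roughly like the following sketch, assuming fastavro and pyarrow; the input file name is a placeholder, and the final GCS upload is out of scope here:

```python
# Standalone sketch of the Avro -> Parquet step (not Secor's actual pipeline):
# decode Avro records with fastavro, then write them out as Parquet.
import fastavro
import pyarrow as pa
import pyarrow.parquet as pq

with open("messages.avro", "rb") as f:
    records = list(fastavro.reader(f))   # list of dicts, one per Avro record

table = pa.Table.from_pylist(records)
pq.write_table(table, "messages.parquet")  # then upload this object to GCS
```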
-
### Checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the [latest version](https://pypi.org/project/polars/) of Polars.
### Re…
-
I'm attempting to aggregate records by id as they are processed from SQS via Lambda into S3.
I do get a merged file uploaded to S3, as I can see the file size increasing each time the Lambda runs, …
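Since a Parquet object on S3 cannot be appended in place, each Lambda invocation has to read, merge, and rewrite the whole object. A minimal sketch of that cycle, with hypothetical bucket and key names and without the per-id aggregation step:

```python
# Read-merge-rewrite cycle for a single merged Parquet object on S3.
import io
import boto3
import pyarrow as pa
import pyarrow.parquet as pq

s3 = boto3.client("s3")
BUCKET, KEY = "my-bucket", "merged/records.parquet"

def merge_into_s3(new_records: pa.Table) -> None:
    try:
        body = s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read()
        existing = pq.read_table(io.BytesIO(body))
        merged = pa.concat_tables([existing, new_records])
    except s3.exceptions.NoSuchKey:
        merged = new_records        # first run: nothing to merge yet
    buf = io.BytesIO()
    pq.write_table(merged, buf)
    s3.put_object(Bucket=BUCKET, Key=KEY, Body=buf.getvalue())
```

If the goal is aggregation by id rather than plain concatenation, a grouping step would be needed after the `concat_tables` call; the sketch above only explains why the file size grows on every run.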
-
Would it make sense to introduce support for `avro` schemas for `TypedDataSet`?
The current code defines the schema based on the `SparkSQL` "language": https://github.com/typelevel/frameless…
-
### Problem Statement
CrateDB's current export functionality is limited to the JSON format, which loses type information and handles data suboptimally. The COPY TO command lacks support…
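Until COPY TO supports other formats, one client-side workaround is to pull rows through CrateDB's Python DB-API client and write Parquet locally. A sketch assuming the `crate` client and pyarrow, with a hypothetical connection URL and table name:

```python
# Export a CrateDB table to Parquet on the client side, preserving types
# instead of round-tripping through JSON.
from crate import client
import pyarrow as pa
import pyarrow.parquet as pq

conn = client.connect("http://localhost:4200")
cur = conn.cursor()
cur.execute("SELECT * FROM my_table")
columns = [d[0] for d in cur.description]   # column names from the cursor
rows = cur.fetchall()

table = pa.table({c: [r[i] for r in rows] for i, c in enumerate(columns)})
pq.write_table(table, "my_table.parquet")   # types are carried in the schema
```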