replikeit opened 1 day ago
Parquet example:
❯ parquet meta ~/Downloads/tokens_parquet2_ethereum_tokens_000000000000.parquet
File path: /Users/alinaglumova/Downloads/tokens_parquet2_ethereum_tokens_000000000000.parquet
Created by: parquet-cpp-arrow version 13.0.0
Properties: (none)
Schema:
message schema {
  required binary address (STRING);
  optional binary symbol (STRING);
  optional binary name (STRING);
  optional binary decimals (STRING);
  optional binary total_supply (STRING);
  required int64 block_timestamp (TIMESTAMP(MICROS,false));
  required int64 block_number;
  required binary block_hash (STRING);
}
Row group 0: count: 1068 159.64 B records start: 4 total(compressed): 166.502 kB total(uncompressed): 166.502 kB
--------------------------------------------------------------------------------
type encodings count avg size nulls min / max
address BINARY _ _ R 1068 47.51 B 0 "0x005c97569a24303e9ba6de6..." / "0xffffe5b9cb42b4996997c92..."
symbol BINARY _ _ R 1068 6.21 B 10 "" / "��"
name BINARY _ _ R 1068 10.27 B 10 "" / "����������"
decimals BINARY _ _ R 1068 0.47 B 65 "0" / "9"
total_supply BINARY _ _ R 1068 6.24 B 9 "0" / "9999999999999999999900000..."
block_timestamp INT64 _ _ R 1068 9.32 B 0 "2024-02-08T15:50:47.000000" / "2024-09-11T06:57:23.000000"
block_number INT64 _ _ R 1068 9.32 B 0 "19184445" / "20725728"
block_hash BINARY _ _ R 1068 70.31 B 0 "0x0002376d87ff1bbe5310679..." / "0xffae2542617a1ee9204fb27..."
What version of the Stream Reactor are you reporting this issue for?
Release 8.1.4
Are you running the correct version of Kafka/Confluent for the Stream Reactor release?
I am running on Aiven Apache Kafka 3.8.0. My Kafka Connect is deployed using Strimzi on Kubernetes.
Do you have a supported version of the data source/sink, e.g. Cassandra 3.0.9?
Yes, I am using GCS (Google Cloud Storage) as the data source and Kafka as the sink.
Have you read the docs?
Yes, I have read the documentation.
What is the expected behaviour?
I expect the connector to transfer Parquet files from GCS to a Kafka topic.
What was observed?
I encountered the following error:
java.io.EOFException: Reached the end of stream with 8861 bytes left to read
What is your Connect cluster configuration (connect-avro-distributed.properties)?
What is your connector properties configuration (my-connector.properties)?
Please provide full log files (redact any sensitive information)