Closed prasburst closed 4 months ago
Hi,
This is done by iceburst, which is one of our core value propositions.
From: Sutou Kouhei @.> Sent: Saturday, February 10, 2024 10:07:09 PM To: apache/arrow-site @.> Cc: prasburst @.>; Author @.> Subject: Re: [apache/arrow-site] Add iceburst to powered by list (PR #474)
@kou commented on this pull request.
In powered_by.md (https://github.com/apache/arrow-site/pull/474#discussion_r1485483982):
@@ -129,6 +129,10 @@ short description of your use case.
 natural language processing, and tabular tasks. Dataset objects are wrappers around Arrow Tables and memory-mapped from disk to support out-of-core parallel processing for machine learning workflows.
+* [iceburst][53]: A real-time data lake for monitoring and security built
+  directly on top of Amazon S3. Our approach is simple: ingest the OpenTelemetry data in an S3 bucket as
+  Parquet files in Iceberg table format and query them using DuckDB with millisecond retrieval and zero egress cost.
+  Parquet is converted to Arrow format in-memory enhancing both speed and efficiency.
Is this done by DuckDB or iceburst? If you mean that DuckDB does it, it may be wrong. I think that DuckDB doesn't use Apache Arrow as its internal data format.
Does iceburst use DuckDB's Arrow integration feature https://duckdb.org/2021/12/03/duck-arrow.html ?
Yes, the zero-copy integration makes a lot of this work easy.
We export the query results to an Arrow table using the arrow function. In some cases, especially for aggregation queries made using DuckDB's relational API, we use the to_arrow_table function to export the query results and keep everything in Arrow format in memory.
Here's a reference to Arrow export: https://duckdb.org/docs/guides/python/export_arrow
Included details about iceburst.io