Open fpj opened 3 years ago
@fpj yes, we have plans to add iceberg. can you share your use case a bit more?
Hi @dborkar, I'm looking forward to seeing some activity around it. I don't have a specific use case. I work on storage systems and I have been working on connectors for Presto. We will most likely start a conversation with the community soon about our work.
That's great to hear. Once we have an early design, we'll post it for feedback as well. @fpj, I'm interested in hearing about the connectors you are building.
Tagging a few more people here who have done a lot of work on connectors / plugins: @zhenxiao @beinan @highker @ashishtadose
BTW, @fpj jump on Presto slack http://slack.prestodb.io/ - this is different from TrinoDB.
Ok, will do. I'm wondering about one thing: should I close this issue? My question has been answered, but it might be a good idea perhaps to repurpose the issue or create another issue to track the Iceberg work? How do you do these things in this community?
@fpj it's fine to keep it open so that the community can share further updates here.
@fpj we're getting this switched to say Trino, to avoid any confusion about whether Presto has this connector yet. It's been merged but the site hasn't been rebuilt yet.
For those looking for Iceberg support now, it exists in what used to be called PrestoSQL, which has recently been renamed to Trino: https://trino.io/blog/2020/12/27/announcing-trino.html.
Join our slack if you have any further questions: https://trino.io/slack.html
@bitsondatadev Thanks for the input. That was my understanding from the beginning, but according to the post, the community has split and diverged. I'm currently working off this repository, which is why I'm interested in Iceberg support for PrestoDB, not Trino.
Understood, @fpj. We just want that to be clear if anyone is looking for Iceberg support.
Tagging @ChunxuTang. He is working on the Iceberg connector, I think.
Yeah, I'm working with @zhenxiao on the iceberg connector. Will send a PR for a review.
Looking forward to the PR, @ChunxuTang.
@fpj Thanks for raising the issue. Turned out multiple efforts were underway to integrate Iceberg 😄
Saved duplicate efforts!
Feels like a decent first community contribution...
Hi Chunxu, how is this work going? We are looking forward to this fantastic feature!
@dixingxing0 The implementation is going well. We plan to send the PR very soon~
Glad to see this PR!
@ChunxuTang, we need the Presto Iceberg connector very much now. I'm very glad to see that this work is in progress. When will it be released, and where can I track the progress?
Hi folks, the PR has been merged to the Presto codebase. Feel free to close the issue~
Hi @ChunxuTang, I appreciate your work in this space, but it does not seem to be quite working yet. By default, presto-iceberg is not included in the downloadable presto-server tgz file. When built from source and dropped into the plugins directory of Presto 0.256, the following exception occurs when running a trivial INSERT statement (inserting 1 integer into an unpartitioned Iceberg table with 1 integer column):
java.lang.NoSuchMethodError: org.apache.parquet.schema.PrimitiveType.getLogicalTypeAnnotation()Lorg/apache/parquet/schema/LogicalTypeAnnotation;
at org.apache.iceberg.parquet.MessageTypeToType.primitive(MessageTypeToType.java:137)
The problem seems to be that hive-apache-3.0.0-3.jar is conflicting with parquet-column-1.11.0.jar. This getLogicalTypeAnnotation() method was introduced in parquet-mr 1.11.0 but until issue #14960 is merged, Presto is on parquet-mr 1.10.1.
I got it to work by renaming hive-apache-3.0.0-3.jar to zzhive-apache-3.0.0-3.jar. This lets parquet-mr 1.11.0 override 1.10.1 (unsafely).
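For anyone trying to confirm this kind of conflict on their own install, here is a rough sketch (not part of Presto or Iceberg) that lists which jars in a plugin directory bundle a given class. It relies only on the fact that jars are zip archives. The function name jars_containing and the directory path in the usage comment are made up for illustration.

```python
import zipfile
from pathlib import Path


def jars_containing(plugin_dir, class_name):
    """Return the names of jars in plugin_dir that bundle class_name.

    The result is sorted by file name for readability only; this is
    not a claim about the order in which the JVM loads the jars.
    """
    # A class a.b.C is stored inside a jar as the entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(plugin_dir).glob("*.jar")):
        with zipfile.ZipFile(jar) as zf:
            if entry in zf.namelist():
                hits.append(jar.name)
    return hits


# Hypothetical usage (the path is an assumption, adjust to your install):
# print(jars_containing("/opt/presto/plugin/presto-iceberg",
#                       "org.apache.parquet.schema.PrimitiveType"))
```

If more than one jar shows up for org.apache.parquet.schema.PrimitiveType, the two copies are competing on the plugin classpath, which would be consistent with the NoSuchMethodError above.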
I may have spoken too soon on this. With the above workarounds I'm running into "Not an Iceberg table" errors quite frequently when trying to SELECT the data back in. Overall it feels like there's a bit of a gap between what's possible and what's reasonably achieved out of the box by following the documentation.
Edit: "Not an Iceberg table" errors after INSERTs are gone in 0.257 snapshot, fixed by commit 14ad556876ead826069781bd6471855320b05815
I'm wondering if there is any plan to support Iceberg tables. I see that there is a Presto connector available, but it points to the Trino documentation.
https://iceberg.apache.org/presto/