Generate iceberg metadata file based on _spark_metadata

apache / iceberg

Apache Iceberg

https://iceberg.apache.org/

Apache License 2.0


Generate iceberg metadata file based on _spark_metadata #9270

Open tanejagagan opened 10 months ago

tanejagagan commented 10 months ago

Currently, Spark Structured Streaming writes metadata files inside the _spark_metadata directory. Each entry is in JSON format with the following fields:

{code}
{
  path: String,
  size: Long,
  isDir: Boolean,
  modificationTime: Long,
  blockReplication: Int,
  blockSize: Long,
  action: String
}
{code}

Iceberg should provide a utility to generate _iceberg_metadata files based on the content of the _spark_metadata directory. This would also help preserve the snapshots of the given table.
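To make the request concrete, here is a rough sketch of the first step such a utility would need: reading one _spark_metadata log file and working out which data files are still live. It is only an illustration, not an existing Iceberg or Spark API. The names `SinkFileStatus`, `parse_sink_log`, and `live_data_files` are hypothetical, and the sketch assumes that each log file starts with a version line such as `v1` followed by one JSON object per line, and that `action` takes the values `add` and `delete`. Turning the resulting file list into Iceberg snapshots is left out.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SinkFileStatus:
    """One entry of a _spark_metadata log file (fields as listed in this issue)."""
    path: str
    size: int
    is_dir: bool
    modification_time: int
    block_replication: int
    block_size: int
    action: str


def parse_sink_log(text: str) -> List[SinkFileStatus]:
    """Parse the text of one log file into entries.

    Assumption: the first non-empty line may be a version marker (e.g. "v1"),
    and every remaining non-empty line is a single JSON object.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if lines and not lines[0].lstrip().startswith("{"):
        lines = lines[1:]  # skip the assumed version header
    entries = []
    for ln in lines:
        raw = json.loads(ln)
        entries.append(
            SinkFileStatus(
                path=raw["path"],
                size=int(raw["size"]),
                is_dir=bool(raw["isDir"]),
                modification_time=int(raw["modificationTime"]),
                block_replication=int(raw["blockReplication"]),
                block_size=int(raw["blockSize"]),
                action=raw["action"],
            )
        )
    return entries


def live_data_files(entries: List[SinkFileStatus]) -> List[SinkFileStatus]:
    """Return non-directory files whose most recent action is "add".

    Assumption: a later entry for the same path overrides an earlier one.
    """
    latest = {}
    for entry in entries:
        latest[entry.path] = entry
    return [e for e in latest.values() if e.action == "add" and not e.is_dir]
```

Each batch file in _spark_metadata could be run through `parse_sink_log` in batch order, with the output of `live_data_files` for each batch serving as the candidate file list for one generated snapshot.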

github-actions[bot] commented 2 weeks ago

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.