Open devinrsmith opened 2 months ago
The code to create the catalog and an empty table was based on the Iceberg Java API quickstart, https://iceberg.apache.org/docs/1.6.0/java-api-quickstart/#using-a-hadoop-catalog.
import org.apache.hadoop.conf.Configuration
import org.apache.iceberg.PartitionSpec
import org.apache.iceberg.Schema
import org.apache.iceberg.Table
import org.apache.iceberg.catalog.TableIdentifier
import org.apache.iceberg.hadoop.HadoopCatalog
import org.apache.iceberg.types.Types
// Adapted from https://iceberg.apache.org/docs/1.6.0/java-api-quickstart/#using-a-hadoop-catalog
Configuration conf = new Configuration()
String warehousePath = "file:///tmp/my_warehouse"
HadoopCatalog catalog = new HadoopCatalog(conf, warehousePath)
Schema schema = new Schema(
    Types.NestedField.required(1, "level", Types.StringType.get()),
    Types.NestedField.required(2, "event_time", Types.TimestampType.withZone()),
    Types.NestedField.required(3, "message", Types.StringType.get())
    // DH doesn't support LIST yet.
    // Types.NestedField.optional(4, "call_stack", Types.ListType.ofRequired(5, Types.StringType.get()))
)
PartitionSpec spec = PartitionSpec.builderFor(schema)
    .hour("event_time")
    .identity("level")
    .build()
TableIdentifier name = TableIdentifier.of("logging", "logs")
Table table = catalog.createTable(name, schema, spec)
This produces the following files:
$ find /tmp/my_warehouse -type f
/tmp/my_warehouse/logging/logs/metadata/v1.metadata.json
/tmp/my_warehouse/logging/logs/metadata/.v1.metadata.json.crc
/tmp/my_warehouse/logging/logs/metadata/version-hint.text
/tmp/my_warehouse/logging/logs/metadata/.version-hint.text.crc
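Only metadata files exist at this point, with no manifests or data files, which is consistent with the table having no snapshots. The following is a minimal sketch for confirming that state from the Iceberg side. It assumes the warehouse path and table identifier from the script above, and that the script has already been run; it is not part of the original reproduction.

```groovy
import org.apache.hadoop.conf.Configuration
import org.apache.iceberg.Table
import org.apache.iceberg.catalog.TableIdentifier
import org.apache.iceberg.hadoop.HadoopCatalog

// Assumption: the warehouse and the logging.logs table created above already exist.
HadoopCatalog catalog = new HadoopCatalog(new Configuration(), "file:///tmp/my_warehouse")
Table table = catalog.loadTable(TableIdentifier.of("logging", "logs"))

// A table that was created but never committed to has no current snapshot,
// so currentSnapshot() returns null and snapshots() is empty.
assert table.currentSnapshot() == null
assert !table.snapshots().iterator().hasNext()

// The schema is still available from the table metadata.
println table.schema()
```

Any reader that dereferences `table.currentSnapshot()` without a null check would fail on a table in this state, even though the schema needed to build an empty result is present.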
Trying to read a table that has been created via the catalog but doesn't have any snapshots produces a NullPointerException (NPE) instead of an empty table with the appropriate schema: