Closed (slim-bean closed 2 years ago)
https://github.com/polarsignals/frostdb/tree/main/dynparquet might be what you are looking for?
To get a dynamic (simple) schema, I am currently using a snippet like this:
    structFields := []reflect.StructField{}
    for _, field := range fields {
        // Map the column name to a parquet struct tag.
        tag := fmt.Sprintf(`parquet:"%v,optional,plain"`, field.name)
        var tp reflect.Type
        // "type" is a Go keyword, so the column type is read from field.typ here.
        switch field.typ {
        case "int":
            x := int64(0)
            tp = reflect.TypeOf(&x) // *int64
        case "string":
            [...]
        }
        structFields = append(structFields, reflect.StructField{
            // reflect.StructOf only accepts exported field names.
            Name: strings.ToUpper(field.name),
            Type: tp,
            Tag:  reflect.StructTag(tag),
        })
    }
    structType := reflect.StructOf(structFields)
    structElem := reflect.New(structType)
    // Pass the underlying pointer, not the reflect.Value itself.
    schema := parquet.SchemaOf(structElem.Interface())
HTH
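For anyone who wants to try the struct-building part without pulling in the parquet library, here is a minimal standalone sketch of the same idea using only the standard library. The column type and the buildRowType helper are names made up for this illustration, and only "int" and "string" columns are handled; the tag format is copied from the snippet above.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// column is a hypothetical description of one dynamic column.
type column struct {
	name string
	typ  string // "int" or "string" in this sketch
}

// buildRowType creates a struct type at runtime with one pointer-typed
// field per column, tagged the same way as in the snippet above.
func buildRowType(cols []column) (reflect.Type, error) {
	fields := make([]reflect.StructField, 0, len(cols))
	for _, c := range cols {
		if c.name == "" {
			return nil, fmt.Errorf("empty column name")
		}
		var tp reflect.Type
		switch c.typ {
		case "int":
			tp = reflect.TypeOf((*int64)(nil))
		case "string":
			tp = reflect.TypeOf((*string)(nil))
		default:
			return nil, fmt.Errorf("unsupported column type %q", c.typ)
		}
		fields = append(fields, reflect.StructField{
			// Field names must be exported, hence the upper-casing.
			Name: strings.ToUpper(c.name),
			Type: tp,
			Tag:  reflect.StructTag(fmt.Sprintf(`parquet:"%s,optional,plain"`, c.name)),
		})
	}
	return reflect.StructOf(fields), nil
}
```

Calling buildRowType with an "id" column of type "int" and a "name" column of type "string" should give a struct type with the fields ID and NAME. Note that this sketch assumes column names are valid Go identifiers and unique; reflect.StructOf panics otherwise.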
Thanks for the helpful references @Pryz and @sdressler!
I was able to make this work!
I was really close before; I got tripped up by not passing the .Interface() value to the parquet.SchemaOf() function.
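In case someone else hits the same thing, here is a small standard-library sketch of why the .Interface() call matters. Any function that inspects its argument via reflection sees reflect.Value itself if you hand it the result of reflect.New directly, and only sees the generated struct once you unwrap it. The observedTypes helper is a made-up name for this illustration.

```go
package main

import "reflect"

// observedTypes builds a one-field struct type at runtime and reports the
// type a reflection-based function would see for the same value, first
// passed as-is and then passed through .Interface().
func observedTypes() (withoutInterface, withInterface string) {
	rowType := reflect.StructOf([]reflect.StructField{
		{Name: "Name", Type: reflect.TypeOf("")},
	})
	elem := reflect.New(rowType) // a reflect.Value holding a pointer to the struct

	// Passing elem directly boxes the reflect.Value itself.
	withoutInterface = reflect.TypeOf(elem).String()
	// Interface() unwraps it to the pointer to the generated struct.
	withInterface = reflect.TypeOf(elem.Interface()).String()
	return withoutInterface, withInterface
}
```

The first return value is "reflect.Value", while the second is the pointer to the generated struct, which is what a schema builder actually needs to look at.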
I'd like to generate parquet files dynamically at runtime.
For example, I have a []map[string]string and I'd like to turn the map keys into columns. For the slice of maps I can come up with a consistent set of keys to become columns; then each map in the slice becomes a row, where I pull the values out and write them to that row.
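To make the idea concrete, this is roughly what I mean by a consistent set of keys and one row per map. It is only a sketch using the standard library; columnsOf and rowsOf are names I made up, and sorting the keys is just one way to get a stable column order. A nil entry marks a key that is missing from that map.

```go
package main

import "sort"

// columnsOf returns the sorted union of keys across all maps.
func columnsOf(records []map[string]string) []string {
	seen := make(map[string]struct{})
	for _, rec := range records {
		for k := range rec {
			seen[k] = struct{}{}
		}
	}
	cols := make([]string, 0, len(seen))
	for k := range seen {
		cols = append(cols, k)
	}
	sort.Strings(cols)
	return cols
}

// rowsOf lays each map out in column order; nil means the key is absent.
func rowsOf(records []map[string]string, cols []string) [][]*string {
	rows := make([][]*string, 0, len(records))
	for _, rec := range records {
		row := make([]*string, len(cols))
		for i, c := range cols {
			if v, ok := rec[c]; ok {
				val := v
				row[i] = &val
			}
		}
		rows = append(rows, row)
	}
	return rows
}
```

Each row then has the same length and column order, so the values can be written out positionally.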
I tried building a struct at runtime using a whole mess of reflect code, but this was kind of gnarly and also didn't work (the SchemaOf methods also do a lot of reflection, and I couldn't make anything work out of this).
The Schema type has a really limited set of constructors. I'm wondering what your thoughts are on supporting this kind of functionality, perhaps through some new constructors for the Schema type that let you set the column information?