I recently posted about this on the mailing list, but this seemed like a better place to discuss it. I use Arrow for Go mostly to read and write Parquet and IPC files. I often use the very helpful schema.NewSchemaFromStruct() from github.com/apache/arrow/go/v11/parquet/schema. Naturally, I then want to build an Arrow record using this schema, something like this:
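// pqschema is an alias for github.com/apache/arrow/go/v11/parquet/schema.
// Assumed shape (not shown in the post): type Test struct { Id string; Values [][]float64 }
var input []Test // rows to convert
pool := memory.NewGoAllocator()

// Derive the Parquet schema from the struct, then convert it to an Arrow schema.
parquetSchema, err := pqschema.NewSchemaFromStruct(Test{})
if err != nil {
	return nil, nil, err
}
schema, err := pqarrow.FromParquet(parquetSchema, &pqarrow.ArrowReadProperties{}, metadata.KeyValueMetadata{})
if err != nil {
	return nil, nil, err
}
pqschema.PrintSchema(parquetSchema.Root(), os.Stdout, 2)

builder := array.NewRecordBuilder(pool, schema)
defer builder.Release()

// Look up the nested builders once instead of on every iteration.
idBuilder := builder.Field(0).(*array.BinaryBuilder)
list := builder.Field(1).(*array.ListBuilder)
subList := list.ValueBuilder().(*array.ListBuilder)
floats := subList.ValueBuilder().(*array.Float64Builder)
for _, obj := range input {
	idBuilder.Append([]byte(obj.Id))
	list.Append(true) // start this row's outer list before adding its elements
	for _, inner := range obj.Values {
		subList.Append(true) // start a new inner list
		for _, value := range inner {
			floats.Append(value)
		}
	}
}
rec := builder.NewRecord()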
This is fine for smaller structs, but when they get larger or a lot more complicated, writing out all of the builder code becomes very tedious. (If there is already a better way of doing this, or if I am approaching this wrong, I would love to know! I am quite new to Go :) )
Describe the enhancement requested
Hi,
I thought it would make sense to have a reflection-based builder that can build a record directly from a struct. I took a stab at implementing something like this here: https://gist.github.com/gmintoco/3e65aa7b47ae37b0685db88b2755933f
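To make the idea concrete, here is a minimal, standard-library-only sketch of the reflection walk such a builder would need. It is not the code from the gist and does not use Arrow at all: `Column`, `describe`, and `ColumnsFromStructs` are hypothetical names, `Column` is a toy stand-in for an Arrow array builder, and the `Test` struct shape is an assumption. The string-to-binary mapping only mirrors the `BinaryBuilder` used in the snippet above.

```go
package main

import (
	"fmt"
	"reflect"
)

// Test is an assumed shape for the struct in the post (hypothetical).
type Test struct {
	Id     string
	Values [][]float64
}

// Column is a toy stand-in for an Arrow array builder: it only collects values.
type Column struct {
	Name   string
	Type   string        // Arrow-like type name derived by reflection
	Values []interface{} // one entry per row
}

// describe maps a Go type to an Arrow-like type name, recursing into slices.
func describe(t reflect.Type) (string, error) {
	switch t.Kind() {
	case reflect.String:
		return "binary", nil
	case reflect.Float64:
		return "float64", nil
	case reflect.Int64:
		return "int64", nil
	case reflect.Bool:
		return "bool", nil
	case reflect.Slice:
		elem, err := describe(t.Elem())
		if err != nil {
			return "", err
		}
		return "list<" + elem + ">", nil
	default:
		return "", fmt.Errorf("unsupported kind %s", t.Kind())
	}
}

// ColumnsFromStructs pivots a slice of structs into one column per exported field.
func ColumnsFromStructs(rows interface{}) ([]Column, error) {
	v := reflect.ValueOf(rows)
	if v.Kind() != reflect.Slice || v.Type().Elem().Kind() != reflect.Struct {
		return nil, fmt.Errorf("expected a slice of structs, got %T", rows)
	}
	st := v.Type().Elem()
	var cols []Column
	var fieldIdx []int // struct field index backing each column
	for i := 0; i < st.NumField(); i++ {
		f := st.Field(i)
		if f.PkgPath != "" { // unexported fields cannot be read via reflection
			continue
		}
		typ, err := describe(f.Type)
		if err != nil {
			return nil, fmt.Errorf("field %s: %w", f.Name, err)
		}
		cols = append(cols, Column{Name: f.Name, Type: typ})
		fieldIdx = append(fieldIdx, i)
	}
	for r := 0; r < v.Len(); r++ {
		row := v.Index(r)
		for c, i := range fieldIdx {
			cols[c].Values = append(cols[c].Values, row.Field(i).Interface())
		}
	}
	return cols, nil
}

func main() {
	cols, err := ColumnsFromStructs([]Test{
		{Id: "a", Values: [][]float64{{1, 2}, {3}}},
		{Id: "b"},
	})
	if err != nil {
		panic(err)
	}
	for _, c := range cols {
		fmt.Printf("%s (%s): %v\n", c.Name, c.Type, c.Values)
	}
}
```

In a real implementation the `describe` switch would presumably select the matching Arrow builder for each field, and the row loop would call its append methods, with nested lists handled recursively in the same way.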
My questions are:
Looking forward to any feedback :)
Component(s)
Go