aloneguid / parquet-dotnet

Fully managed Apache Parquet implementation
https://aloneguid.github.io/parquet-dotnet/
MIT License

Migrating from v3 to v4 question... #425

Closed (sgentry closed this issue 8 months ago)

sgentry commented 8 months ago

Issue description

Hi,

I'm wondering how to migrate the code below to be compatible with version 4. It looks like the approach now is to define the schema first and then reference the schema's fields when declaring each DataColumn. I'm hoping to minimize the refactoring, as we have lots of models with many columns that use the pattern below. Any advice would be greatly appreciated!

The runtime error for the code below is: You need to construct a schema passing in this field first. (Parameter 'field')

var columns = new List<DataColumn>
{
    new DataColumn(new DataField<string>("column_a"),
        grouping.Select(x => x.ColumnA).ToArray()
    ),
    new DataColumn(new DataField<int>("column_b"),
        grouping.Select(x => x.ColumnB).ToArray()
    )
};

var schema = new Schema(columns.Select(x => x.Field).ToList());

aloneguid commented 8 months ago

I suspect you'll have to declare schemas separately from data. It's hard to guess without seeing the entire codebase.
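
For illustration, here is a minimal sketch of what "declaring schemas separately from data" might look like for the snippet in the question. It declares each field once, builds the schema from those fields, and then reuses the same field instances when constructing each DataColumn. It assumes the `Schema`, `DataField<T>` and `DataColumn` types are used as in the question, that `Schema` accepts the fields directly in its constructor, and that these types live in the `Parquet.Data` namespace. The `Row` type and the sample `grouping` data are hypothetical stand-ins, and none of this has been verified against a specific v4 release.

```csharp
using System.Collections.Generic;
using System.Linq;
using Parquet.Data; // assumption: namespace of Schema, DataField<T>, DataColumn in the version discussed here

// Hypothetical stand-in for the `grouping` sequence from the question.
var grouping = new List<Row> { new Row("a", 1), new Row("b", 2) };

// 1. Declare each field once and keep a reference to it.
var columnA = new DataField<string>("column_a");
var columnB = new DataField<int>("column_b");

// 2. Build the schema from those fields before creating any DataColumn.
var schema = new Schema(columnA, columnB);

// 3. Create the columns with the same field instances that were passed to the schema.
var columns = new List<DataColumn>
{
    new DataColumn(columnA, grouping.Select(x => x.ColumnA).ToArray()),
    new DataColumn(columnB, grouping.Select(x => x.ColumnB).ToArray()),
};

// Hypothetical model type, for illustration only.
record Row(string ColumnA, int ColumnB);
```

The main difference from the original pattern is the order of construction: the schema is created first, so each field already belongs to a schema by the time it reaches the DataColumn constructor. That appears to be what the error message is asking for. With many models, the field declarations could be kept next to each model so the column-building code changes only in which field object it references.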