jsn-m closed this 8 months ago
All Parquet options can be set as shown below:
using (var r = new ChoParquetReader(@"test1.parquet")
.ParquetOptions(o => o.UseDeltaBinaryPackedEncoding = true))
{
}
I can't use that lambda syntax because my ChoParquetWriter instance is declared as dynamic, which I needed in order to call the WithField method:
var genericParquetWriter = typeof(ChoParquetWriter<>).MakeGenericType(dtoType);
dynamic writerInstance = ChoActivator.CreateInstance(genericParquetWriter, new object[] { localPath });
var dtoProperties = dtoType.GetAllProperties();
writerInstance.WithField(name: "SomeName", fieldType: typeof(string));
Figured it out:
var setParquetOptions = new Action<ParquetOptions>(s => s.UseDeltaBinaryPackedEncoding = false);
writerInstance.ParquetOptions(setParquetOptions);
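For anyone hitting the same thing: as far as I can tell this works because the C# compiler refuses to pass a bare lambda to a dynamically dispatched call (error CS1977), since it has no delegate type to infer for it. Assigning the lambda to a typed Action<T> first gives it one. Below is a minimal, self-contained sketch of that pattern. DemoOptions, DemoWriter, and Configure are hypothetical stand-ins made up for illustration, not the ChoETL or Parquet.Net types.

```csharp
using System;

// Hypothetical stand-in for an options class; NOT the real ParquetOptions.
public class DemoOptions
{
    public bool UseDeltaBinaryPackedEncoding { get; set; } = true;
}

// Hypothetical stand-in for a writer with a fluent configuration method.
public class DemoWriter
{
    public DemoOptions Options { get; } = new DemoOptions();

    // Takes an Action<T>, the same shape as the call used in this thread.
    public DemoWriter Configure(Action<DemoOptions> configure)
    {
        configure(Options);
        return this;
    }
}

public static class Program
{
    // Accepts the writer as dynamic, as in the reflection-based setup above.
    public static bool DisableDeltaEncoding(dynamic writer)
    {
        // This would NOT compile (CS1977), because a lambda cannot be an
        // argument to a dynamically dispatched operation:
        //   writer.Configure(o => o.UseDeltaBinaryPackedEncoding = false);

        // Giving the lambda an explicit delegate type first makes it legal.
        Action<DemoOptions> configure = o => o.UseDeltaBinaryPackedEncoding = false;
        writer.Configure(configure);

        // Return the flag so the caller can confirm the change was applied.
        return writer.Options.UseDeltaBinaryPackedEncoding;
    }

    public static void Main()
    {
        bool stillEnabled = DisableDeltaEncoding(new DemoWriter());
        Console.WriteLine(stillEnabled); // prints "False"
    }
}
```

An inline cast, (Action<DemoOptions>)(o => ...), should work equally well if a separate variable feels noisy.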
As reported in the Parquet.Net repo, the new DELTA_BINARY_PACKED encoding in Parquet.Net does not play well with Spark 3.3. Please expose Parquet.Net's UseDeltaBinaryPackedEncoding flag through ParquetOptions so that it can be set.
https://aloneguid.github.io/parquet-dotnet/encodings.html#numbers