aloneguid / parquet-dotnet

Fully managed Apache Parquet implementation
https://aloneguid.github.io/parquet-dotnet/
MIT License

[BUG]: Floor does not handle Dictionary<string, DateTime> #525

Open Pragmateek opened 3 days ago

Pragmateek commented 3 days ago

Library Version

4.24.0

OS

Windows 11

OS Architecture

64 bit

How to reproduce?

Serialize an instance of a class containing a Dictionary<string, DateTime> to Parquet using Parquet.Net 4.25.0-pre.2:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Parquet.Serialization;
using NFluent;
using NUnit.Framework;

public class TestDico
{
    public Dictionary<string, DateTime> Values { get; set; }
}

[TestFixture]
public class ParquetSerializerTest
{
    [Test]
    public async Task CanSerializeADictionary()
    {
        var filePath = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid()}.parquet");

        var inputDictionary = new TestDico { Values = new Dictionary<string, DateTime> { { "Now", DateTime.UtcNow } } };
        await ParquetSerializer.SerializeAsync(new[] { inputDictionary }, filePath);

        var outputDictionary = (await ParquetSerializer.DeserializeAsync<TestDico>(filePath)).Single();

        Check.That(outputDictionary.Values.SequenceEqual(inputDictionary.Values)).Is(true);
    }
}

The round-trip test passes, but Floor 4.24.0 cannot open the resulting Parquet file and fails with the following exception:

System.FieldAccessException: failed to compile 'Values/key_value/Value (System.DateTime)'
 ---> System.InvalidOperationException: The binary operator Equal is not defined for the types 'System.DateTime' and 'System.Object'.
   at System.Linq.Expressions.Expression.GetEqualityComparisonOperator(ExpressionType, String, Expression, Expression, Boolean)
   at System.Linq.Expressions.Expression.Equal(Expression, Expression, Boolean, MethodInfo )
   at System.Linq.Expressions.Expression.Equal(Expression, Expression)
   at Parquet.Serialization.Dremel.FieldAssemblerCompiler`1.GetClassMember(Type rootType, Expression rootVar, Field parentField, Field field, String name)
   at Parquet.Serialization.Dremel.FieldAssemblerCompiler`1.InjectLevel(Expression rootVar, Type rootType, Field parentField, Field[] levelFields, List`1 path)
   at Parquet.Serialization.Dremel.FieldAssemblerCompiler`1.InjectLevel(Expression rootVar, Type rootType, Field parentField, Field[] levelFields, List`1 path)
   at Parquet.Serialization.Dremel.FieldAssemblerCompiler`1.InjectColumn()
   at Parquet.Serialization.Dremel.FieldAssemblerCompiler`1.Compile()
   at Parquet.Serialization.Dremel.Assembler`1.Compile(ParquetSchema schema, DataField df)
   --- End of inner exception stack trace ---
   at Parquet.Serialization.Dremel.Assembler`1.Compile(ParquetSchema schema, DataField df)
   at Parquet.Serialization.Dremel.Assembler`1.<>c__DisplayClass0_0.<.ctor>b__0(DataField df)
   at System.Linq.Enumerable.SelectArrayIterator`2.Fill(ReadOnlySpan`1, Span`1, Func`2)
   at System.Linq.Enumerable.SelectArrayIterator`2.ToList()
   at System.Linq.Enumerable.ToList[TSource](IEnumerable`1)
   at Parquet.Serialization.Dremel.Assembler`1..ctor(ParquetSchema schema)
   at Parquet.Serialization.ParquetSerializer.<>c__DisplayClass23_0.<GetAssembler>b__0(ParquetSchema _)
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey, Func`2)
   at Parquet.Serialization.ParquetSerializer.GetAssembler(ParquetSchema schema)
   at Parquet.Serialization.ParquetSerializer.DeserializeAsync(Stream source, ParquetOptions options, CancellationToken cancellationToken)
   at Parquet.Floor.ViewModels.DataViewModel.InitReaderAsync(FileViewModel file, Stream fileStream)
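
The inner exception can be reproduced in isolation with plain System.Linq.Expressions, independent of Parquet.Net. The sketch below is not the library's code: the class and method names (EqualRepro, TryBuildEqual) are made up for illustration, and the idea that GetClassMember compares the value against an untyped null constant is only an assumption based on the 'System.Object' operand in the message. It shows that Expression.Equal rejects a DateTime operand paired with an object-typed operand, while the same comparison builds when both sides are typed as DateTime?.

```csharp
using System;
using System.Linq.Expressions;

public static class EqualRepro
{
    // Hypothetical helper: tries to build an equality node and reports the outcome.
    public static string TryBuildEqual(Expression left, Expression right)
    {
        try
        {
            Expression.Equal(left, right);
            return "ok";
        }
        catch (InvalidOperationException ex)
        {
            // Thrown when no equality operator exists for the two operand types.
            return ex.Message;
        }
    }

    public static void Main()
    {
        ParameterExpression value = Expression.Parameter(typeof(DateTime), "value");

        // Expression.Constant(null) is typed as System.Object, so this pairs
        // DateTime with Object and the node cannot be built.
        Console.WriteLine(TryBuildEqual(value, Expression.Constant(null)));

        // Assumed workaround, for illustration only: type both operands as DateTime?.
        Expression nullableValue = Expression.Convert(value, typeof(DateTime?));
        Expression typedNull = Expression.Constant(null, typeof(DateTime?));
        Console.WriteLine(TryBuildEqual(nullableValue, typedNull));
    }
}
```

If the assembler does emit a null check of this kind for dictionary values, value types such as DateTime would need either a typed nullable comparison or no null check at all. Whether that matches the actual code path in FieldAssemblerCompiler is for the maintainers to confirm.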

Failing test

No response