aloneguid / parquet-dotnet

Fully managed Apache Parquet implementation
https://aloneguid.github.io/parquet-dotnet/
MIT License

[BUG]: Fail to serialize enum #442

Closed pantonis closed 5 months ago

pantonis commented 6 months ago

Library Version

4.17.0

OS

Windows 10

OS Architecture

64 bit

How to reproduce?

For some reason, enum properties fail to serialize.

 internal class ParquetEnumTest
 {
     internal async Task ClientTest()
     {
         List<Client> clients = new();
         for (int i = 0; i < 10; i++)
         {
             clients.Add(new Client
             {
                 Id = i,
                 Name = $"Client {i}",
                 ClientStatus = (ClientStatus)(i % 4 + 1)
             });
         }

         using (var memStream = new MemoryStream())
         {
             await ParquetSerializer.SerializeAsync(clients, memStream);
             memStream.Seek(0, SeekOrigin.Begin);

             var parquetBytes = memStream.ToArray();
         }
     }
 }

 internal class Client
 {
     public int Id { get; set; }
     public string Name { get; set; }
     public ClientStatus ClientStatus { get; set; }
 }

 public enum ClientStatus
 {
     Active = 1,
     Inactive = 2,
     Blocked = 3,
     Archived = 4
 }

and I'm getting the following exception:

property 'ClientStatus' has no fields

   at Parquet.Serialization.TypeExtensions.MakeField(Type t, String columnName, String propertyName, PropertyInfo pi, Boolean forWriting)
   at Parquet.Serialization.TypeExtensions.MakeField(PropertyInfo pi, Boolean forWriting)
   at Parquet.Serialization.TypeExtensions.<>c__DisplayClass13_0.<CreateSchema>b__0(PropertyInfo p)
   at System.Linq.Enumerable.SelectListIterator`2.MoveNext()
   at System.Linq.Enumerable.WhereSelectEnumerableIterator`2.ToArray()
   at System.Linq.Buffer`1..ctor(IEnumerable`1 source)
   at System.Linq.OrderedEnumerable`1.ToList()
   at Parquet.Serialization.TypeExtensions.CreateSchema(Type t, Boolean forWriting)
   at Parquet.Serialization.TypeExtensions.GetParquetSchema(Type t, Boolean forWriting)
   at Parquet.Serialization.ParquetSerializer.<>c__3`1.<SerializeAsync>b__3_0(Type _)
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey key, Func`2 valueFactory)
   at Parquet.Serialization.ParquetSerializer.<SerializeAsync>d__3`1.MoveNext()

Failing test

No response

aloneguid commented 6 months ago

Enums are not supported yet; see #302, where this is in progress. Since you are working with enums, can I ask how you expect them to be written: as a number, as a string, etc.?
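Until enum support lands, one possible workaround is to flatten the enum into an `int` before serializing and cast it back after reading. The sketch below is only an illustration: `ClientRow` and `ClientMapping` are hypothetical names that do not come from the library, and it assumes the serializer accepts plain `int` properties (the repro's `Id` is an `int`, and the exception only complains about `ClientStatus`).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum ClientStatus { Active = 1, Inactive = 2, Blocked = 3, Archived = 4 }

public class Client
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public ClientStatus ClientStatus { get; set; }
}

// Hypothetical storage shape: same data as Client, with the enum stored as int.
public class ClientRow
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public int ClientStatus { get; set; }
}

// Hypothetical helper that converts between the domain type and the storage type.
public static class ClientMapping
{
    public static ClientRow ToRow(Client c) =>
        new ClientRow { Id = c.Id, Name = c.Name, ClientStatus = (int)c.ClientStatus };

    public static Client FromRow(ClientRow r) =>
        new Client { Id = r.Id, Name = r.Name, ClientStatus = (ClientStatus)r.ClientStatus };

    public static List<ClientRow> ToRows(IEnumerable<Client> clients) =>
        clients.Select(ToRow).ToList();
}
```

The list returned by `ClientMapping.ToRows(clients)` would then be passed to `ParquetSerializer.SerializeAsync` in place of `clients`, exactly as in the repro above.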

pantonis commented 6 months ago

Thanks for the clarification. I'd expect it to be written in the fastest way possible (as a number). Any estimate on when this will be completed? Also, saving and restoring as a string can be tricky: an enum member can be renamed while its numeric value stays the same, and deserializing a previously stored name would then fail.
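The rename concern can be illustrated with plain .NET, independent of the library. In this sketch, `ClientStatusV2` is a hypothetical later version of the enum in which `Blocked` was renamed to `Suspended` while keeping the value 3, and `StoredEnumDemo` is an illustrative helper, not part of any API.

```csharp
using System;

// Hypothetical later version of the enum: "Blocked" renamed to "Suspended", value unchanged.
public enum ClientStatusV2 { Active = 1, Inactive = 2, Suspended = 3, Archived = 4 }

public static class StoredEnumDemo
{
    // A value persisted as a number still maps to the renamed member.
    public static ClientStatusV2 FromNumber(int stored) => (ClientStatusV2)stored;

    // A value persisted as a name only parses if that name still exists.
    public static bool TryFromName(string stored, out ClientStatusV2 status) =>
        Enum.TryParse<ClientStatusV2>(stored, out status);
}
```

Here `StoredEnumDemo.FromNumber(3)` yields `ClientStatusV2.Suspended`, while `StoredEnumDemo.TryFromName("Blocked", out _)` returns `false` because the old name no longer exists.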

aloneguid commented 5 months ago

Thank you. Closing as duplicate of #302 for now.