aloneguid / parquet-dotnet

Fully managed Apache Parquet implementation
https://aloneguid.github.io/parquet-dotnet/
MIT License
542 stars, 140 forks

[BUG]: ParquetSerializer.DeserializeAllAsync internally causes "Sequence contains no elements" error when an empty (or bad) file is provided #473

Open zoran123456 opened 5 months ago

zoran123456 commented 5 months ago

Library Version

4.23.4

OS

Windows

OS Architecture

64 bit

How to reproduce?

  1. ....write this code, and provide it with an empty Parquet file:
using var stream = new MemoryStream(parquetFileContents);

await foreach (var record in ParquetSerializer.DeserializeAllAsync<MyPocoClass>(stream))
{
    yield return record;
}

Failing test

No response

zoran123456 commented 5 months ago

I tested a large number of files, and one specific file breaks it. I'm not sure whether that Parquet file is invalid or just empty, but the code should not throw an unhandled exception.

When I used the alternative approach (blocking synchronously on DeserializeAsync), like this: ParquetSerializer.DeserializeAsync<MyPocoClass>(stream).GetAwaiter().GetResult(); the code didn't break; it returned no elements.
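
Based on the observation above that DeserializeAsync returned no elements for the problematic file, a possible interim workaround is to call DeserializeAsync and await it rather than blocking with GetAwaiter().GetResult(). The following is only a minimal sketch: the helper name ReadAllAsync and the class ParquetReadWorkaround are hypothetical, the return type of DeserializeAsync is assumed to be a list of T, and the `new()` constraint is assumed to match the library's own. Unlike DeserializeAllAsync, this approach materializes every record in memory before yielding, so it may not suit very large files.

```csharp
using System.Collections.Generic;
using System.IO;
using Parquet.Serialization;

// Hypothetical helper, not part of Parquet.Net.
public static class ParquetReadWorkaround
{
    // Exposes the same IAsyncEnumerable<T> shape as the snippet in the report,
    // but reads through DeserializeAsync, which the reporter observed returning
    // no elements (instead of throwing) for the problematic file.
    public static async IAsyncEnumerable<T> ReadAllAsync<T>(byte[] parquetFileContents)
        where T : new() // assumed to mirror the library's constraint
    {
        using var stream = new MemoryStream(parquetFileContents);

        // Assumption: the awaited result is an in-memory list of T.
        // Awaiting avoids the blocking GetAwaiter().GetResult() call.
        var records = await ParquetSerializer.DeserializeAsync<T>(stream);

        foreach (var record in records)
        {
            yield return record;
        }
    }
}
```

A caller could then write `await foreach (var record in ParquetReadWorkaround.ReadAllAsync<MyPocoClass>(parquetFileContents))` in place of the original loop. This does not address the underlying exception in DeserializeAllAsync; it only sidesteps it.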