Closed dkotov closed 3 days ago
Not reproducible, here is the test proving it:
class EdgeCaseInt32 {
    public int Id { get; set; }
}
[Fact]
public async Task EdgeCase_rawint64_to_classInt32() {
    var schema = new ParquetSchema(new DataField<long>("Id"));
    using var ms = new MemoryStream();
    using(ParquetWriter writer = await ParquetWriter.CreateAsync(schema, ms)) {
        using(ParquetRowGroupWriter rg = writer.CreateRowGroup()) {
            await rg.WriteColumnAsync(new DataColumn(schema.DataFields[0], new long[] { 1, 2, 3 }));
        }
    }
    ms.Position = 0;
    IList<EdgeCaseInt32> data = await ParquetSerializer.DeserializeAsync<EdgeCaseInt32>(ms);
    Assert.Equal(1, data[0].Id);
    Assert.Equal(2, data[1].Id);
    Assert.Equal(3, data[2].Id);
}
Feel free to reopen with a reproducible test if I didn't understand you correctly.
You are right, I did miss one detail: to reproduce this issue one needs a file with the following schema (without logical type/hint):
message spark_schema {
  optional int64 Id;
}
If I get it correctly, the provided test actually replicates the following schema (with logical type/hint):
message root {
  required int64 Id (INTEGER(64,true));
}
That's why it doesn't reproduce the issue. Unfortunately, I'm not sure how to simulate the first schema with Parquet.Net code, so I need your help here.
But I attached a sample file with the following content:
parquet-tools cat ./no-logical-type.parquet
# output
[{"Id":1},{"Id":2},{"Id":3}]
And here is an integration test for it:
class EdgeCaseInt32 {
    public int? Id { get; set; }
}
[Fact]
public async Task EdgeCase_fileInt64_to_classInt32() {
    IList<EdgeCaseInt32> data = await ParquetSerializer.DeserializeAsync<EdgeCaseInt32>("no-logical-type.parquet");
    Assert.Equal(1, data[0].Id); // Actual: 1 - SUCCESS
    Assert.Equal(2, data[1].Id); // Actual: 0 - FAILURE
    Assert.Equal(3, data[2].Id); // Actual: 2 - FAILURE
}
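A possible explanation, not confirmed anywhere in this thread: the actual values 1, 0, 2 are exactly what you get when the little-endian bytes of the int64 values 1, 2, 3 are read as consecutive int32 values without any conversion. The sketch below only demonstrates that arithmetic with standard .NET APIs. `Int64AsInt32Demo` and `ReinterpretFirst` are hypothetical names made up for illustration, and nothing here claims to be how Parquet.Net actually reads the column.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, for illustration only (not part of Parquet.Net).
static class Int64AsInt32Demo {
    // Views the raw bytes of an int64 buffer as int32 values (no numeric conversion)
    // and returns the first `count` of them.
    public static int[] ReinterpretFirst(long[] source, int count) {
        ReadOnlySpan<int> halves = MemoryMarshal.Cast<long, int>(source.AsSpan());
        return halves.Slice(0, count).ToArray();
    }

    public static void Main() {
        // Same values as in no-logical-type.parquet.
        int[] misread = ReinterpretFirst(new long[] { 1, 2, 3 }, 3);

        // On a little-endian machine each int64 splits into (low half, high half),
        // so the buffer looks like 1, 0, 2, 0, 3, 0 and this prints "1, 0, 2".
        Console.WriteLine(string.Join(", ", misread));
    }
}
```

If this guess is right, the issue would be about the buffer being typed by the class property (int) rather than by the physical column type (int64), but that still needs to be verified against the library code.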
@aloneguid please reopen the issue on my behalf as I don't have the required permissions to do it. Thanks!
Library Version
5.0.2
OS
Ubuntu Linux 22.04
OS Architecture
64 bit
How to reproduce?
Result: de-serialization completes without errors/warnings but values in objects don't match values in the file, e.g. 3 instead of 1.
Failing test
No response