aloneguid / parquet-dotnet

Fully managed Apache Parquet implementation
https://aloneguid.github.io/parquet-dotnet/
MIT License

[BUG]: cannot open file with column names ending in periods #278

Closed. MCRE-BE closed this issue 1 year ago.

MCRE-BE commented 1 year ago

Library Version

unknown

OS

Windows

OS Architecture

32 bit

How to reproduce?

I am using an application (Parquet Viewer) that relies on parquet.net to read parquet files. After failing to open a file, I submitted a bug report there (mukunku/ParquetViewer#70). The developer said the root cause is in parquet.net, which seems to have an issue with column names ending in periods.

Is this easily fixable?

The original bug report, with sample file:

Where was the parquet file created?

Pandas Python - 1.5.3

Sample File

Example.zip

Describe the bug

Cannot open the file, although the same file opens in Python and the column flagged as missing is present. The likely cause is a trailing "." in the column name, as that is the only mismatch between the file and the bug report.

Screenshots

image

Failing test

No response

aloneguid commented 1 year ago

Hey there, sorry to hear you're having trouble with those pesky dots in your path name. I know how annoying they can be. They're like little landmines waiting to explode your code.

Well, I have some good news and some bad news for you. The good news is that this is a known issue and it has been fixed in the upcoming 4.6.0 version.

The bad news is that I'm not the author of that util, so you'll have to take it upstream after the official release of 4.6.0, which should be sometime this week.

Or you could just avoid using dots in your path name altogether. That's up to you. Either way, I hope this helps and thanks for using my awesome software!
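For anyone who wants to avoid dots until the fix is available, here is a minimal sketch of renaming columns on the Python side before the file is written. The helper name `sanitize_column_names` and the `_` replacement are assumptions made for illustration; nothing like it is part of parquet-dotnet or Parquet Viewer.

```python
def sanitize_column_names(names, replacement="_"):
    """Return a list of column names with every "." replaced.

    Hypothetical helper for illustration only. Names that would
    collide after replacement get a numeric suffix so the result
    stays unique.
    """
    seen = set()
    result = []
    for name in names:
        candidate = str(name).replace(".", replacement)
        base = candidate
        suffix = 1
        # Append a counter until the name no longer clashes.
        while candidate in seen:
            candidate = f"{base}{replacement}{suffix}"
            suffix += 1
        seen.add(candidate)
        result.append(candidate)
    return result


# Possible use with pandas (not executed here):
#   df.columns = sanitize_column_names(df.columns)
#   df.to_parquet("example.parquet")
```

This only helps when the column names are under your control; if downstream code expects the original names, upgrading to the fixed version is the better route.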

MCRE-BE commented 1 year ago

Top. Thanks! I'll let the dev know that the fix is coming and that he doesn't need to make a workaround.

And yes: I do agree that we should avoid dots as much as possible, but sometimes my hands are tied by downstream code 🫣

You can close the issue if you want, or keep it open until the 4.6.0 release.

MCRE-BE commented 1 year ago

The dev of parquet-viewer confirmed that 4.6.0 fixes the problem. Thanks!