mukunku / ParquetViewer

Simple Windows desktop application for viewing & querying Apache Parquet files
GNU General Public License v3.0

Error opening .parquet file #8

Closed theroggy closed 4 years ago

theroggy commented 4 years ago

Larger .parquet files give an "ArgumentOutOfRangeException" error for me, just as mentioned in closed ticket #5.

For me, it isn't an urgent issue, as I transitioned to .sqlite files for other technical reasons, but it might be useful for you or other users, and you asked for example files in the now-closed tickets.

The .parquet file was created using pyarrow in python, and I tested in ParquetViewer 1.1: https://drive.google.com/open?id=12vIw3f5tMURfzhI6hO3clHtPafPwhqnq

[screenshot: ParquetViewer_2019-07-24_09-27-55]

jvasekpn commented 4 years ago

I am also experiencing this issue. [image]

mukunku commented 4 years ago

The application is still using the old v1 Parquet.Net library, which does not support many of the more complex parquet files. I will look into updating to the latest version, but this will require a major rewrite as the newer versions have changed dramatically.

mukunku commented 4 years ago

Please try the new beta release: https://github.com/mukunku/ParquetViewer/releases/tag/v2.0

I was able to load the above parquet file using this new version. Keep in mind that this is a beta release, so not all features have been fully tested.

jvasekpn commented 4 years ago

Thank you very much. I will definitely be testing it out on Monday.


jvasekpn commented 4 years ago

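I am now getting a different error message after clicking Done on this UI: [image] https://user-images.githubusercontent.com/45108468/62871150-86191e00-bce0-11e9-93f9-3367a2a3784b.png

[image] https://user-images.githubusercontent.com/45108468/62871174-96c99400-bce0-11e9-8d34-c1af6828c7c4.png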

If necessary I can provide a sample file for testing.

mukunku commented 4 years ago

Yes, can you please provide a sample?

jvasekpn commented 4 years ago

Absolutely. I have generated a sample of random data using a library called "parquetjs". [image]

And here is the sample file in a ZIP archive: sample.zip

mukunku commented 4 years ago

It seems Parquet.Net has issues opening these files. I took a look at ParquetJS and it looks like its data types are not standard, at least as far as I can tell.

Unfortunately there isn't much I can do at this time unless the issue is resolved in the Parquet.Net library: https://github.com/elastacloud/parquet-dotnet

You can open a ticket with the Parquet.Net library if you'd like. Or if it's okay to share your sample file I can open the ticket for you.

jvasekpn commented 4 years ago

You may share the sample file. Thank you for looking into this for me! That sample file only contains random data.


mukunku commented 4 years ago

I created the following ticket: https://github.com/elastacloud/parquet-dotnet/issues/429 Let's see if someone can take a look.

mukunku commented 4 years ago

The Parquet.Net owner replied here: https://github.com/elastacloud/parquet-dotnet/issues/429 It seems the parquet file is not valid even by the native Java standards, so not much can be done on our side for this. I'm closing this ticket as fixed, since the original issue has been fixed.

theroggy commented 4 years ago

Indeed the file I had issues with loads without problems now. Thanks!
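
For anyone who lands here with a file that will not open, a quick structural check can help decide whether the file is truncated or simply not Parquet before filing a ticket with a reader library. The sketch below is not part of ParquetViewer or Parquet.Net; `check_parquet_layout` is a hypothetical helper name. It relies only on the documented layout of a plain (unencrypted) Parquet file: a 4-byte `PAR1` marker at both ends, with a 4-byte little-endian footer length just before the trailing marker.

```python
import struct
from pathlib import Path

MAGIC = b"PAR1"  # marker at both ends of a plain (unencrypted) Parquet file


def check_parquet_layout(path):
    """Hypothetical helper: return (ok, message) for a basic layout check."""
    size = Path(path).stat().st_size
    # Smallest possible layout: leading magic + footer length + trailing magic.
    if size < 12:
        return False, "file too small to be a Parquet file"
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-8, 2)  # last 8 bytes: footer length (uint32 LE) + magic
        tail = f.read(8)
    if head != MAGIC:
        return False, "missing leading PAR1 magic"
    if tail[4:] != MAGIC:
        return False, "missing trailing PAR1 magic"
    (footer_len,) = struct.unpack("<I", tail[:4])
    # The footer must fit between the two magic markers and the length field.
    if footer_len > size - 12:
        return False, "footer length exceeds file size"
    return True, f"layout looks plausible, footer is {footer_len} bytes"
```

Passing this check only rules out truncated or non-Parquet files. It says nothing about whether the schema or data types inside the footer are standard, which is the kind of problem discussed for the parquetjs sample above, so a file can pass here and still fail in a reader.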