mukunku / ParquetViewer

Simple Windows desktop application for viewing & querying Apache Parquet files
GNU General Public License v3.0

[FEATURE-REQUEST] Ability to open partitioned files #35

Closed tigerhawkvok closed 1 year ago

tigerhawkvok commented 3 years ago

Parquet Viewer Version

2.3.1.41849

Where was the parquet file created?

Pandas -> pyarrow

dfStore.to_parquet(BUILDINGS_OUTPUT_FILE, partition_cols=["type"])

Sample File

pv_bugdemo.parquet.zip

Describe the bug

A partitioned Parquet file, which is actually a folder containing several sub-files, should be supported. This probably involves checking whether the "file" is actually a directory and then traversing the tree to read the individual constituent files.
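To illustrate the directory check and tree traversal described above, here is a minimal Python sketch. It is not how ParquetViewer or parquet-dotnet does it. The function name `find_parquet_parts` is hypothetical, and the sketch assumes that the constituent files end in `.parquet` and that partition folders are named `column=value` (as in `type=...` for the `partition_cols=["type"]` call above).

```python
from pathlib import Path


def find_parquet_parts(path):
    """Return (file, partition_values) pairs for a single file or a partitioned folder.

    Hypothetical helper: assumes part files end in ".parquet" and that
    partition folders are named "column=value".
    """
    root = Path(path)
    if root.is_file():
        # An ordinary single-file Parquet: nothing to traverse.
        return [(root, {})]
    if not root.is_dir():
        raise FileNotFoundError(f"No such file or directory: {root}")

    parts = []
    for file in sorted(root.rglob("*.parquet")):
        if not file.is_file():
            # Skip any nested directory that happens to end in ".parquet".
            continue
        # Folder names between the root and the file carry the partition values.
        values = {}
        for segment in file.relative_to(root).parent.parts:
            key, sep, value = segment.partition("=")
            if sep:
                values[key] = value
        parts.append((file, values))
    return parts
```

Each returned path could then be opened individually by whatever Parquet reader is in use, with the partition values added back as columns, since partitioned writers typically store those values only in the folder names.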

Note: This tool relies on the parquet-dotnet library for all the actual Parquet processing. So any issues where that library cannot process a parquet file will not be addressed by us. Please open a ticket on that library's repo to address such issues.

mukunku commented 3 years ago

This is a big feature to implement. Not sure if it can be done as it's really complicated. Leaving this ticket open for now in case anyone wants to tackle this behemoth.

For the time being, you'll have to open the files one by one. Or make sure you save them as a single file instead of a partitioned one.

mukunku commented 1 year ago

Support for opening partitioned files has finally been added! https://github.com/mukunku/ParquetViewer/releases/tag/v2.6.0.2

AFgh24 commented 1 year ago

I could not open the sample files from the first post with the latest version.

After loading, the program closes completely.