mukunku / ParquetViewer

Simple Windows desktop application for viewing & querying Apache Parquet files
GNU General Public License v3.0

[FIX] Proposal for fixing datetime query examples (#110) #111

Closed by MCRE-BE 4 weeks ago

MCRE-BE commented 4 weeks ago

Wanted to try to address #110 :) in case it saves you some time.

I would also have changed the Wiki, but I can't easily fork it. I would have changed it to:

The utility allows users to run some simple SQL-like queries on the Parquet data.

## Query Syntax

The syntax for queries can be found by clicking on the `Filter Query (?)` label:

![](https://user-images.githubusercontent.com/4502154/199867236-e733daac-62c3-42e5-a7ca-c7c4aebe23e0.png)

The syntax is very similar to SQL except for how dates are handled. Example query formats can be found below:

| Type of Data | Example(s) |
| ------------ | ---------- |
| NULL Check | WHERE field_name IS NULL<br>WHERE field_name IS NOT NULL |
| Datetime | WHERE field_name >= #2000/12/31# |
| Numeric | WHERE field_name <= 123.4<br>WHERE field_name <> 10 |
| String | WHERE field_name LIKE '%value%'<br>WHERE field_name = 'equals value'<br>WHERE field_name <> 'not equals' |
| Using Multiple Conditions | WHERE (field_1 > #2000/12/31# AND field_1 < #2001/12/31#) OR field_2 <> 100 OR field_3 = 'string value' |
| List/Map/Struct | WHERE my_list LIKE '%elem1%'<br>WHERE my_map = '(a,1)'<br>WHERE my_map LIKE '%key_or_value%'<br>WHERE my_struct LIKE '%field_or_value%' |

Note: List, Map, and Struct fields are automatically cast to String type for querying.

Note: The following datetime formats are accepted: `yyyy/MM/dd` and the North American `MM/dd/yyyy` (according to [Stack Overflow](https://stackoverflow.com/a/3584616/1458738)).

### Escaping field names

You will need to escape field names with square brackets if they contain spaces or punctuation:

```
WHERE [field with spaces and punctuation!] <> 'not equals'
```

## Running the query

The query can be entered in the Query Box located at the top of the UI:

![](https://github.com/mukunku/ParquetViewer/blob/master/wiki_images/querybox.png)

To execute the query, either hit Enter or click the Execute button. The grid below will be updated with your results. You can verify this by looking at the bottom-left side of the status bar, which shows how many records have been filtered by the query:

| Before Query | After Query |
| ------------ | ----------- |
| ![](https://github.com/mukunku/ParquetViewer/blob/master/wiki_images/beforequery.png) | ![](https://github.com/mukunku/ParquetViewer/blob/master/wiki_images/afterquery.png) |

The Clear button only removes the filter from the results in the grid below; it does not clear the query text that you have entered. You may hit the Esc key while editing the query to quickly clear any existing query filters.

| Before Clear | After Clear |
| ------------ | ----------- |
| ![](https://github.com/mukunku/ParquetViewer/blob/master/wiki_images/afterquery.png) | ![](https://github.com/mukunku/ParquetViewer/blob/master/wiki_images/beforequery.png) |

Note that queries do not run against the entire Parquet file, only against the records that have been loaded into memory. For more information please see [Query Scope](https://github.com/mukunku/ParquetViewer/wiki/Running-Queries#query-scope).

## Query Scope

Currently, queries only apply to records that have been loaded into the application (the first 1000 records by default). To run your queries against more records you must increase the Record Count so that more data from the Apache Parquet file is loaded into the application. Loading more data requires more RAM, so this might become troublesome for really large files. See [Tips For Large Files](https://github.com/mukunku/ParquetViewer/wiki/Tips-For-Large-Files) for some hints on how to deal with that.

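For anyone generating these filter strings from a script rather than typing them, here is a minimal Python sketch of the date-literal and field-escaping rules described above. The helper names (`date_literal`, `escape_field`, `between_dates`) are hypothetical and not part of ParquetViewer; the sketch only builds the text you would paste into the query box. It always emits the `yyyy/MM/dd` form, since that one cannot be confused with a day-first format, and it refuses field names containing `]` or a backslash because the wiki text does not say how those should be escaped.

```python
from datetime import date


def date_literal(d: date) -> str:
    """Format a date as a #yyyy/MM/dd# literal, as in the Datetime row of the table."""
    return f"#{d.year:04d}/{d.month:02d}/{d.day:02d}#"


def escape_field(name: str) -> str:
    """Wrap a field name in square brackets so spaces and punctuation are accepted.

    Assumption: names containing ']' or a backslash need extra escaping that the
    wiki does not describe, so this sketch rejects them instead of guessing.
    """
    if "]" in name or "\\" in name:
        raise ValueError(f"unsupported character in field name: {name!r}")
    return f"[{name}]"


def between_dates(field: str, start: date, end: date) -> str:
    """Build a WHERE clause matching dates strictly between start and end."""
    col = escape_field(field)
    return f"WHERE ({col} > {date_literal(start)} AND {col} < {date_literal(end)})"


# Hypothetical field name, used only for illustration.
query = between_dates("order date", date(2000, 12, 31), date(2001, 12, 31))
print(query)
```

The printed string can then be pasted into the query box as-is. Bracketing every field name, even ones that do not need it, keeps the helper simple.
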
mukunku commented 4 weeks ago

Thanks for the help. I'll make sure this gets into the next release. I also updated the wiki now.

MCRE-BE commented 4 weeks ago

No problem. I didn't know how I could test the changes though 🤷‍♂️