mukunku / ParquetViewer

Simple Windows desktop application for viewing & querying Apache Parquet files
GNU General Public License v3.0

The query doesn't not seem to be valid. Please try again cannot find column 'user_name' #103

Closed MMirabito closed 7 months ago

MMirabito commented 7 months ago

Parquet Viewer Version 2.10.1.1

Where was the parquet file created? Python

import pandas as pd

def saveAsParquet():
    print("Saving...")
    print(f"Parquet format: [{PARQUET_FILE}]")

    # Load the pipe-delimited text file, reading every column as a string.
    # MERGED_FILE and PARQUET_FILE are paths that are not shown in this snippet.
    df = pd.read_csv(MERGED_FILE, sep='|', encoding='latin1', dtype='str')
    df.to_parquet(PARQUET_FILE)

    print("Completed!")

Sample File I cannot

Describe the bug When I try to add a WHERE clause, I get the error "The query doesn't seem to be valid. Please try again cannot find column 'user_name'". However, queries that use only the first column, DBUSID, work. What am I doing wrong?

Screenshots 2024-03-26_20-31-54

Additional context

Any help is greatly appreciated.

Thanks, max

Note: This tool relies on the parquet-dotnet library for all the actual Parquet processing. So any issues where that library cannot process a parquet file will not be addressed by us. Please open a ticket on that library's repo to address such issues.

mukunku commented 7 months ago

Can you share your parquet file's metadata? You can hit Ctrl + M or go to Tools -> Metadata Viewer

MMirabito commented 7 months ago

Hi @mukunku, I think I figured out what the issue is. It came to me as I was waking up.

My original file was right-padded with spaces so that the data set is easy to read when opened in Notepad. The padding also applied to the column names, and once I removed it the query started working.
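For anyone hitting the same error, here is a rough sketch of one way the padding could be removed in pandas before writing the Parquet file. It is not necessarily how the file above was fixed. The helper name `strip_padding` and the inline sample data are made up for illustration, and the sketch assumes every column was read as a string (as with `dtype='str'` in the script above).

```python
import io

import pandas as pd


def strip_padding(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with whitespace trimmed from column names and string cells.

    Hypothetical helper; assumes all columns hold strings.
    """
    out = df.copy()
    # Trim the padding from the header row first.
    out.columns = out.columns.str.strip()
    # Then trim every cell, column by column.
    return out.apply(lambda col: col.str.strip())


# Tiny stand-in for the padded, pipe-delimited text file.
sample = io.StringIO(
    "DBUSID   |user_name   \n"
    "1        |alice       \n"
    "2        |bob         \n"
)
df = pd.read_csv(sample, sep="|", dtype="str")
cleaned = strip_padding(df)

print(list(df.columns))       # padded names, as read from the file
print(list(cleaned.columns))  # names after trimming

# cleaned.to_parquet(PARQUET_FILE)  # then write as in the original script
```

Trimming only the column names would be enough to make `user_name` resolvable in a query; trimming the cell values as well is optional and changes the stored data, so skip that step if the padding is meaningful.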

2024-03-27_06-48-55

Thanks again for reaching out. BTW, nice software: very easy to use and run.

max

mukunku commented 7 months ago

Glad it's working for you and thanks for the feedback 👍🏼

Keep in mind you can also query column names with spaces in them by wrapping the column name in brackets. Example: [user_name ]. This is mentioned in the wiki under Escaping Field Names.
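If the padding has to stay in the file, a small script can show which column names carry hidden whitespace and what their bracketed form would look like. This is only a sketch: the helper names `padded_names` and `bracket` are hypothetical, the column list is sample data, and names that themselves contain brackets are deliberately rejected because their escaping rules are not covered in this thread.

```python
def padded_names(names):
    """Return the column names that have leading or trailing whitespace."""
    return [n for n in names if n != n.strip()]


def bracket(name: str) -> str:
    """Wrap a column name in square brackets, keeping its padding intact."""
    if "[" in name or "]" in name:
        # Escaping of bracket characters is out of scope for this sketch.
        raise ValueError(f"unsupported column name: {name!r}")
    return f"[{name}]"


# Sample column names; with pandas these could come from list(df.columns).
columns = ["DBUSID", "user_name   "]

for name in padded_names(columns):
    # repr() makes the trailing spaces visible.
    print(f"{name!r} is padded; refer to it as {bracket(name)}")
```

The number of spaces inside the brackets has to match the stored column name exactly, which is why printing the names with `repr()` first is useful.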

MMirabito commented 7 months ago

thank you for the tip

max