Open ihenry opened 5 months ago
@ihenry Can you share your Parquet file or a small sample file that we could test with?
Thanks @craxal. I have emailed a sample Parquet file to the sehelp mailbox.
Issue reproduced on our end.
Are all of your decimal values intentionally 0? Every buffer that's parsed seems to contain only zeroes.
It seems that the library we use does not currently support decimal values (see https://github.com/LibertyDSNP/parquetjs#list-of-supported-types--encodings). We might be able to work around this by parsing the buffer ourselves.
Yes, that extract was from a system that contains sample data. It appears the default value is zero. I have seen the same behaviour with non-zero decimal values too, but that was real data, which is more difficult to share.
@ihenry I'm having trouble producing a Parquet file with non-zero decimal values encoded as byte arrays. Can you either provide another sample file or guidance as to how you produced the sample you emailed earlier?
@craxal, I have emailed a new sample file to the sehelp email address. I am using SAP Datasphere to extract the data from an SAP system and land that in Microsoft Azure Data Lake Storage Gen2 target.
Thank you very much.
Unfortunately, it does not look like parsing a buffer array into a displayable decimal ourselves is as trivial as I had hoped. If each element in the array had represented a base-10 digit, this would have been pretty straightforward, but that doesn't appear to be the case. Attempting to decode these values correctly ourselves makes me nervous and seems wasteful when the library already has support lined up (LibertyDSNP/parquetjs#91).
We'll keep this work item open for tracking, but we'll need to wait for library support.
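For anyone following along, here is a minimal sketch of what decoding such a buffer involves. It assumes the layout the Parquet format specification describes for DECIMAL columns stored as byte arrays: the bytes are the unscaled integer in big-endian two's complement, and the column's scale says where the decimal point goes. This is not what Storage Explorer or parquetjs does; the function names `decode_parquet_decimal` and `decode_buffer_json` are hypothetical, and the scale has to be supplied by the caller from the column metadata.

```python
import json
from decimal import Decimal


def decode_parquet_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode one DECIMAL value (hypothetical helper, not a parquetjs API).

    Assumes `raw` holds the unscaled value as a big-endian two's-complement
    integer and that `scale` comes from the column's schema metadata.
    """
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    # Build the Decimal from a (sign, digits, exponent) tuple so the result
    # is exact even for 38-digit values; no context rounding is involved.
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(d) for d in str(abs(unscaled)))
    return Decimal((sign, digits, -scale))


def decode_buffer_json(text: str, scale: int) -> Decimal:
    """Decode the JSON form shown in the preview, e.g. {"type":"Buffer","data":[...]}."""
    obj = json.loads(text)
    return decode_parquet_decimal(bytes(obj["data"]), scale)


if __name__ == "__main__":
    # The all-zero buffer from the report, read as a decimal with scale 3.
    print(decode_buffer_json('{"type":"Buffer","data":[0,0,0,0]}', 3))
    # Made-up non-zero examples (not taken from the sample files).
    print(decode_parquet_decimal(bytes([0, 0, 48, 57]), 3))
    print(decode_parquet_decimal(bytes([255, 255, 255, 133]), 3))
```

Under those assumptions, the all-zero buffer from the report decodes to 0.000 for a (5,3) column, which matches what DBeaver shows. The byte values in the last two calls are invented for illustration only.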
Preflight Checklist
Storage Explorer Version
1.33.1
Regression From
No response
Architecture
x64
Storage Explorer Build Number
20240410.2
Platform
All
OS Version
Windows 11 & macOS 14.5
Bug Description
Incorrect preview of Parquet files with columns of multiple decimal precisions: (5,3), (9,5) and (38,6).
Steps to Reproduce
Previewing a Parquet file in Azure Storage Explorer that contains columns defined with various decimal precisions, (5,3), (9,5) and (38,6), shows incorrect results. The file should be previewed as in DBeaver, with 0 or 0.xxx as appropriate. DBeaver with DuckDB shows the following preview:
DBeaver with DuckDB Metadata
Azure Storage Explorer 1.33.1
Actual Experience
Expecting to see raw values, but we actually see {"type":"Buffer","data":[0,0,0,0]}