microsoft / AzureStorageExplorer

Easily manage the contents of your storage account with Azure Storage Explorer. Upload, download, and manage blobs, files, queues, tables, and Cosmos DB entities. Gain easy access to manage your virtual machine disks. Work with either Azure Resource Manager or classic storage accounts, plus manage and configure cross-origin resource sharing (CORS) rules.
Creative Commons Attribution 4.0 International

Preview of parquet files returning 'no data' #7805

Closed ke-vdv closed 7 months ago

ke-vdv commented 8 months ago

Storage Explorer Version

1.33.0

Regression From

No response

Architecture

arm64

Storage Explorer Build Number

20240301.4

Platform

macOS

OS Version

Sonoma 14.2.1

Bug Description

Previewing a parquet file shows 'No data' and logs an activity error with the following details:

{
  "name": "Error",
  "message": "Unable to preview 'X_29_7-1.parquet'.",
  "stack": "Error: Unable to preview 'X_29_7-1.parquet'.\n    at fetchParquetData (/Applications/Microsoft Azure Storage Explorer.app/Contents/Resources/app/node_modules/@storage-explorer/file-preview/dist/src/index.js:102:2408)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async FilePreview.fetchTabularData (/Applications/Microsoft Azure Storage Explorer.app/Contents/Resources/app/node_modules/@storage-explorer/file-preview/dist/src/index.js:102:4366)\n    at async Je.executeOperation (/Applications/Microsoft Azure Storage Explorer.app/Contents/Resources/app/out/app/node/NodeProcessHostProxy.js:3:2742)\n    at async Bt._handleExecuteRequest (/Applications/Microsoft Azure Storage Explorer.app/Contents/Resources/app/out/app/node/NodeProcessHostProxy.js:7:915)\n    at async Bt._handleMessage (/Applications/Microsoft Azure Storage Explorer.app/Contents/Resources/app/out/app/node/NodeProcessHostProxy.js:6:30033)\n    at async /Applications/Microsoft Azure Storage Explorer.app/Contents/Resources/app/out/app/node/NodeProcessHostProxy.js:6:27544"
}

Steps to Reproduce

  1. Launch Storage Explorer
  2. Expand a blob container node
  3. Navigate to a parquet file (which was created by a copy task from a json file to a parquet file within a Synapse Analytics Workspace)
  4. Open context menu
  5. Choose 'Preview'

Actual Experience

The Preview window is showing 'No data'

Expected Experience

The Preview window should return a column-based parquet file output

Additional Context

The file is approximately 2 MB. Files of about 100 KiB can be opened, though. Could this be related to the hardware specs?

JasonYeMSFT commented 8 months ago

The error indicates that the parquet.js library hit a parsing error when it tried to read your parquet file. 2 MB is a reasonable size, so size alone is unlikely to be the cause. It's possible that the 2 MB file uses a field type or encoding the library doesn't support, and the 100 KiB file doesn't. Would you like to try parsing the parquet file with the library directly to see what error it reports? If it reports an error, you can open a feature request in their repository.

The library's README has instructions on how to read an existing parquet file: https://github.com/LibertyDSNP/parquetjs?tab=readme-ov-file#usage-reading-files
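
For anyone who wants to try this, here is a minimal sketch of what that could look like, based on the reading example in the parquetjs README (`ParquetReader.openFile`, `getCursor`, `cursor.next`, `close`). The helper name `diagnoseParquet`, the `maxRows` parameter, and the shape of the returned object are made up for illustration and are not part of the library or of Storage Explorer. The library module is passed in as an argument, so the sketch assumes you have installed `@dsnp/parquetjs` yourself.

```javascript
// Hypothetical helper: tries to read the first few rows of a parquet file and
// returns the parser's own error message if reading fails.
// `parquet` is assumed to be the module object from require('@dsnp/parquetjs').
async function diagnoseParquet(parquet, filePath, maxRows = 5) {
  let reader;
  try {
    // Opening the file parses the footer metadata and schema.
    reader = await parquet.ParquetReader.openFile(filePath);
    const cursor = reader.getCursor();
    const rows = [];
    let record;
    // cursor.next() resolves to null once there are no more rows.
    while (rows.length < maxRows && (record = await cursor.next())) {
      rows.push(record);
    }
    return { ok: true, rows };
  } catch (err) {
    // This message is what would be worth quoting in a report to the library.
    return { ok: false, error: err && err.message ? err.message : String(err) };
  } finally {
    if (reader) {
      await reader.close();
    }
  }
}

// Example usage (assumes `npm install @dsnp/parquetjs` has been run):
// const parquet = require('@dsnp/parquetjs');
// diagnoseParquet(parquet, 'X_29_7-1.parquet').then(console.log);
```

If the result has `ok: false`, the `error` text should point at the field type or encoding the library could not handle, which is more specific than the "Unable to preview" message Storage Explorer shows.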

MRayermannMSFT commented 8 months ago

Alternatively, @ke-vdv, if you are unable to follow the instructions at the link, you can share a parquet file that doesn't work and we can try it out ourselves. Thanks!

JasonYeMSFT commented 7 months ago

We fixed a similar issue in 1.34.0 (https://github.com/microsoft/AzureStorageExplorer/issues/7807). If the parquet file you couldn't open contains a Decimal field, it is very likely the same issue.

MRayermannMSFT commented 7 months ago

Closing due to lack of response. If you still require help, we recommend you open an Azure support ticket via the portal. Alternatively, you can open a new issue here. This issue will no longer be monitored.