benwatson528 / intellij-avro-parquet-plugin

A Tool Window plugin for IntelliJ that displays Avro and Parquet files and their schemas in JSON.
Apache License 2.0

Unable to process file #25

Closed adrianignat13 closed 4 years ago

adrianignat13 commented 4 years ago

I am trying to view files dumped by Azure Capture from Azure EventHubs to Azure BlobStorage and I get the following error:

2020-05-22 14:13:10,761 [246031109] ERROR - ij.viewer.FileViewerToolWindow - IntelliJ IDEA 2020.1.1 Build #IC-201.7223.91
2020-05-22 14:13:10,761 [246031109] ERROR - ij.viewer.FileViewerToolWindow - JDK: 11.0.6; VM: OpenJDK 64-Bit Server VM; Vendor: JetBrains s.r.o
2020-05-22 14:13:10,761 [246031109] ERROR - ij.viewer.FileViewerToolWindow - OS: Windows 10
2020-05-22 14:13:10,761 [246031109] ERROR - ij.viewer.FileViewerToolWindow - Plugin to blame: Avro and Parquet Viewer version: 1.1.2
2020-05-22 14:14:04,793 [246085141] INFO - ij.viewer.FileViewerToolWindow - Received file D:\downloads\13_02_45_29.avro
2020-05-22 14:14:04,803 [246085151] INFO - ewer.fileformat.AvroFileReader - Retrieved 347 records
2020-05-22 14:14:04,842 [246085190] INFO - ij.viewer.table.TableFormatter - Found 6 unique columns
2020-05-22 14:14:24,678 [246105026] ERROR - ij.viewer.FileViewerToolWindow - Unable to process file
java.lang.UnsupportedOperationException: JsonObject
    at com.google.gson.JsonElement.getAsString(JsonElement.java:179)
    at uk.co.hadoopathome.intellij.viewer.table.TableFormatter.getRows(TableFormatter.java:44)
    at uk.co.hadoopathome.intellij.viewer.table.JTableHandler.updateTable(JTableHandler.java:30)
    at uk.co.hadoopathome.intellij.viewer.FileViewerToolWindow$2.doInBackground(FileViewerToolWindow.java:178)
    at uk.co.hadoopathome.intellij.viewer.FileViewerToolWindow$2.doInBackground(FileViewerToolWindow.java:168)
    at java.desktop/javax.swing.SwingWorker$1.call(SwingWorker.java:304)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.desktop/javax.swing.SwingWorker.run(SwingWorker.java:343)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
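
A note on reading the trace: the top frame is Gson's JsonElement.getAsString, which throws UnsupportedOperationException carrying the class name ("JsonObject") when the element is not a primitive. This suggests that at least one cell value in the file is a nested object rather than a scalar. The sketch below is not the plugin's code. To stay dependency-free it uses hypothetical stand-in types (Node, Scalar, Obj) in place of Gson's JsonElement, JsonPrimitive and JsonObject, and a hypothetical helper cellText that falls back to the element's JSON text when it is not a primitive. With Gson itself, the corresponding check is isJsonPrimitive().

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class CellTextSketch {
    /** Hypothetical stand-in for Gson's JsonElement. */
    interface Node {
        boolean isPrimitive();
        String getAsString();
    }

    /** Stand-in for a JSON primitive; only strings are modelled, with no escaping. */
    static final class Scalar implements Node {
        private final String value;
        Scalar(String value) { this.value = value; }
        public boolean isPrimitive() { return true; }
        public String getAsString() { return value; }
        @Override public String toString() { return "\"" + value + "\""; }
    }

    /** Stand-in for a JSON object; getAsString fails the way the trace shows. */
    static final class Obj implements Node {
        private final Map<String, Node> fields = new LinkedHashMap<>();
        Obj put(String key, Node value) { fields.put(key, value); return this; }
        public boolean isPrimitive() { return false; }
        public String getAsString() {
            throw new UnsupportedOperationException("JsonObject");
        }
        @Override public String toString() {
            return fields.entrySet().stream()
                .map(e -> "\"" + e.getKey() + "\":" + e.getValue())
                .collect(Collectors.joining(",", "{", "}"));
        }
    }

    /** Hypothetical helper: scalars render as their string value, anything else as JSON text. */
    static String cellText(Node value) {
        if (value == null) {
            return "";
        }
        return value.isPrimitive() ? value.getAsString() : value.toString();
    }

    public static void main(String[] args) {
        Node nested = new Obj().put("k", new Scalar("v"));
        System.out.println(cellText(new Scalar("abc"))); // prints abc
        System.out.println(cellText(nested));            // prints {"k":"v"}
    }
}
```

Calling getAsString directly on the nested value would throw, which matches the failure at TableFormatter.getRows in the trace. Checking the element kind first keeps the table renderable even when a column holds structured data.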

benwatson528 commented 4 years ago

Hi Adrian,

Do you have a sample file that you could share so I can recreate this error?

Ben


adrianignat13 commented 4 years ago

The file contains sensitive information and I don't know how to obfuscate it. Weirdly enough, I have another, smaller file, also dumped by Azure Capture, that the plugin can open. The file that the plugin cannot open is ~60 KB; the smaller one is 3 KB.

benwatson528 commented 4 years ago

I'm not able to recreate this issue myself, but I've built a version of the plugin that will either fix your issue or print:

LOGGER.error(
    String.format("Caught JsonObject issue for row [%s], value [%s] and col [%s]",
    flattenedRecordCopy, valueCopy, colCopy));

Could you please install the zip version of the plugin from here (Plugins -> click the cog -> Install Plugin from Disk...) and let me know what happens when you try it with the broken file?

adrianignat13 commented 4 years ago

That fixes the issue. I am impressed with the quick response and the quick fix, without a valid file to test with. Kudos to you, kind sir.

benwatson528 commented 4 years ago

Great news, thanks for confirming. I'll do an official release with the fix in this weekend.

benwatson528 commented 4 years ago

@adrianignat13 the update is live on the marketplace now.