vladiliescu closed 3 years ago
Does the plugin actually crash? I'd expect it to just display an error message and then return to its starting state.
This is because dots are not valid characters in Avro names, and this plugin uses org.apache.parquet:parquet-avro to read Parquet files. See https://avro.apache.org/docs/current/spec.html#names for the full naming rules (which include an explanation of why dots aren't allowed). I tend to stick to underscores as separators.
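For anyone who wants to check or rename columns before writing the file, here is a minimal sketch of the naming rule as I read it from the Avro spec (a name starts with a letter or underscore, followed by letters, digits or underscores). The helper names `is_valid_avro_name` and `sanitize_column` are made up for illustration and are not part of the plugin or of any library.

```python
import re

# Avro name rule per the spec linked above: first character [A-Za-z_],
# remaining characters [A-Za-z0-9_]. Dots are not allowed inside a name.
AVRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_avro_name(name: str) -> bool:
    """Return True if `name` satisfies the Avro name rule (hypothetical helper)."""
    return AVRO_NAME.fullmatch(name) is not None


def sanitize_column(name: str) -> str:
    """Replace dots with underscores (hypothetical helper, handles dots only)."""
    return name.replace(".", "_")


# Example column list; "temperature.value" is the name from this issue.
columns = ["temperature.value", "timestamp"]
renamed = [sanitize_column(c) for c in columns]
```

Applying the renamed list to the data before writing the Parquet file should avoid the SchemaParseException, assuming dots are the only offending characters in the column names.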
Not sure what counts as a crash, to be honest, but I do get an Error prompt plus an IDE Fatal Errors bubble prompt with a stack trace and everything (included below)
Unable to process file /<edited>/data_all.parquet
org.apache.avro.SchemaParseException: Illegal character in: temperature.value
at org.apache.avro.Schema.validateName(Schema.java:1566)
at org.apache.avro.Schema.access$400(Schema.java:91)
at org.apache.avro.Schema$Field.<init>(Schema.java:546)
at org.apache.avro.Schema$Field.<init>(Schema.java:585)
at org.apache.parquet.avro.AvroSchemaConverter.convertFields(AvroSchemaConverter.java:280)
at org.apache.parquet.avro.AvroSchemaConverter.convert(AvroSchemaConverter.java:264)
at org.apache.parquet.avro.AvroReadSupport.prepareForRead(AvroReadSupport.java:134)
at org.apache.parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:185)
at org.apache.parquet.hadoop.ParquetReader.initReader(ParquetReader.java:156)
at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:135)
at uk.co.hadoopathome.intellij.viewer.fileformat.ParquetFileReader.getRecords(ParquetFileReader.java:99)
at uk.co.hadoopathome.intellij.viewer.FileViewerToolWindow$2.doInBackground(FileViewerToolWindow.java:193)
at uk.co.hadoopathome.intellij.viewer.FileViewerToolWindow$2.doInBackground(FileViewerToolWindow.java:184)
at java.desktop/javax.swing.SwingWorker$1.call(SwingWorker.java:304)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.desktop/javax.swing.SwingWorker.run(SwingWorker.java:343)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Yup, another workaround is to simply remove the dots from the column names in the files. I just didn't expect this error to occur, since this seems to be a valid Parquet file.
Thanks for the error. I'm afraid I can't fix this as it's inside the library that the plugin uses to parse the files. I'll look into making the errors more palatable.
Loading a Parquet file with columns such as temperature.value will crash the plugin.