Open exalate-issue-sync[bot] opened 1 year ago
Dave Finnegan commented: Additional info:
The same avro file is recognized when an importFile is run from a standalone H2O Instance (non-hadoop invocation).
Also, the source for this avro file is the airlines allyears.1987.2013.csv taken from s3. The final two cols have values of 'NO' or 'YES'. However, the avro version of the file contains 'YES', 'NO', and 'NOS'. There are just a couple of 'NO' values; most were changed to 'NOS'. The avro file was created via Hive by importing the csv file, then creating an avro table and inserting the text table content into the avro table.
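One way to confirm that the source csv really contains only 'YES' and 'NO' in the final two cols is to count the distinct values directly. The following is a minimal Python sketch, not part of the original report. It assumes the csv has a header row whose last two column names match the Hive table definition (IsArrDelayed, IsDepDelayed); the helper name count_values and the in-memory sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def count_values(lines, columns=("IsArrDelayed", "IsDepDelayed")):
    """Count the distinct values of each named column in CSV text with a header row."""
    reader = csv.DictReader(lines)
    counts = {name: Counter() for name in columns}
    for row in reader:
        for name in columns:
            counts[name][row[name]] += 1
    return counts


# Tiny in-memory sample standing in for the real file (hypothetical rows).
sample = io.StringIO(
    "Year,IsArrDelayed,IsDepDelayed\n"
    "1987,YES,NO\n"
    "1987,NO,NO\n"
)
print(count_values(sample))
```

For the real file, an open file handle can be passed in place of the sample, for example `count_values(open(path, newline=""))` with `path` pointing at the local copy of the csv. Any value other than 'YES' or 'NO' in the result would mean the discrepancy already exists in the source rather than being introduced by the Hive conversion.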
JIRA Issue Migration Info
Jira Issue: PUBDEV-5205 Assignee: New H2O Bugs Reporter: Dave Finnegan State: Open Fix Version: N/A Attachments: Available (Count: 1) Development PRs: N/A
Attachments From Jira
Attachment Name: avro_parse_error Attached By: Dave Finnegan File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-5205/avro_parse_error
When running fileImport from Flow and attempting to read an Avro file, the Parse Configuration cell defaults to CSV Parser, SOH Separator, and 3 cols, the first two named Obj and aavro.schema. The parse then fails even after setting the Parser type to Avro, Separator to Auto, and Column Headers to Auto.
The error is:
Error evaluating cell Error calling POST /3/Parse with opts ["destination-frame":"X000000_01.hex","... ERROR MESSAGE: given val type is not supported: java.lang.NoSuchMethodError
Flow UI log attached
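The guessed CSV setup is consistent with the Avro header being read as text: an Avro object container file begins with the four bytes 'O', 'b', 'j', 0x01 (0x01 is the SOH character), followed by file metadata that includes the avro.schema key. Whether this is exactly how the parser guess arrived at its defaults is an assumption. The minimal Python sketch below checks a file for that magic header, which can help confirm that the file on HDFS is a valid Avro container before blaming the parser; the helper names are hypothetical.

```python
# First four bytes of an Avro object container file, per the Avro specification.
AVRO_MAGIC = b"Obj\x01"


def looks_like_avro(header: bytes) -> bool:
    """Return True if the given leading bytes start with the Avro container magic."""
    return header[:4] == AVRO_MAGIC


def file_looks_like_avro(path: str) -> bool:
    """Read the first four bytes of a local file and check them against the magic."""
    with open(path, "rb") as f:
        return looks_like_avro(f.read(4))


# Leading bytes of a container file versus the start of a plain csv header line.
print(looks_like_avro(b"Obj\x01\x04\x14avro.codec"))
print(looks_like_avro(b"Year,Month,DayofMonth"))
```

The check only inspects the header, so it says nothing about whether the schema or the data blocks are readable.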
Avro file was created with Hive by creating an avro table and then inserting data from a text table as follows:
Create Avro table
hive> create table if not exists allyears2k_avro ( Year int, Month int, DayofMonth int, DayOfWeek int, DepTime int, CRSDepTime int, ArrTime int, CRSArrTime int, UniqueCarrier String, FlightNum int, TailNum String, ActualElapsedTime int, CRSElapsedTime int, AirTime String, ArrDelay int, DepDelay int, Origin String, Dest String, Distance int, TaxiIn String, TaxiOut String, Cancelled int, CancellationCode String, Diverted int, CarrierDelay String, WeatherDelay String, NASDelay String, SecurityDelay String, LateAircraftDelay String, IsArrDelayed String, IsDepDelayed String) row format delimited fields terminated by ',' lines terminated by '\n' stored as avro location "/user/dave/data/data_format_testing/allyears2k_avro" ;
Insert data from the text table into the Avro table
hive> insert overwrite table allyears2k_avro select * from allyears2k_txt;