rcongiu / Hive-JSON-Serde

Read - Write JSON SerDe for Apache Hive.

Failure when the string is empty #193

Closed codesolace closed 4 years ago

codesolace commented 7 years ago

When the string is empty (""), it fails with org.openx.data.jsonserde.json.JSONObject cannot be cast to org.openx.data.jsonserde.json.JSONArray. Why does the getJSONArray method return null when the value is an empty string? If it returned new JSONArray() instead, it would not fail like this.

rcongiu commented 7 years ago

This does not make sense: if the string is empty, the field is a string, not an array. Anyway, I can have a look at it if you submit a reproducible example of the issue (code and data). You can have a look at http://www.panopticdev.com/blog/7-best-practices-bug-reporting/ for suggestions on how to report bugs effectively.

codesolace commented 7 years ago

Hi @rcongiu ,

Attached are the steps to reproduce the issue. It only happens when a MapReduce job is spawned.

failure_scenario.txt

PS: My first comment on the issue came across as strong. Apologies for that; it was not intended.

rcongiu commented 7 years ago

Well, I don't think this is a problem in the SerDe. In your example, the field test is declared as an array of strings, but is sometimes just a string. Letting the SerDe assume that a string is a one-element array could break other users who (rightfully) expect the system to fail on incorrectly parsed data. I could make it an option that has to be turned on in the SerDe, but it would be turned off by default.

{"dt":1,"first":{"second":[{"test":["a"],"pest":"p"}]}}
{"dt":1,"first":{"second":[{"test":"b","pest":"p"}]}}
{"dt":2,"first":{"second":[{"test":"","pest":"p"}]}}
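
For illustration only, the opt-in behavior discussed above could be sketched roughly as follows. This is not code from Hive-JSON-Serde: the class name LenientArrayCoercion, the method asStringList, and the lenient flag are all hypothetical, and plain java.util types stand in for the SerDe's own JSONArray. Mapping an empty string to an empty array follows the reporter's suggestion and is an assumption, not confirmed behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of Hive-JSON-Serde: sketches an opt-in
// "treat a bare string as a one-element array" mode that is off by default.
final class LenientArrayCoercion {
    private LenientArrayCoercion() {}

    /**
     * Converts a parsed JSON value to a list of strings for a column
     * declared as array<string>.
     *
     * @param value   the parsed value: a List, a String, or null
     * @param lenient hypothetical opt-in flag; when false, a bare string fails
     */
    static List<String> asStringList(Object value, boolean lenient) {
        if (value == null) {
            return null; // a missing field stays null
        }
        if (value instanceof List<?>) {
            // The value really is an array: copy its elements as strings.
            List<String> out = new ArrayList<>();
            for (Object item : (List<?>) value) {
                out.add(item == null ? null : item.toString());
            }
            return out;
        }
        if (lenient && value instanceof String) {
            String s = (String) value;
            List<String> out = new ArrayList<>();
            // Assumption: "" becomes an empty array, "b" becomes ["b"].
            if (!s.isEmpty()) {
                out.add(s);
            }
            return out;
        }
        // Default (strict) behavior: fail on data that does not match the schema.
        throw new ClassCastException(
            value.getClass().getName() + " cannot be used as an array");
    }
}
```

With the three sample rows above, strict mode accepts only the first row (test is ["a"]) and throws for "b" and "". Lenient mode would yield ["a"], ["b"], and an empty list, respectively.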

rcongiu commented 4 years ago

Closing old issue