rcongiu / Hive-JSON-Serde

Read - Write JSON SerDe for Apache Hive.

Extra line feed in JSON file creates an extra row in Hive (and count is incorrect) #207

Open bjaggi opened 6 years ago

bjaggi commented 6 years ago

Hello, I am using your SerDe for nested JSON mapping and it works great.

We have a scenario where records are delimited by two line feeds. (It seems Hive only supports a single \n as the delimiter, which is one more reason to go with a custom SerDe.)

Sample input file:

{ "id": "1",

"id":"2"
}

When I do a select from hive_table or a count(), Hive counts the extra line feed as a row. The expected count is 2, but Hive shows 3.

I tried changing some code in this file:

Link_To_JSONObject.java_Line318

New logic: split the text on the \n delimiter, then remove lines that are empty after trimming. This works fine on the test case, but not when I use it in Hive. Any suggestions?
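For reference, the split-and-filter logic described above could look roughly like the sketch below. The class and method names (`BlankLineFilter`, `nonBlankLines`) are hypothetical and not part of Hive-JSON-Serde, and the sample input assumes one complete JSON record per line with a blank line in between. One possible reason this works in a unit test but not in Hive, assuming the table uses the default text input format, is that Hive splits the file into rows on \n before the SerDe is called, so the blank line already arrives as its own row.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hive-JSON-Serde.
class BlankLineFilter {

    // Split the text on '\n' and keep only lines that are non-empty after trimming.
    static List<String> nonBlankLines(String text) {
        List<String> records = new ArrayList<>();
        // A limit of -1 keeps trailing empty strings so every blank line is checked explicitly.
        for (String line : text.split("\n", -1)) {
            if (!line.trim().isEmpty()) {
                records.add(line);
            }
        }
        return records;
    }

    public static void main(String[] args) {
        // Assumed sample: two records separated by two line feeds.
        String input = "{\"id\": \"1\"}\n\n{\"id\": \"2\"}\n";
        List<String> records = nonBlankLines(input);
        System.out.println(records.size()); // prints 2
    }
}
```

If the row splitting does happen before the SerDe runs, a filter like this would have to be applied to the file before loading it into the table, rather than inside the SerDe's JSON parsing code.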

rcongiu commented 4 years ago

Hmm, do you have the complete JSON you're using, like an actual file? The one you posted should not work at all, since the SerDe only supports one JSON record per line, with no \n inside a record.