rcongiu / Hive-JSON-Serde

Read - Write JSON SerDe for Apache Hive.

escape characters handling #43

Closed ghost closed 11 years ago

ghost commented 11 years ago

I noticed that the JSON SerDe translates (deserializes) escaped characters back to their original values; for example, \n is translated to a newline instead of being kept as the literal string "\n".

Is there an option to suppress this functionality and keep the string as is?

thank you

rcongiu commented 11 years ago

Hi...ehm...Jackass-io? Once the string is deserialized, there's no way to know which representation it had before serialization. But if the deserialized string has an unescaped \n, it's definitely a bug in the bundled JSON library. I was planning to replace it with Jackson at some point; this may be another point in favor of that change.

R.

 

"Good judgment comes from experience.

Experience comes from bad judgment"

Data Engineer - OpenX.org Pasadena, CA Skype: sardodazione Y! IM: rcongiu

On Tuesday, November 12, 2013 4:04 PM, Jackass-io notifications@github.com wrote:

I noticed JSON serde translates (deserializes) escaped characters back to the original value, for example \n will be translated to newline instead of kept as simple string "\n".

Is there an option to suppress this functionality and keep the string as is?

When I ran a test that included all characters in the range 01-31, e.g.

{"mytest":"\u0001\u0002\u0003\u0004\u0005\u0006\u0007\b\t\n\u000b\f\r\u000e\u000f\u0010\u0011\u0012\u0013\u0014\u0015\u0016\u0017\u0018\u0019\u001a\u001b\u001c\u001d\u001e\u001f "}

as a field using the JSON SerDe, I was not able to parse this string at all, and functions like length returned an incorrect result.

Thank you
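To make the two points in this exchange concrete, here is a minimal sketch using Python's standard `json` module as a stand-in. That choice is an assumption: the SerDe is a Java project that bundles a different JSON library, so its exact behavior may differ. The sketch shows that two escaped spellings of a newline decode to the same value (so the original spelling cannot be recovered afterwards), and what length a standards-compliant parser reports for a control-character test string like the one above.

```python
import json

# Two different escaped spellings of the same character in the raw JSON text.
raw_short = r'{"mytest":"a\nb"}'
raw_unicode = r'{"mytest":"a\u000ab"}'

short_value = json.loads(raw_short)["mytest"]
unicode_value = json.loads(raw_unicode)["mytest"]

# Both decode to the same 3-character string containing a real newline,
# so the decoded value carries no trace of which spelling was used.
same_value = short_value == unicode_value
decoded_len = len(short_value)

# A test string in the spirit of the report: every control character
# 0x01-0x1f written as a \uXXXX escape, followed by one space.
escaped = "".join(f"\\u{code:04x}" for code in range(1, 32)) + " "
control_value = json.loads('{"mytest":"' + escaped + '"}')["mytest"]

# 31 control characters plus the trailing space.
control_len = len(control_value)
```

With this parser the decoded test string has 32 characters, which gives a baseline to compare against whatever length Hive reports for the same field.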

ghost commented 11 years ago

Thank you for the quick response!

It would certainly be a nice option not to deserialize escape characters and to leave them as-is (although I think some might argue otherwise).

p.s. jackass.io was still available, so I went for it.... seems like it will be chasing me for years to come ;-)

RIHABOUR commented 4 years ago

Hi, can you explain the solution, Jackass-io, or how I can avoid deserializing escape characters and leave them as-is? I have the same issue and would be grateful for your help. PS: I know that the issue is closed, but I didn't want to open a new issue on the same subject.
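No SerDe option for this is mentioned anywhere in this thread, so the following is only a sketch of two generic workarounds, written in Python with the standard `json` module rather than against the SerDe itself. The helper names `reescape` and `protect_escapes` are hypothetical. The first re-escapes a value after it has been decoded; the second doubles the backslashes in the raw text before parsing, so the parser yields the escape sequences as literal text.

```python
import json


def reescape(value: str) -> str:
    """Return the JSON-escaped spelling of a decoded string, without the quotes.

    The result is a canonical spelling, not necessarily the original one:
    a newline written as \\u000a in the input comes back as \\n.
    """
    return json.dumps(value)[1:-1]


def protect_escapes(raw_json: str) -> str:
    """Double every backslash so escape sequences survive parsing as text.

    Naive on purpose: it also doubles the backslash in an escaped quote,
    which would break the document, so it only suits input without them.
    """
    return raw_json.replace("\\", "\\\\")


raw = r'{"mytest":"line1\nline2\ttab"}'

# Workaround 1: decode normally, then re-escape the value.
decoded = json.loads(raw)["mytest"]
restored = reescape(decoded)

# Workaround 2: pre-process the raw text, then decode.
kept_literal = json.loads(protect_escapes(raw))["mytest"]
```

Both approaches end with a string that contains a backslash followed by `n` rather than a real newline. Which one fits depends on whether the raw JSON can be rewritten before it reaches the table, or only the decoded column value is available.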