embulk / embulk-base-restclient

Base class library for Embulk plugins to access RESTful services
https://www.embulk.org/
Apache License 2.0
6 stars 7 forks source link

Handle type errors in (Jackson)ServiceValue and *ColumnWriters #12

Closed dmikurube closed 7 years ago

dmikurube commented 7 years ago

@muga Some questions in the original util-web_api.

1) Some technical reasons why *value() is used in Boolean, Double and Long / asText() is used for String and Timestamp / toString() is used for Json? 1-1) Broken down: asText v.s. toString, and how about unused textValue? 1-2) Broken down: how about asBoolean, asDouble and asLong?

2) In Json, special reasons to once convert into String and then parse that. Not to directly convert JsonNode to msgpack Value?

muga commented 7 years ago
Some technical reasons why *value() is used in Boolean, Double and Long / asText() is used for String and Timestamp / toString() is used for Json?

@dmikurube Good catch. It's just bad implementation in embulk-util-web_api. I didn't care of them. I don't check the source code of Jackson for now but, it sounds that the usage of asXXX methods is better than others?

In Json, special reasons to once convert into String and then parse that. Not to directly convert JsonNode to msgpack Value?

Do you have good idea to convert them directly? I didn't know about it at that time. We should take it if you have.

dmikurube commented 7 years ago

@dmikurube Good catch. It's just bad implementation in embulk-util-web_api. I didn't care of them. I don't check the source code of Jackson for now but, it sounds that the usage of asXXX methods is better than others?

Not perfectly sure. In general, as* tries to convert the value in some way within Jackson. *value just returns its real representation. Then,

To make it easy to track later problem investigation, it may be better to control type conversions by ourselves, but it's not a trivial decision.

Do you have good idea to convert them directly? I didn't know about it at that time. We should take it if you have.

No special ideas at the moment. I wanted to confirm whether there ware some reasons or not. :)

dmikurube commented 7 years ago

@muga Seeing Jackson source code.

The key points are numeric <=> text. If we'd like to control how numerics are formatted to texts, and how texts are parsed to numerics, we'd use *Value and handle conversions by ourselves by checking types beforehand.

dmikurube commented 7 years ago

asBoolean may be the most clear example.

If the library is requested to interpret a text in JSON as boolean in Java, Jackson's asBoolean recognizes only "true" and "false" -- no TRUE, False, or 0.

It happens when the following JSON is responded:

{ "flag": "True" }

and the plugin is implemented:

        return SchemaWriterFactory.builder()
                .add("flag", Types.BOOLEAN)
dmikurube commented 7 years ago

One idea is just following Jackson's as* (meaning we assume only "true" and "false") in the library of ours.

If developers need other kinds of true/false, it's somewhat possible to override by extending ServiceValue and ServiceRecord by themselves. I guess it is not a very popular case...

Since as* methods don't fail for typing reasons, error handling may not be needed.

muga commented 7 years ago
If developers need other kinds of true/false, it's somewhat possible to override by extending ServiceValue and ServiceRecord by themselves. I guess it is not a very popular case...

Since as* methods don't fail for typing reasons, error handling may not be needed.

Agree. Let's use asXXX methods in Jackson for now.

muga commented 7 years ago

Do you have good idea to convert them directly? I didn't know about it at that time. We should take it if you have.

No special ideas at the moment. I wanted to confirm whether there ware some reasons or not. :)

Yea, we could extend msgpack-core or jackson-binding to do that. But now they don't provide it yet.

dmikurube commented 7 years ago

Got it. #31 is merged. Closing. Filed #32 just for the future.