Kong / unirest-java

Unirest in Java: Simplified, lightweight HTTP client library.
http://kong.github.io/unirest-java/
MIT License

Do not Escape HTML in JSON #375

Closed pedr0-fr closed 3 years ago

pedr0-fr commented 3 years ago

Describe the bug: My local HTTP server replies to the endpoint below with {"data":{"test":"it's a && b || c + 1!?"},"status":"success"}, and the HTTP header "Content-Type" is set to "application/json;charset=utf-8". However, Unirest doesn't decode the response as UTF-8.

To Reproduce: Here is my code:

public static void main(String[] args) throws Exception {
    Unirest.config().verifySsl(false);
    HttpResponse<JsonNode> jsonNode = Unirest.get("https://localhost:8443/status").asJson();
    System.out.println(jsonNode.getBody());
}

Which prints {"data":{"test":"it\u0027s a \u0026\u0026 b || c + 1!?"},"status":"success"}

I am not using HTTP compression or anything else I can think of that could break this. If I extract an HttpResponse<byte[]> instead and decode it as UTF-8 manually, I obtain the expected result.

Expected behavior: Decode the response correctly and get `{"data":{"test":"it's a && b || c + 1!?"},"status":"success"}`

Additional context: The artifact I am using is:

<dependency>
    <groupId>com.konghq</groupId>
    <artifactId>unirest-java</artifactId>
    <version>3.11.01</version>
</dependency>

ryber commented 3 years ago

This doesn't actually have anything to do with UTF-8. It's caused by extra escaping that Gson applies by default (see https://stackoverflow.com/questions/4147012/can-you-avoid-gson-converting-and-into-unicode-escape-sequences).
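To illustrate why the printed output is still valid JSON, here is a minimal sketch showing that the escaped form and the plain form denote the same characters. The class and the `decodeUnicodeEscapes` helper are hypothetical, written only for this illustration, and are not part of Unirest or Gson. The helper assumes well-formed input.

```java
public class UnicodeEscapeDemo {

    // Hypothetical helper: decodes only Unicode escapes (backslash, 'u', four hex
    // digits) and leaves every other character, including other escapes, untouched.
    public static String decodeUnicodeEscapes(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                char next = s.charAt(i + 1);
                if (next == 'u' && i + 6 <= s.length()) {
                    // Parse the four hex digits into a single UTF-16 code unit.
                    out.append((char) Integer.parseInt(s.substring(i + 2, i + 6), 16));
                    i += 6;
                } else {
                    // Some other escape (an escaped quote or backslash): copy both characters.
                    out.append(c).append(next);
                    i += 2;
                }
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The value as printed in the report, with the escapes written out literally.
        String escaped = "it\\u0027s a \\u0026\\u0026 b || c + 1!?";
        System.out.println(decodeUnicodeEscapes(escaped)); // it's a && b || c + 1!?
    }
}
```

Any conforming JSON parser performs this decoding, so the two representations carry the same data; the difference only shows when the serialized text is printed.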

I have to say I'm a little perplexed that Google would make this the default behavior. I would expect the default configuration to follow the plain JSON spec.

I'll get a patch to that effect probably this weekend.

Keep in mind that some characters in JSON strings must always be escaped (such as double quotes, backslashes, newlines, and tabs).
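As a rough sketch of that distinction, assuming the string rules in RFC 8259 (double quote, backslash, and control characters below U+0020 must be escaped), the following escapes only what JSON requires and leaves characters such as `'` and `&` alone. The class and the `escapeJsonString` helper are hypothetical and do not reflect how Unirest or Gson implement this.

```java
public class JsonEscapeSketch {

    // Hypothetical helper: escapes only the characters JSON requires inside a string.
    public static String escapeJsonString(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters get a four-digit hex escape.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        // Everything else, including ' and &, is legal as-is.
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeJsonString("it's a && b || c + 1!?")); // unchanged
        System.out.println(escapeJsonString("say \"hi\"\tnow"));        // say \"hi\"\tnow
    }
}
```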

ryber commented 3 years ago

Complete in 3.11.02

pedr0-fr commented 3 years ago

Thanks a lot for the quick fix, @ryber. By the way, is there a way to use gzip HTTP compression? Something like Unirest.post("https://example.org/foo").setBody("bar").setHTTPcompression("gzip")?

ryber commented 3 years ago

Unirest already automatically handles gzipped responses from servers. Gzipping a request body is pretty uncommon; as far as I know, none of the clients offer that functionality out of the box. Client requests to a server (other than files) tend to be pretty small, and taking the time to gzip bodies might be counter-performant. How large are the bodies you are sending?

pedr0-fr commented 3 years ago

Could be up to 10 MB of hexadecimal data, which would be highly compressible.
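For reference, a request body can be compressed by hand with the JDK alone. The sketch below is not a Unirest feature: the class and its `gzip` and `gunzip` helpers are hypothetical, and the payload is a stand-in for real hexadecimal data. It assumes the compressed bytes would then be sent as the request body with a "Content-Encoding: gzip" header, and that the server accepts compressed request bodies, which many do not by default.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipBodySketch {

    // Compresses the given bytes into gzip format entirely in memory.
    public static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed at this point, so the trailer has been written.
        return buffer.toByteArray();
    }

    // Reverses gzip(); included only to check the round trip.
    public static byte[] gunzip(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Stand-in payload: repetitive hexadecimal text (assumption, not real data).
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < 65536; i++) {
            hex.append("0123456789abcdef");
        }
        byte[] raw = hex.toString().getBytes(StandardCharsets.US_ASCII);
        byte[] compressed = gzip(raw);
        System.out.println(raw.length + " bytes -> " + compressed.length + " bytes");
    }
}
```

Real hexadecimal data will compress less than this repetitive stand-in, so the ratio is worth measuring on a representative sample before deciding whether the extra CPU time pays off.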