SpectoLabs / hoverfly

Lightweight service virtualization/ API simulation / API mocking tool for developers and testers
https://hoverfly.io
Apache License 2.0

Character Encoding Causes Content-Length Headers Failure #818

Open unverified-contact opened 5 years ago

unverified-contact commented 5 years ago

Hoverfly mishandles "special characters" in the response body when a simulation is exported and re-imported while the Content-Length header is present. The problem does not occur if I manually remove the Content-Length header from the simulation file.

This can be demonstrated with a simple request to Google:

moth@debian:~/hoverfly$ ./hoverctl start
Hoverfly is now running

+------------+------+
| admin-port | 8888 |
| proxy-port | 8500 |
+------------+------+

moth@debian:~/hoverfly$ ./hoverctl mode capture --all-headers
Hoverfly has been set to capture mode and will capture all request headers

moth@debian:~/hoverfly$ curl -L --proxy localhost:8500 http://google.com --cacert cert.pem
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en-AU"><head><meta content="text/html;>
[TRUNCATED FOR CLEANLINESS]

moth@debian:~/hoverfly$ ./hoverctl export simulation.json
Successfully exported simulation to simulation.json

moth@debian:~/hoverfly$ ./hoverctl stop
Hoverfly has been stopped

moth@debian:~/hoverfly$ ./hoverctl start
Hoverfly is now running

+------------+------+
| admin-port | 8888 |
| proxy-port | 8500 |
+------------+------+

moth@debian:~/hoverfly$ ./hoverctl import simulation.json
WARNING: Response contains incorrect Content-Length header on data.pairs[1].response, please correct or remove header

simulation_google.json

The warning itself is not much of a problem (it is only a warning, and a valid one), but curl then fails on that endpoint in simulate mode:

moth@debian:~/hoverfly$ curl -L --proxy localhost:8500 http://google.com --cacert cert.pem
curl: (18) transfer closed with 11194 bytes remaining to read

I spoke to @tommysitu on Gitter about this, and he asked me to open an issue to explore it further. His comments on the problem:

Tommy Situ @tommysitu 03:33 It's to do with the encoding. When the body is exported and imported again, the string with the special character Advertising\ufffdProgrammes is converted to Advertising�Programmes, which results in a mismatched Content-Length.

Is this something that Hoverfly should handle, or is there a better way for me to deal with these cases than removing the Content-Length header from every simulation where I hit this problem?

tommysitu commented 5 years ago

You can try requesting https://google.com instead:

curl -L --proxy localhost:8500 https://google.com --cacert cert.pem

When Hoverfly acts as an HTTPS MITM proxy, it writes the response using chunked transfer encoding, so the recorded Content-Length should not get in the way.

tommysitu commented 5 years ago

This issue happens during import. The JSON unmarshaller converts \ufffd to �, which changes the length of the body. The stringified rune uses more bytes, I think?
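A minimal Go sketch of that round trip, using only the standard `encoding/json` package rather than Hoverfly's own import code. It assumes the captured body contained a single non-UTF-8 byte (0xA0, a Latin-1 non-breaking space) between the two words; that byte value is a guess and is not confirmed by the capture above. `roundTrip` is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTrip marshals a body to JSON and unmarshals it again,
// loosely mimicking an export followed by an import.
func roundTrip(body string) (string, error) {
	// json.Marshal coerces strings to valid UTF-8 and writes each
	// invalid byte as the escaped replacement rune \ufffd.
	exported, err := json.Marshal(body)
	if err != nil {
		return "", err
	}
	var imported string
	if err := json.Unmarshal(exported, &imported); err != nil {
		return "", err
	}
	return imported, nil
}

func main() {
	// Assumption: one invalid byte (0xA0) sits between the words.
	original := "Advertising\xa0Programmes"

	imported, err := roundTrip(original)
	if err != nil {
		panic(err)
	}

	// U+FFFD takes three bytes in UTF-8, so the body grows by two
	// bytes for every single invalid byte it replaces.
	fmt.Printf("original: %d bytes\n", len(original))
	fmt.Printf("imported: %d bytes\n", len(imported))
}
```

Under that assumption, the imported body is two bytes longer than the original for each replaced byte, so a Content-Length recorded at capture time no longer matches the body.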

One possible fix is to make Hoverfly return a correct Content-Length header value, ignoring the one in the simulation. This could be desirable if the user is already using templating or middleware that changes the body content. However, this fix might break existing users' tests if they check the Content-Length value as it is.