varnish / libvmod-curl

cURL Varnish bindings by Varnish Software
Other
48 stars 39 forks source link

#012 character occurring (instead of) CRLF in response body #15

Closed peeceeprashant closed 8 years ago

peeceeprashant commented 10 years ago

Hi,

I download and installed the library as per instructions. When I do a request on curl, the response body (when invoking curl.body(URL)) returns the BODY along with the "#012" character for every occurrence of CRLF.

I was advised to raise a ticket when I posted about the same in the varnish-cache forum.

aondio commented 8 years ago

Hi @peeceeprashant , sorry for the very late reply. Is this problem still present? Do you have a varnishlog for it?

aondio commented 8 years ago

Timed out. Closing.

lkarsten commented 8 years ago

I think this warrants another look.

Please extend test01.vtc to make sure this isn't the case.

aondio commented 8 years ago

Not sure if that's exactly what the reporter is describing. I've tested various type of response bodies handled both by Varnish and the curl vmod. Everything seems to work as expected. https://github.com/aondio/libvmod-curl/commit/407ca532830f18babf971fa20a1116dc02525da6

aondio commented 8 years ago

Anyone have the chance to review the commit from the previous message?

aondio commented 8 years ago

I´ve been pondering about this and came to the conclusion that we can't put a response body into a header(which is exactly what the curl.body() functions does). A response body can contain chars that are not allowed in a header, for instance CRLF. On the top of this there is another question that needs to be evaluated: what happen if the response body is a binary?

Unfortunately I don't have a proper answer for this question, and TBH I can't think of a good use case where a user might want to use the curl.body() function to curl a binary body. Said so, let's assume we want to use the curl.body() function only for non binary bodies, this means that we want t curl the response body for two possible reasons:

  1. we just want to render the response body
  2. we want to have the response body available as a unique string to manipulate it
  3. This can be easily fixed: synthetic(curl.body()) creates a synthetic response and, of course, it is able to handle special characters
  4. In this case the idea is to parse the response body and get rid of the chars that should not be in a header.
gquintard commented 8 years ago

shouldn't the return type for body() be a blob?

fgsch commented 8 years ago

@gquintard that might or might not be needed but it will break VCLs relaying on body() returning text. In any case, this is not specific to this issue and we should address it, if anything, separately.

aondio commented 8 years ago

I haven't managed to recreate the original problem, neither on branch 3.0 nor on 4.x, therefore I've reviewed the source code and indeed there is the possibility that the body() function behaves strange if the response body contains special chars. I've now documented this behaviour in the above PR.

This ticket can be closed once the PR gets merged.