mitre / HTTP-Proxy-Servlet

Smiley's HTTP Proxy implemented as a Java servlet
Apache License 2.0
1.46k stars 555 forks source link

Incompatibility with servlet filters that consume the inputStream #54

Open ppsimatikas opened 9 years ago

ppsimatikas commented 9 years ago

This proxy does not support POST form requests. If you do a POST request in the proxy with content type "application/x-www-form-urlencoded" the proxy gets stuck. After a lot of debugging I figured out that the servletRequest.getInputStream() is empty when the content type is "application/x-www-form-urlencoded".

In the ProxyServlet class under the else statement of the code bellow in service(...) method, it is trying to copy to the proxyRequest the body of the servletRequest. This body is empty so the line proxyResponse1 = this.proxyClient.execute(this.getTargetHost(servletRequest), (HttpRequest)proxyRequest); hangs.

if(servletRequest.getHeader("Content-Length") == null && servletRequest.getHeader("Transfer-Encoding") == null) { proxyRequest = new BasicHttpRequest(method, proxyRequestUri); } else { BasicHttpEntityEnclosingRequest proxyResponse = new BasicHttpEntityEnclosingRequest(method, proxyRequestUri); proxyResponse.setEntity(new InputStreamEntity(servletRequest.getInputStream(), (long)servletRequest.getContentLength())); proxyRequest = proxyResponse; }

A workaround is to manually copy the form body of the servletRequest, servletRequest.getParameterMap().

dsmiley commented 9 years ago

Hi, This is curious behavior you are seeing. AFAIK, a servlet container should not consume the inputStream (thus making it unavailable to the servlet) unless the servlet calls servletRequest.getParameter(name) (or a few related methods) since those methods need to consume the inputStream in order to determine what the parameters are. The servlet is careful not to call those methods. The URITemplateProxyServlet subclass has a comment exactly explaining why it can't call one of those methods. I suspect that you have a servlet filter consuming it, or maybe you've customized the servlet? Which servlet engine are you running?

I wish the test utilities used by tests for this servlet used a real servlet engine like Jetty instead of the test harness which may be convenient but isn't 100% like what a real servlet engine does, including it doesn't consume the stream when I call a getParameter* method. Oh well.

dmpotter44 commented 9 years ago

I can verify that this happens under Tomcat 7.0.57. For whatever reason, when data is POSTed using application/x-www-form-urlencoded, the InputStream is empty. (Fortunately in my use case the content-type header wasn't being set properly and should have been application/json.)

But this creates a timeout error on the other end when it sees a Content-Length header and never receives any data.

Edit: Although on second thought, this may be due to the Spring Security filters that every request is run through. The entire point of proxying the request is to ensure the underlying web service isn't publicly accessible except through the Spring Security filters, so they can't be removed.

dsmiley commented 9 years ago

Hi Dan. Hmmm. Maybe the solution is a work-around like what the O.P. said. If we have a Content-Length header of non-zero, yet we don't see any input, and the content type is application/x-www-form-urlencoded, then we can re-generate a suitable input from the parameters. If the content type is JSON; I don't think there's anything the servlet can do. At least log a warning, I guess. Any other ideas?

In any case, I have no time to work on this but I'd be happy to accept a pull request if we agree on a solution, with a test (ideally).

dmpotter44 commented 9 years ago

I'll see what I can do - the initial run-in with the bug was due to a POST request that was actually JSON being eaten, so changing the Content-Type to be correct fixed the issue. However, there's another portion of the project that may involve actually proxying POSTed forms so when I get to that stage I'll see what I can do to generate a proper work-around with test case.

ppsimatikas commented 9 years ago

Apologies for the late request. I am using the Servlet within a Spring Boot framework and running on top of Tomcat 8 server. It might be Spring boots weird behavior with other servlets consuming the body in the filter chain...

I had to do multiple changes to make it work within Spring Boot. For example, during copyResponseEntity I had to flush the servletOutputStream.

chang-chao commented 9 years ago

1.The body cannot be consumed before HTTP-Proxy-Servlet is called. in the case of application/x-www-form-urlencoded Content-Type ,the body will be consumed when the getParameterXXX is called.

2.When using with spring boot,by default the OrderedHiddenHttpMethodFilter will be applied,where getParameter is called,before the HTTP-Proxy-Servlet service method.

3.There are so many filters or inteceptors of spring where getParameter could be called,so personallly I wound not recommend using HTTP-Proxy-Servlet with spring,let alone spring boot.Although I do like them both.

jackielii commented 7 years ago

My solution here is to reconstruct post form using apache UrlEncodedFormEntity:

  protected HttpRequest newProxyRequestWithEntity(
      String method, String proxyRequestUri, HttpServletRequest servletRequest) throws IOException {
    HttpEntityEnclosingRequest eProxyRequest =
        new BasicHttpEntityEnclosingRequest(method, proxyRequestUri);

    String contentType = servletRequest.getContentType();

    boolean isFormPost =
        (contentType != null
            && contentType.contains("application/x-www-form-urlencoded")
            && "POST".equalsIgnoreCase(servletRequest.getMethod()));

    if (isFormPost) {
      List<NameValuePair> queryParams = Collections.emptyList();
      String queryString = servletRequest.getQueryString();
      if (queryString != null) {
        queryParams = URLEncodedUtils.parse(queryString, Consts.UTF_8);
      }

      Map<String, String[]> form = servletRequest.getParameterMap();
      List<NameValuePair> params = new ArrayList<>();

      OUTER_LOOP:
      for (Iterator<String> nameIterator = form.keySet().iterator(); nameIterator.hasNext(); ) {
        String name = nameIterator.next();

        // skip parameters from query string
        for (NameValuePair queryParam : queryParams) {
          if (name.equals(queryParam.getName())) {
            continue OUTER_LOOP;
          }
        }

        String[] value = form.get(name);
        if (value.length != 1) {
          throw new RuntimeException("expecting one value in post form");
        }
        params.add(new BasicNameValuePair(name, value[0]));
      }
      eProxyRequest.setEntity(new UrlEncodedFormEntity(params, "UTF-8"));

    } else {
      eProxyRequest.setEntity(
          new InputStreamEntity(servletRequest.getInputStream(), getContentLength(servletRequest)));
    }
    return eProxyRequest;
  }

also fixes #83

my fork: https://github.com/jackielii/HTTP-Proxy-Servlet

dsmiley commented 7 years ago

Cool @jackielii This seems like it would make a nice enhancement to this proxy servlet, particularly if this condition could be detected.

jackielii commented 7 years ago

Not sure we can check the servlets input stream reliable, a quick google tells me that InputStream.available() gives the available bytes to read, but could throw an exception say it's already read.

What do you think?

On Thu, Dec 1, 2016 at 4:12 AM, David Smiley notifications@github.com wrote:

Cool @jackielii https://github.com/jackielii This seems like it would make a nice enhancement to this proxy servlet, particularly if this condition could be detected.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/mitre/HTTP-Proxy-Servlet/issues/54#issuecomment-264074320, or mute the thread https://github.com/notifications/unsubscribe-auth/AAWCF_l6hsnZKYu0yaNMcrgdYeVOfCFoks5rDkk8gaJpZM4D7ov6 .

-- Jackie

dsmiley commented 7 years ago

I guess it can't/shouldn't be fully automatic; ah well. Nonetheless it might be a useful mechanism for someone to enable via configuration.

LorandLorincz commented 6 years ago

@dsmiley are you planning on including this in the next release? Thanks

dsmiley commented 6 years ago

Over the past few years, it's the users that drive the additions (PRs), not I, though I steward it and literally do the releases. So, "no" I have no plans for this.

jbaconsult commented 5 years ago

Thanks I came to jackiellis addition. The fork works perfectly. It would be great, if you could merge this somehow to the main repo.

dsmiley commented 5 years ago

It'd be nice to make this an option (default false). Someone could provide a PR here.

EdgewareRoad commented 5 years ago

I don't believe you need anything quite so heavyweight. The component which is consuming the InputStream to create those parameters in this case is the HiddenHttpMethodFIlter.

You can disable this in application.yml (amend accordingly if yml is not your cup of tea) via the following option:

spring:
  mvc:
    hiddenmethod:
      filter:
        enabled: false

This fixes the problem in newProxyRequestWithEntity() without needing code addition/amendment.

EdgewareRoad commented 5 years ago

For clarity, and I'm not sure how this affects the master branch... possibly just an advisory is needed.. is that all this is actually expected behaviour for a servlet according to spec. See section 3.1.1 of the following link.

https://download.oracle.com/otn-pub/jcp/servlet-3.0-fr-eval-oth-JSpec/servlet-3_0-final-spec.pdf

dsmiley commented 5 years ago

Broadly, there are many Java frameworks that work within the servlet container environment that someone might want to use this servlet filter in. Each framework has it's own mechanisms that might consume the input stream and have their own ways of toggling such behavior so as to not interfere. Perhaps what we need is some documentation categorized by framework.

isarwarfp commented 5 years ago

When request uses application/x-www-form-urlencoded and in your request flow if any filter is calling getInputStream once and later you call servletRequest.getParameterMap() it would be empty. Check in your flow getInputStream is already being called.

MatejLiszkaCGI commented 4 years ago

The situation is simple: OrderedHiddenHttpMethodFilter is not a well written filter! No discussions. I was surprised that such a code could be released when I made debugging to discover what happens. Any code that touches the state of the input stream in a servlet filter, MUST implement an adjustment by supplying a substitution of the stream in the request wrapper. This was NOT problem in this proxy servlet (not saying that it is perfect but it is perfectly usable, mainly because all important methods are protected).

MatejLiszkaCGI commented 4 years ago

You can turn off the buggy filter by setting spring.mvc.hiddenmethod.filter.enabled=false since Spring Boot 2.2.2. This option is however not available in older releases of Spring Boot.

packageOk commented 1 year ago

I meet same problem with spring boot. Spring boot autoconfigure FormContentFilter read DELETE method inputstream. I solve it by adding config spring.mvc.formcontent.filter.enabled=false. Hope can help anyone