perwendel / spark

A simple expressive web framework for Java. Spark has a Kotlin DSL: https://github.com/perwendel/spark-kotlin
Apache License 2.0

Code to handle non-UTF8 paths never reached #1030

Open stevemcleod opened 6 years ago

stevemcleod commented 6 years ago

We get occasional errors in production as follows:

org.eclipse.jetty.util.NotUtf8Exception: Not valid UTF8! byte C0 in state 0
    at spark.utils.urldecoding.Utf8Appendable.appendByte(Utf8Appendable.java:148)
    at spark.utils.urldecoding.Utf8Appendable.append(Utf8Appendable.java:116)
    at spark.utils.urldecoding.UrlDecode.path(UrlDecode.java:51)
    at spark.utils.urldecoding.UrlDecode.path(UrlDecode.java:26)
    at spark.Request.getParams(Request.java:503)
    at spark.Request.changeMatch(Request.java:123)
    at spark.Access.changeMatch(Access.java:31)
    at spark.http.matching.RequestWrapper.changeMatch(RequestWrapper.java:51)
    at spark.http.matching.Routes.execute(Routes.java:56)
    at spark.http.matching.MatcherFilter.doFilter(MatcherFilter.java:130)
    at spark.embeddedserver.jetty.JettyHandler.doHandle(JettyHandler.java:50)

I think the problem lies in two different implementations of NotUtf8Exception being used:

spark.utils.urldecoding.Utf8Appendable:91 throws an instance of org.eclipse.jetty.util.Utf8Appendable.NotUtf8Exception

But spark.utils.urldecoding.UrlDecode:91 catches spark.utils.urldecoding.Utf8Appendable.NotUtf8Exception

As far as I can tell, spark.utils.urldecoding.Utf8Appendable.NotUtf8Exception is never instantiated at all in the Spark code base.

As a result, the fallback decodeISO88591Path method in spark.utils.urldecoding.UrlDecode for handling ISO-8859-1 paths is never used.
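To illustrate the mismatch I suspect, here is a minimal sketch that is not taken from the Spark or Jetty sources. The nested classes `JettyLike` and `SparkLike` and the methods `strictDecode`, `decodeWithFallback`, and `describe` are hypothetical stand-ins. The only point is that two exception classes with the same simple name are unrelated types, so a catch clause for one does not catch the other.

```java
public class ExceptionMismatchDemo {

    // Hypothetical stand-in for the exception type that is actually thrown.
    static final class JettyLike {
        static class NotUtf8Exception extends IllegalArgumentException {
            NotUtf8Exception(String reason) {
                super("Not valid UTF8! " + reason);
            }
        }
    }

    // Hypothetical stand-in for the exception type named in the catch clause.
    static final class SparkLike {
        static class NotUtf8Exception extends IllegalArgumentException {
            NotUtf8Exception(String reason) {
                super("Not valid UTF8! " + reason);
            }
        }
    }

    // Simulates a strict UTF-8 decoder that rejects its input.
    static String strictDecode(String path) {
        throw new JettyLike.NotUtf8Exception("byte C0 in state 0");
    }

    // Intended behaviour: fall back to another decoding on bad UTF-8.
    // The catch clause names the wrong type, so the fallback is never taken.
    static String decodeWithFallback(String path) {
        try {
            return strictDecode(path);
        } catch (SparkLike.NotUtf8Exception e) {
            return "fallback:" + path;
        }
    }

    // Reports whether the fallback ran or the exception escaped.
    static String describe(String path) {
        try {
            return decodeWithFallback(path);
        } catch (IllegalArgumentException e) {
            return "escaped: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("/%C0"));
    }
}
```

In this sketch the fallback branch is dead code: the exception propagates past `decodeWithFallback` and is only caught by the broader `IllegalArgumentException` handler in `describe`.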

I'm happy to submit a pull request for this myself. But it would be nice if someone with a better understanding of the relevant code can confirm whether my analysis is correct.

stevemcleod commented 3 years ago

This is perhaps a non-problem, at least as far as Spark is concerned.

Jetty throws these exceptions in response to invalid URIs. I think what I'm seeing is Jetty refusing to accept invalid URIs.

An example bad URI is example.com/?%%0
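As a rough check that `%%0` is not a valid percent-encoded sequence, the sketch below feeds it to the JDK's `java.net.URLDecoder`. This is an assumption-laden stand-in: `URLDecoder` targets form encoding and is not the decoder Jetty or Spark uses, and the class name `MalformedEscapeDemo` and method `isDecodable` are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class MalformedEscapeDemo {

    // Returns true if the JDK decoder accepts the percent-encoded input.
    static boolean isDecodable(String encoded) {
        try {
            URLDecoder.decode(encoded, "UTF-8");
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown by URLDecoder for malformed percent escapes.
            return false;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always supported, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isDecodable("%%0"));   // malformed escape
        System.out.println(isDecodable("a%20b")); // well-formed escape
    }
}
```

The JDK decoder rejects `%%0` because the characters after the first `%` are not two hex digits, which suggests the string is malformed regardless of which decoder sees it.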

We see a lot of these exceptions in our logs. It could be that these bad URIs are in the arsenal of "script kiddie" tools.

I think this could be closed.