swimos / swim

Full stack application platform for building stateful microservices, streaming APIs, and real-time UIs
https://www.swimos.org
Apache License 2.0

Header parsing error #90

Open DobromirM opened 2 years ago

DobromirM commented 2 years ago

From the logs of one of our live apps, captured when a web crawler hit it:

error: expected carriage return, but found ','
  --> :11:20
   |
11 | User-Agent: Expanse, a Palo Alto Networks company, searches across the global IPv4 space multiple times per day to identify customers' presences on the Internet. If you would like to be excluded from our scans, please send IP addresses/domains to: scaninfo@paloaltonetworks.com
   |                    ^
        at swim.codec.Parser.error(Parser.java:236)
        at swim.http.HttpRequestParser.parse(HttpRequestParser.java:174)
        at swim.http.HttpRequestParser.feed(HttpRequestParser.java:53)
        at swim.codec.InputParser.parse(InputParser.java:51)
        at swim.codec.InputParser.feed(InputParser.java:32)
        at swim.codec.Parser.feed(Parser.java:149)
        at swim.codec.Parser.feed(Parser.java:101)
        at swim.io.IpSocketModem.doRead(IpSocketModem.java:165)
        at swim.io.TcpSocket.doRead(TcpSocket.java:230)
        at swim.io.StationTransport.doRead(Station.java:588)
        at swim.io.StationReader.runTask(Station.java:878)
        at swim.concurrent.TheaterTask.run(Theater.java:439)
        at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1395)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182)
        at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165)

DobromirM commented 2 years ago

This is not unique to the User-Agent header; it can happen for any header. Ideally, we need to decide whether the server should return a malformed-request error or discard the invalid headers and process the rest of the request normally.
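
To make the two options concrete, here is a minimal Java sketch of both policies side by side. It does not use the swim.http API. `HeaderPolicySketch`, `Policy`, `isValid`, and `parse` are hypothetical names introduced only for illustration. The "no comma in User-Agent" check is a crude stand-in for a typed header parser that mirrors the logged error, not the actual grammar check.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names exist in swim.http.
final class HeaderPolicySketch {

  // REJECT: fail the whole request. DISCARD: drop the bad header and continue.
  enum Policy { REJECT, DISCARD }

  // Stand-in for a typed header parser (an assumption for this sketch):
  // a line must look like "name: value", and a User-Agent value must not
  // contain ',' (the character the logged parse error stopped on).
  static boolean isValid(String line) {
    int colon = line.indexOf(':');
    if (colon <= 0) {
      return false;
    }
    String name = line.substring(0, colon).trim();
    String value = line.substring(colon + 1).trim();
    if (name.equalsIgnoreCase("User-Agent")) {
      return value.indexOf(',') < 0;
    }
    return true;
  }

  // Returns the accepted headers in arrival order. Under REJECT the first
  // invalid line throws IllegalArgumentException, which a server could map
  // to an error response. Under DISCARD invalid lines are appended to
  // `discarded` so they can be logged, and parsing continues.
  static Map<String, String> parse(List<String> lines, Policy policy,
                                   List<String> discarded) {
    Map<String, String> headers = new LinkedHashMap<>();
    for (String line : lines) {
      if (!isValid(line)) {
        if (policy == Policy.REJECT) {
          throw new IllegalArgumentException("malformed header: " + line);
        }
        discarded.add(line);
        continue;
      }
      int colon = line.indexOf(':');
      headers.put(line.substring(0, colon).trim(),
                  line.substring(colon + 1).trim());
    }
    return headers;
  }
}
```

With `Policy.DISCARD`, a request carrying the crawler's User-Agent line would still be served using its remaining headers. With `Policy.REJECT`, the same request would fail as a whole. Collecting the discarded lines keeps the lenient option observable in the logs instead of failing silently.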