False positive: HTTP response splitting

JLLeitschuh commented 11 months ago

Description of the false positive

When HttpServletResponse.setHeader is passed untrusted user input, CodeQL always flags this as HTTP Response splitting. However, as far as I can tell, most of the popular servlet containers, like Jetty and Tomcat, both protect against this attack in their implementation. As such, quite a few of these alerts are false positives.

I'm not confident there are any HttpServletResponse implementations out there that don't currently guard against HTTP Response Splitting.

I think this query needs to be readdressed in the current ecosystem to ensure that it's actually valid in the current state of the world.

If it remains valid, then the list of servlet containers that are still vulnerable should be documented in the CodeQL alert. Additionally, I'd encourage the GitHub security lab team to engage in outreach to report to any servlet containers that are still not adequately protecting against this attack and get CVE's where appropriate.

### Tasks
- [ ] Determine which servlet containers are still vulnerable to HTTP Response Splitting
- [ ] List vulnerable servlet containers in the CodeQL documentation
- [ ] Engage in vulnerablilty reporting to remaining vulnerable OSS servlet containers
- [ ] Add a sanitizer for CRLF injection when the user-input flows throguh a `new File` constructor

JLLeitschuh commented 11 months ago

Example of FP. This project uses Jetty by default, but can be used with alternative servlet handlers if desired.

https://github.com/apache/hop/blob/ab25f4419f058caed9f9c3d48e910f4c5068929b/rap/src/main/java/org/apache/hop/ui/hopgui/DownloadServiceHandler.java#L57-L57

JLLeitschuh commented 11 months ago

Also, I'm fairly confident that java won't let you CRLF inject into a new File, so this is a further FP for that reason:

https://github.com/apache/hop/blob/ab25f4419f058caed9f9c3d48e910f4c5068929b/rap/src/main/java/org/apache/hop/ui/hopgui/DownloadServiceHandler.java#L56C1-L56C1

aibaars commented 11 months ago

Thanks for reporting! You're right that this is much less dangerous then it used to be now most servlet containers have builtin protection against response splitting. I'll ask the team to evaluate the query.

Personally, I would still validate user input before putting it into an HTTP header, and only rely on the servlet container as a last line of defence. Even though it might no longer be response splitting, It could still cause a XSS or other vulnerability. And, you never know, someone might run the code with an ancient version of Tomcat.

Also, I'm fairly confident that java won't let you CRLF inject into a new File, so this is a further FP for that reason:

As far as I know CR and LF are perfectly valid characters in filenames on Linux.

JLLeitschuh commented 11 months ago

Personally, I would still validate user input before putting it into an HTTP header, and only rely on the servlet container as a last line of defence.

Wouldn't that code be better suited being centralized on a custom implementation of a

And, you never know, someone might run the code with an ancient version of Tomcat

While I agree with this sentiment, this seems like it's counter to the current mission of CodeQL today. To quote from @xcorail:

Only submissions with an acceptable false positive rate are accepted, and only the ones with a very good precision – annotated with a high or very-high precision metadata – are eventually promoted to the standard queries set, the one that runs by default with GitHub code scanning and publish alerts within the pull requests. - https://github.blog/2022-01-05-how-the-community-powers-github-advanced-security-with-codeql-queries/

If this is only left in for very old servlets, this is likely a very high false positive query. In that light, does this query continue to clear the bar that would support it being included in the default query set under todays acceptance criteria?

If the answer to this remains "yes", then it would be very helpful if the query listed the conditions under which this vulnerability remained accurate, for example, listing the versions of Tomcat, or other servlet containers that are still vulnerable.

As far as I know CR and LF are perfectly valid characters in filenames on Linux.

Did not know that, will explore! Much appreciated!

xcorail commented 11 months ago

it's counter to the current mission of CodeQL

Not really. You can use CodeQL 2 ways:

Standard set into the developers CI (this is the usage I was referring to in the quote above)
Extended set / Custom queries that are used by security researchers or AppSec teams to find issues. In the case raised by Arthur (older version of Tomcat), a security team could be interested by this query

JLLeitschuh commented 11 months ago

Does this query fall into the extended set? Or the default set?

github / codeql

False positive: HTTP response splitting #15056