github / codeql

CodeQL: the libraries and queries that power security researchers around the world, as well as code scanning in GitHub Advanced Security
https://codeql.github.com
MIT License
7.58k stars 1.52k forks source link

Ruby: support sprintf formatted string with modulo operator #15945

Open MaxSchlueter opened 6 months ago

MaxSchlueter commented 6 months ago

I noticed that dataflow in Ruby isn't propagated to Kernel.sprintf formatted strings, e.g. the stored xss query should flag this code in an ERB template:

<%# BAD: Kernel.sprintf modulo operator syntax %>
<%= "Welcome %{user}".html_safe % { user: @user.handle } %>

The string literal is parsed as a a single Ast::StringTextComponent, where it should probably also contain a Ast::StringInterpolationComponent. I tried to work around this problem using an additional taint step:

  predicate isAdditionalSprintfTaintStep(DataFlow::Node node1, DataFlow::Node node2) {
    exists(ModuloExpr expr, HashLiteral hash, StringLiteral str |
      hash.getParent*() = expr.getRightOperand() and
      str.getParent*() = expr.getLeftOperand() and
      hash.getAKeyValuePair().getValue() = node1.asExpr().getExpr() and 
      str = node2.asExpr().getExpr()
    )
  }

which works for the code snippet above, but doesn't work when the dataflow gets a bit more complex:

<% sink = "Welcome %{user}".html_safe %>
<%= sink % { user: @user.handle } %>

I tried the following, but it doesn't work:

  predicate isAdditionalSprintfTaintStep(DataFlow::Node node1, DataFlow::Node node2) {
    exists(ModuloExpr expr, HashLiteral hash, StringLiteral str |
      DataFlow::localExprFlow(hash.getAControlFlowNode(), expr.getRightOperand().getAControlFlowNode()) and
      DataFlow::localExprFlow(str.getAControlFlowNode(), expr.getLeftOperand().getAControlFlowNode()) and
      hash.getAKeyValuePair().getValue() = node1.asExpr().getExpr() and 
      str = node2.asExpr().getExpr()
    )
  }

How do I catch the insecure code snippet above using local dataflow?

aibaars commented 6 months ago

Thanks for reporting this. @alexrford are we missing a model for % and/or sprintf?

alexrford commented 6 months ago

Hi, apologies for the late response. We have modeling for String#% in StringFormatters.qll, the issue here is with the XSS query expecting the taint step to appear before the html_safe call in the dataflow path. In the case of e.g. "Welcome %{user}".html_safe % { user: @user.handle }, the string "Welcome %{user}" is marked as HTML safe before the string is tainted, which the XSS queries currently miss.

I have an improvement for the XSS queries in mind that will address this, but it needs some more testing before it's ready