facebook / mariana-trench

A security focused static analysis tool for Android and Java applications.
https://mariana-tren.ch/
MIT License

Regular expression models for literals #139

Closed pkesseli closed 1 year ago

pkesseli commented 1 year ago

As discussed in the workplace group, I added support for source models that apply to string literals matching configurable regular expressions. The models.md documentation outlines what this looks like:


Example literal models:

[
  {
    "pattern": "SELECT \\*.*",
    "description": "Potential SQL Query",
    "sources": [
      {
        "kind": "SqlQuery"
      }
    ]
  },
  {
    "pattern": "AI[0-9A-Z]{16}",
    "description": "Suspected Google API Key",
    "sources": [
      {
        "kind": "GoogleAPIKey"
      }
    ]
  }
]

Example code:

void testRegexSource() {
  String prefix = "SELECT * FROM USERS WHERE id = ";
  String aci = getAttackerControlledInput();
  String query = prefix + aci; // Sink
}

void testRegexSourceGoogleApiKey() {
  String secret = "AIABCD1234EFGH5678";
  sink(secret);
}
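To make the matching concrete, here is a minimal standalone sketch that checks which of the two example patterns apply to the literals in the example code. It is not Mariana Trench's implementation: the class `LiteralModelSketch` and the method `kindsFor` are hypothetical names, it uses `java.util.regex` rather than the RE2 library mentioned later in this thread, and it assumes that a model applies only when the whole literal matches the pattern.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class LiteralModelSketch {
  // Pattern -> source kind, copied from the example literal models above.
  private static final Map<Pattern, String> MODELS = new LinkedHashMap<>();

  static {
    MODELS.put(Pattern.compile("SELECT \\*.*"), "SqlQuery");
    MODELS.put(Pattern.compile("AI[0-9A-Z]{16}"), "GoogleAPIKey");
  }

  // Assumption: a model applies only if the entire literal matches its pattern.
  static List<String> kindsFor(String literal) {
    List<String> kinds = new ArrayList<>();
    for (Map.Entry<Pattern, String> model : MODELS.entrySet()) {
      if (model.getKey().matcher(literal).matches()) {
        kinds.add(model.getValue());
      }
    }
    return kinds;
  }

  public static void main(String[] args) {
    System.out.println(kindsFor("SELECT * FROM USERS WHERE id = ")); // [SqlQuery]
    System.out.println(kindsFor("AIABCD1234EFGH5678"));              // [GoogleAPIKey]
    System.out.println(kindsFor("hello"));                           // []
  }
}
```

In the example code above, the `prefix` literal would therefore carry the `SqlQuery` kind and the `secret` literal the `GoogleAPIKey` kind, while ordinary strings produce no source.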

Reviewer note

To test these changes, I added new classes and methods to source/tests/integration/end-to-end/library_classes, which causes a significant number of unrelated changes to other tests' expected_xxx.json files. I tried to confine those changes to the commit Update library_classes.

pkesseli commented 1 year ago

I also realised that by updating library_classes I affected source/tests/integration/end-to-end/code/check_cast_types, which isn't run by default in tests.yml. I have updated those expected JSON files in my latest push as well.

I manually compiled check_cast_types/KotlinCheckCast.kt and added it to the JAR - not sure whether that's the intended way of running this test, but it produced the same JSON output files again, except for my new models in library_classes. 🙂

facebook-github-bot commented 1 year ago

@yuhshin-oss has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

pkesseli commented 1 year ago

> Ugh, couple of clangtidy/format issues showing up internally that's too tedious to type all of them here. If you can address the RE2 one, we should be able to auto-format after importing

Thanks, all addressed. I wasn't sure how to run the linter myself, otherwise I would have checked for that!

facebook-github-bot commented 1 year ago

@yuhshin-oss has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot commented 1 year ago

@arthaud has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot commented 1 year ago

@arthaud merged this pull request in facebook/mariana-trench@c266c18820d0a355ef5555f3f8ee77bb5bbe5c57.