There are some open-source language-specific scanners (like bandit for Python) that have pretty good results. We could possibly come up with a system that aggregates output from GPT models and these language-specific scanners for better results.
Example of bandit output on an application I tested:
--------------------------------------------------
>> Issue: [B410:blacklist] Using etree to parse untrusted XML data is known to be vulnerable to XML attacks. Replace etree with the equivalent defusedxml package.
Severity: Low Confidence: High
CWE: CWE-20 (https://cwe.mitre.org/data/definitions/20.html)
More Info: https://bandit.readthedocs.io/en/1.7.5/blacklists/blacklist_imports.html#b410-import-lxml
Location: ./anorak/ing_connector/tests/services/test_xsd_validation.py:5:0
4 from assertpy import assert_that
5 from lxml import etree
6
--------------------------------------------------
>> Issue: [B320:blacklist] Using lxml.etree.parse to parse untrusted XML data is known to be vulnerable to XML attacks. Replace lxml.etree.parse with its defusedxml equivalent function.
Severity: Medium Confidence: High
CWE: CWE-20 (https://cwe.mitre.org/data/definitions/20.html)
More Info: https://bandit.readthedocs.io/en/1.7.5/blacklists/blacklist_calls.html#b313-b320-xml-bad-etree
Location: ./anorak/ing_connector/tests/services/test_xsd_validation.py:12:18
11 def test_dict_validation_for_employee_data(self, *_):
12 xml_doc = etree.parse(
13 os.path.join(
14 os.path.dirname(os.path.abspath(__file__)),
15 "../__fixtures/",
16 "xml-sync-test.xml",
17 ).replace("services/attentia/", "")
18 )
19 assert_that(
--------------------------------------------------
>> Issue: [B105:hardcoded_password_string] Possible hardcoded password: 'p@$$123'
Severity: Low Confidence: Medium
CWE: CWE-259 (https://cwe.mitre.org/data/definitions/259.html)
More Info: https://bandit.readthedocs.io/en/1.7.5/plugins/b105_hardcoded_password_string.html
Location: ./anorak/internal_api/tests/views/auth/test_internal_authentication.py:21:35
20 def setUpTestData(cls):
21 cls.alice_clear_password = "p@$$123"
22 cls.alice = fix.create_user(email="alice@foo.bar", password=make_password(cls.alice_clear_password))
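What the B313-B320 family is warning about, as an added example that is not code from the scanned application or from bandit: an XML document can declare entities that expand to far more data than the document itself contains (the "billion laughs" pattern), and a parser that resolves entities does that expansion on behalf of whoever sent the document. lxml's default parser has resolve_entities=True (and, unlike the standard library, will also resolve external entities from the local filesystem), which is why bandit points at etree.parse. The example below uses the standard library's ElementTree, which expands internal entities the same way, so it runs without lxml; the DOCTYPE guard is a simplified stand-in for what defusedxml does (it forbids DTDs and entity declarations through parser callbacks rather than by text matching). The sample documents are constructed for the example. One caveat worth keeping in mind when reading the report above: all three hits are in tests/, and the flagged etree.parse reads a local fixture file, so in that specific call the input is not actually untrusted.

```python
# Added example, not code from the scanned application or from bandit.
# Shows the failure mode behind bandit's B313-B320 findings (entity expansion
# in untrusted XML) and a hardened parse that refuses any DTD up front.
import re
import xml.etree.ElementTree as ET

# Constructed sample: three levels of entity nesting, 10x at each level, so the
# 3-byte entity "lol" becomes 3000 bytes of text once parsed.  The classic
# attack uses ten levels and turns a few hundred bytes into gigabytes.
NESTED_ENTITY_DOC = """<?xml version="1.0"?>
<!DOCTYPE bomb [
  <!ENTITY lol "lol">
  <!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
  <!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
  <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
]>
<bomb>&lol3;</bomb>"""

# Constructed sample of the kind of document the code actually expects.
SAFE_DOC = '<?xml version="1.0"?><employee><name>alice</name></employee>'


class UnsafeXML(ValueError):
    """Raised when an untrusted document tries to use DTD features."""


def parse_untrusted_naively(xml_text: str) -> ET.Element:
    """The pattern bandit flags: hand whatever arrived straight to the parser."""
    return ET.fromstring(xml_text)


def expanded_text_length(xml_text: str) -> int:
    """How much character data the naive parse produced for the root element."""
    root = parse_untrusted_naively(xml_text)
    return len(root.text or "")


_DOCTYPE_RE = re.compile(r"<!DOCTYPE", re.IGNORECASE)


def parse_untrusted_hardened(xml_text: str) -> ET.Element:
    """Reject any DTD before parsing.  Entities can only be declared inside a
    DOCTYPE, so refusing documents that carry one removes entity expansion,
    external entity (XXE) resolution and external DTD fetches in one check."""
    if _DOCTYPE_RE.search(xml_text):
        raise UnsafeXML("DOCTYPE/DTD is not accepted in untrusted XML")
    return ET.fromstring(xml_text)


def is_rejected(xml_text: str) -> bool:
    """True if the hardened parser refuses the document."""
    try:
        parse_untrusted_hardened(xml_text)
    except UnsafeXML:
        return True
    return False


if __name__ == "__main__":
    print("naive parse produced", expanded_text_length(NESTED_ENTITY_DOC),
          "characters of text from a", len(NESTED_ENTITY_DOC), "character document")
    print("hardened parser rejects it:", is_rejected(NESTED_ENTITY_DOC))
    print("hardened parser accepts the safe document, name =",
          parse_untrusted_hardened(SAFE_DOC).find("name").text)
```

If the code has to stay on lxml, the equivalent knob is etree.XMLParser(resolve_entities=False) passed to etree.parse; defusedxml is still the simpler answer because it also rejects DTDs and external references by default rather than relying on every call site remembering the parser argument.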
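One way to do the aggregation proposed at the top, as an added example and not the author's code: normalise bandit's `bandit -f json` report and a second review source into one Finding record, merge hits that point at the same file, line and CWE (keeping every source name and the highest severity), and push anything living under tests/ to the bottom, since all three hits in the report above are in test files. The bandit field names are the ones bandit 1.7's JSON formatter emits; the LLM-review record format, the "llm" source name and the non-overlapping LLM entry in example_app/views.py are assumptions constructed for this example.

```python
# Added example, not the author's code: merge bandit JSON results with a second
# (assumed-format) review source into one deduplicated finding list.
import json
from dataclasses import dataclass, field, replace
from typing import List, Optional

SEVERITY_RANK = {"LOW": 1, "MEDIUM": 2, "HIGH": 3}

# Constructed sample: the B320 and B105 results from the report above, in the
# shape bandit 1.7's JSON formatter produces (some fields omitted for brevity).
BANDIT_JSON = json.dumps({"errors": [], "results": [
    {"filename": "./anorak/ing_connector/tests/services/test_xsd_validation.py",
     "line_number": 12, "col_offset": 18, "test_id": "B320", "test_name": "blacklist",
     "issue_severity": "MEDIUM", "issue_confidence": "HIGH",
     "issue_cwe": {"id": 20, "link": "https://cwe.mitre.org/data/definitions/20.html"},
     "issue_text": "Using lxml.etree.parse to parse untrusted XML data is known to be vulnerable to XML attacks."},
    {"filename": "./anorak/internal_api/tests/views/auth/test_internal_authentication.py",
     "line_number": 21, "col_offset": 35, "test_id": "B105", "test_name": "hardcoded_password_string",
     "issue_severity": "LOW", "issue_confidence": "MEDIUM",
     "issue_cwe": {"id": 259, "link": "https://cwe.mitre.org/data/definitions/259.html"},
     "issue_text": "Possible hardcoded password: 'p@$$123'"},
]})

# Constructed sample in an assumed format for LLM review output: the first entry
# overlaps bandit's B320; the second is a made-up entry in a made-up file so the
# merge has one non-overlapping, non-test record to keep.
LLM_FINDINGS = [
    {"file": "anorak/ing_connector/tests/services/test_xsd_validation.py", "line": 12,
     "cwe": 20, "severity": "high", "summary": "XML parsed with entity resolution enabled"},
    {"file": "example_app/views.py", "line": 40,
     "cwe": 20, "severity": "medium", "summary": "constructed finding for the merge demo"},
]


@dataclass
class Finding:
    path: str
    line: int
    cwe: Optional[int]
    severity: str                   # LOW / MEDIUM / HIGH
    message: str
    sources: List[str] = field(default_factory=list)
    in_test_code: bool = False      # every hit in the report above is under tests/

    @property
    def key(self):
        return (self.path, self.line, self.cwe)


def _norm_path(path: str) -> str:
    return path[2:] if path.startswith("./") else path


def _is_test_path(path: str) -> bool:
    parts = path.split("/")
    return "tests" in parts or parts[-1].startswith("test_")


def from_bandit(report_json: str) -> List[Finding]:
    out = []
    for r in json.loads(report_json)["results"]:
        path = _norm_path(r["filename"])
        out.append(Finding(path, r["line_number"], (r.get("issue_cwe") or {}).get("id"),
                           r["issue_severity"].upper(), r["issue_text"],
                           [f"bandit/{r['test_id']}"], _is_test_path(path)))
    return out


def from_llm(records) -> List[Finding]:
    out = []
    for x in records:
        path = _norm_path(x["file"])
        out.append(Finding(path, x["line"], x.get("cwe"), x["severity"].upper(),
                           x["summary"], ["llm"], _is_test_path(path)))
    return out


def merge(*sources: List[Finding]) -> List[Finding]:
    """Union the sources; a hit at the same (file, line, CWE) keeps every source
    name and the highest severity.  Production code sorts first, test code last."""
    merged = {}
    for src in sources:
        for f in src:
            if f.key not in merged:
                merged[f.key] = replace(f, sources=list(f.sources))  # copy, don't alias
                continue
            m = merged[f.key]
            m.sources.extend(f.sources)
            if SEVERITY_RANK[f.severity] > SEVERITY_RANK[m.severity]:
                m.severity = f.severity
    return sorted(merged.values(),
                  key=lambda f: (f.in_test_code, -SEVERITY_RANK[f.severity], f.path, f.line))


def aggregate() -> List[Finding]:
    return merge(from_bandit(BANDIT_JSON), from_llm(LLM_FINDINGS))


if __name__ == "__main__":
    for f in aggregate():
        flag = " (test code)" if f.in_test_code else ""
        print(f"{f.severity:6} {f.path}:{f.line} CWE-{f.cwe} [{', '.join(f.sources)}]{flag}")
```

Agreement between two independent sources on the same location is the useful signal here: it raises confidence in the merged B320 record, while single-source hits inside tests/ (like the fixture password in B105) can be triaged last rather than filtered out with bandit's -x/--exclude and forgotten.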