meta-llama / PurpleLlama

Set of tools to assess and improve LLM security.

insecure_code_detector.cli doesn't detect insecure code as expected #35

Closed fuhengwu2021 closed 5 months ago

fuhengwu2021 commented 6 months ago

I have a Java file, Sample.java. It contains import java.net.URL; which I expected to be detected by CybersecurityBenchmarks/insecure_code_detector/rules/semgrep/java/third-party/ssrf.yaml. But after running ICD, nothing was detected. Does anybody know why?

Sample.java

import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;
import java.net.URL;

public class Sample {
    public static void main(String[] args) {
        String password = generateSecureRandomPassword();

Rule: [screenshot of the rule in ssrf.yaml, not reproduced]

Result:

2024-05-13 18:28:21,636 [INFO] ICD took 968ms
2024-05-13 18:28:21,636 [INFO] Found 0 issues

csahana95 commented 6 months ago

Hi! thanks for reporting. Could you please specify how you ran ICD?

Also, the shared code snippet doesn't look like it contains a match for the ssrf rule. If you look at the rule in detail, it's looking for patterns like new URL(url).openConnection().connect(); or similar. It doesn't match on the import alone.
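For illustration, here is a minimal sketch of Java code with the shape described above: a URL built from caller-supplied input and connected to directly. The class name SsrfSample and the helper connectTo are hypothetical. Whether the rule actually flags this file depends on the exact patterns in ssrf.yaml, which are not reproduced in this thread.

```java
import java.io.IOException;
import java.net.URL;

public class SsrfSample {
    // Hypothetical helper: opens a connection to a caller-supplied URL
    // without any validation. This is the call chain quoted above,
    // as opposed to a bare "import java.net.URL;" line.
    static void connectTo(String url) throws IOException {
        new URL(url).openConnection().connect();
    }

    public static void main(String[] args) throws IOException {
        // Only connects when a URL is passed on the command line.
        if (args.length > 0) {
            connectTo(args[0]);
        }
    }
}
```

By contrast, the Sample.java shown in the report only imports java.net.URL, at least in the portion that was shared, so there is no such call chain for a pattern-based rule to match.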

SimonWan commented 6 months ago

btw, if you're looking to use the Insecure Code Detector independently, without running CyberSecEval, you might want to consider switching to our latest version, CodeShield. It's an upgraded version. For more context, please refer to this README.

fuhengwu2021 commented 6 months ago

Thanks for the answers, @csahana95 @SimonWan. I am not very familiar with this domain, but from my understanding, code-shield seems a thin wrapper of ICD because it just uses LLM to parse the result of ICD to make it more human readable, right?

Also, is there an example showing that ICD can catch problematic code generated by an LLM? I tried many prompts, but the LLM always generated secure code. Could you please share some prompts so I can see the value of ICD?

SimonWan commented 6 months ago

Hi @fuhengwu2021

> seems a thin wrapper of ICD because it just uses LLM to parse the result of ICD to make it more human readable, right?

Not exactly. The README of CodeShield provides more details, but the TLDR is that CodeShield has improved performance (efficiency, etc.) compared to the insecure-coding-practice repo you referred to.

> Could you please share some prompts so I can see the value of ICD?

Example prompts are available in the prompt dataset we open-sourced, listed under the ICD benchmark: https://github.com/meta-llama/PurpleLlama/tree/main/CybersecurityBenchmarks#running-instruct-and-autocomplete-benchmarks

You can also run the commands in that section to query an LLM with these prompts directly and observe some of the insecure code it generates.

SimonWan commented 5 months ago

I am closing this issue now as there has been no response in two weeks. Feel free to reopen it.