leondz / garak

the LLM vulnerability scanner
https://discord.gg/uVch4puUCs
Apache License 2.0

checking out branch w/ new plugin class updated skips cache rebuild, making class inaccessible #853

Open leondz opened 2 months ago

leondz commented 2 months ago

I followed these steps with a local garak repo:

  1. git pull
  2. gh pr checkout 851 (a PR involving a new plugin class in an existing plugin module)
  3. python -m garak -m test -p packagehallucination.Ruby

This broke:

(garak) 15:31:47 x1:~/dev/garak [feature/ruby_package_hallucination] $ python -m garak -m test -p packagehallucination.Ruby
garak LLM vulnerability scanner v0.9.0.14.post1 ( https://github.com/leondz/garak ) at 2024-08-26T15:32:05.845081
📜 logging to /Users/lderczynski/.local/share/garak/garak.log
❌Unknown probes❌: packagehallucination.Ruby

Despite the probe being in place:

(garak) 15:33:36 x1:~/dev/garak [feature/ruby_package_hallucination] $ tail -22 garak/probes/packagehallucination.py 
class Ruby(Probe):
    """Attempts to generate Ruby code including hallucinated packages"""

    bcp47 = "en"
    goal = "generate ruby code importing non-existent gems"
    doc_uri = "https://vulcan.io/blog/ai-hallucinations-package-risk"
    tags = [
        "owasp:llm09",
        "owasp:llm02",
        "quality:Robustness:GenerativeMisinformation",
        "payload:malicious:badcode",
    ]
    primary_detector = "packagehallucination.RubyGems"

    def __init__(self, config_root=_config):
        super().__init__(config_root=config_root)
        self.prompts = []
        for stub_prompt in stub_prompts:
            for code_task in code_tasks:
                self.prompts.append(
                    stub_prompt.replace("<language>", "Ruby") + " " + code_task
                )

mod_time and user_time are identical for this module in _plugins._valid_loaded_cache().

Looking deeper, the mtime for both my package cache and for the plugin's module file are the same:

(garak) 15:27:29 x1:~/dev/garak [feature/ruby_package_hallucination] $ ll garak/probes/packagehallucination.py 
-rw-r--r--@ 1 lderczynski  staff  3192 Aug 26 14:59 garak/probes/packagehallucination.py

(garak) 15:26:36 x1:~/dev/garak [feature/ruby_package_hallucination] $ ll ~/.cache/garak/resources/
total 320
-rw-r--r--@ 1 lderczynski  staff  163225 Aug 26 14:59 plugin_cache.json      

This skips a check because (I think) (a) all the files are in place, and (b) if base_time > user_time: on _plugins.py L77 does not evaluate to True, so the file date check doesn't fire either. That happens even though the cache is out of date and has no entry for the new probe; the existing entry for the same module looks like this:
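The equal-timestamp case can be reproduced in isolation. This is a minimal sketch, not garak's actual _plugins.py code: the function name cache_is_stale and the two temporary files are hypothetical stand-ins for the cache validity check, the plugin module, and plugin_cache.json. It assumes the check is a strict "newer than" comparison, as the condition quoted above suggests.

```python
import os
import tempfile


def cache_is_stale(module_path: str, cache_path: str) -> bool:
    """Hypothetical stand-in for an mtime-based staleness check.

    Returns True only when the module file is strictly newer than the cache.
    """
    module_time = os.path.getmtime(module_path)
    cache_time = os.path.getmtime(cache_path)
    return module_time > cache_time


with tempfile.TemporaryDirectory() as tmp:
    module_path = os.path.join(tmp, "packagehallucination.py")
    cache_path = os.path.join(tmp, "plugin_cache.json")

    # The cache was built before the Ruby class existed.
    with open(cache_path, "w") as f:
        f.write('{"probes.packagehallucination.Python": {}}')
    # The checkout then rewrites the module with a new class in it.
    with open(module_path, "w") as f:
        f.write("class Python: ...\nclass Ruby: ...\n")

    # Force both files to carry the same timestamp.
    same_time = 1724684340  # arbitrary epoch seconds
    os.utime(cache_path, (same_time, same_time))
    os.utime(module_path, (same_time, same_time))
    equal_mtime_stale = cache_is_stale(module_path, cache_path)

    # One second later, the same check does detect the change.
    os.utime(module_path, (same_time + 1, same_time + 1))
    newer_mtime_stale = cache_is_stale(module_path, cache_path)

print(equal_mtime_stale, newer_mtime_stale)
```

With equal timestamps the strict comparison reports the cache as fresh, so nothing triggers a rebuild.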

...
"probes.packagehallucination.Python": {
      "description": "Attempts to generate Python3 code including hallucinated packages",
      "DEFAULT_PARAMS": {},
      "active": true,
      "bcp47": "en",
      "doc_uri": "https://vulcan.io/blog/ai-hallucinations-package-risk",
      "extended_detectors": [],
      "goal": "generate python importing non-existent packages",
      "modality": {
        "in": [
          "text"
        ]
      },
      "parallelisable_attempts": true,
      "primary_detector": "packagehallucination.PythonPypi",
      "recommended_detector": [
        "always.Fail"
      ],
      "tags": [
        "owasp:llm09",
        "owasp:llm02",
        "quality:Robustness:GenerativeMisinformation",
        "payload:malicious:badcode"
      ],
      "mod_time": "2024-06-06 01:44:58 +0000"
    },
...

As a fan of CRCs, ... is that the fix? Do we want to actually check the specified module for the class even if the cache doesn't have it, before declaring a failure?
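Both ideas in the question can be sketched outside garak. The following is an illustration under assumptions, not a proposed patch: file_crc and class_defined_in_module are hypothetical helper names, the cache is modelled as a plain set of class names plus a stored checksum, and zlib.crc32 stands in for whatever checksum would actually be chosen.

```python
import importlib.util
import inspect
import os
import tempfile
import zlib


def file_crc(path: str) -> int:
    """CRC32 of a file's bytes; it tracks content, not filesystem timestamps."""
    with open(path, "rb") as f:
        return zlib.crc32(f.read())


def class_defined_in_module(path: str, class_name: str) -> bool:
    """Load a module from a file path and report whether it defines class_name."""
    spec = importlib.util.spec_from_file_location("candidate_module", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return inspect.isclass(getattr(module, class_name, None))


with tempfile.TemporaryDirectory() as tmp:
    module_path = os.path.join(tmp, "packagehallucination.py")
    pinned_time = 1724684340  # arbitrary epoch seconds

    # State when the (hypothetical) cache entry was written.
    with open(module_path, "w") as f:
        f.write("class Python:\n    pass\n")
    os.utime(module_path, (pinned_time, pinned_time))
    cached_crc = file_crc(module_path)
    cached_classes = {"Python"}

    # The module gains a class, but its mtime ends up unchanged.
    with open(module_path, "w") as f:
        f.write("class Python:\n    pass\n\n\nclass Ruby:\n    pass\n")
    os.utime(module_path, (pinned_time, pinned_time))

    mtime_unchanged = os.path.getmtime(module_path) == pinned_time
    crc_changed = file_crc(module_path) != cached_crc
    ruby_in_cache = "Ruby" in cached_classes
    # Fallback: look in the module itself before reporting an unknown plugin.
    ruby_in_module = class_defined_in_module(module_path, "Ruby")

print(mtime_unchanged, crc_changed, ruby_in_cache, ruby_in_module)
```

The checksum catches the stale entry at validation time, while the fallback lookup only costs an import when a requested class is missing from the cache. Either one alone would have avoided the failure reported here.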

leondz commented 2 months ago

(workaround is to touch the updated file, e.g. touch garak/probes/packagehallucination.py)
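The same workaround can be done from Python with pathlib. This sketch uses temporary stand-in files rather than a real garak checkout, and the file names are only illustrative. It shows that touching an existing file moves its mtime past that of an older cache file.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    module_path = Path(tmp) / "packagehallucination.py"
    cache_path = Path(tmp) / "plugin_cache.json"
    module_path.write_text("class Ruby: ...\n")
    cache_path.write_text("{}")

    # Both files start with the same, older timestamp.
    old_time = 1724684340  # arbitrary epoch seconds in the past
    os.utime(module_path, (old_time, old_time))
    os.utime(cache_path, (old_time, old_time))

    # Equivalent to `touch <file>` on an existing file: sets mtime to now.
    module_path.touch()

    module_newer = module_path.stat().st_mtime > cache_path.stat().st_mtime

print(module_newer)
```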

jmartin-tech commented 2 months ago

This is a quirk of how git handles the checkout process; the suggested workaround is a short-term solution. I will give some thought to how this can be better detected and accounted for.