NVIDIA / garak

the LLM vulnerability scanner
https://discord.gg/uVch4puUCs
Apache License 2.0
2.9k stars 247 forks

bug: unhappy torch detectors should fail gracefully #482

Open leondz opened 9 months ago

leondz commented 9 months ago

options:

🕵️  queue of probes: lmrc.Bullying
/usr/local/lib/python3.10/dist-packages/torch/_utils.py:836: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
Traceback (most recent call last):
  File "/xxx/garak/garak/detectors/base.py", line 85, in detect
    detector_raw_results = self.detector(
  File "/usr/local/lib/python3.10/dist-packages/transformers/pipelines/text_classification.py", line 156, in __call__
    result = super().__call__(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/pipelines/base.py", line 1143, in __call__
    outputs = list(final_iterator)
  File "/usr/local/lib/python3.10/dist-packages/transformers/pipelines/pt_utils.py", line 124, in __next__
    item = next(self.iterator)
  File "/usr/local/lib/python3.10/dist-packages/transformers/pipelines/pt_utils.py", line 124, in __next__
    item = next(self.iterator)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 674, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.10/dist-packages/transformers/pipelines/pt_utils.py", line 19, in __getitem__
    processed = self.process(item, **self.params)
  File "/usr/local/lib/python3.10/dist-packages/transformers/pipelines/text_classification.py", line 176, in preprocess
    raise ValueError(
ValueError: The pipeline received invalid inputs, if you are trying to send text pairs, you can try to send a dictionary `{"text": "My text", "text_pair": "My pair"}` in order to send a text pair.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/xxxxx/garak/garak/__main__.py", line 13, in <module>
    main()
  File "/xxxxx/garak/garak/__main__.py", line 9, in main
    cli.main(sys.argv[1:])
  File "/xxxxx/garak/garak/cli.py", line 477, in main
    command.probewise_run(generator, probe_names, evaluator, buffs)
  File "/xxxxx/garak/garak/command.py", line 216, in probewise_run
    probewise_h.run(generator, probe_names, evaluator, buffs)
  File "/xxxxx/garak/garak/harnesses/probewise.py", line 108, in run
    h.run(model, [probe], detectors, evaluator, announce_probe=False)
  File "/xxxxx/garak/garak/harnesses/base.py", line 103, in run
    attempt.detector_results[detector_probe_name] = d.detect(attempt)
  File "/xxxxx/garak/garak/detectors/base.py", line 93, in detect
    raise Exception() from e
Exception
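
For illustration only, here is a minimal sketch of what "failing gracefully" could look like for the traceback above. It is not garak's actual code: `detect_gracefully` and its `classifier` parameter are hypothetical names, and the sketch assumes (the thread does not confirm this) that the pipeline rejects some individual outputs, for example ones that are not plain strings. Instead of re-raising a bare `Exception`, it logs the problem and records `None` for that output so the rest of the run can continue.

```python
import logging
from typing import Callable, List, Optional

logger = logging.getLogger(__name__)


def detect_gracefully(
    classifier: Callable[[List[str]], List[dict]],
    outputs: List[Optional[str]],
) -> List[Optional[float]]:
    """Score each output separately; return None where scoring is not possible.

    `classifier` is a hypothetical stand-in for a text-classification
    callable that takes a list of strings and returns a list of dicts
    with a "score" key.
    """
    results: List[Optional[float]] = []
    for text in outputs:
        # Skip inputs the classifier cannot handle rather than passing them on.
        if not isinstance(text, str):
            logger.warning("skipping non-string output: %r", type(text))
            results.append(None)
            continue
        try:
            raw = classifier([text])
            results.append(float(raw[0]["score"]))
        except Exception as e:  # broad on purpose: one bad output should not end the run
            logger.warning("detector failed on one output: %r", e)
            results.append(None)
    return results
```

Scoring outputs one at a time costs batching throughput, but it confines a failure to the single output that caused it. Callers would need to treat `None` as "no result" rather than as a score.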

leondz commented 2 months ago

short proposal: