LLMHarmfulContentDetector Failure/Crash

Issue Type

Bug

Source

source

Giskard Library Version

2.14.4

OS Platform and Distribution

Mac Command line, python venv

Python version

3.11.9

Installed python packages

aiohappyeyeballs==2.4.0
aiohttp==3.10.5
aiosignal==1.3.1
annotated-types==0.7.0
anyio==4.4.0
attrs==24.2.0
bert-score==0.3.13
bokeh==3.5.2
cachetools==5.5.0
certifi==2024.7.4
chardet==5.2.0
charset-normalizer==3.3.2
click==8.1.7
cloudpickle==3.0.0
colorama==0.4.6
contourpy==1.2.1
cycler==0.12.1
databricks-sdk==0.30.0
datasets==2.21.0
Deprecated==1.2.14
dill==0.3.8
distro==1.9.0
docopt==0.6.2
entrypoints==0.4
evaluate==0.4.2
faiss-cpu==1.8.0
filelock==3.15.4
fonttools==4.53.1
frozenlist==1.4.1
fsspec==2024.6.1
giskard==2.14.4
gitdb==4.0.11
GitPython==3.1.43
google-auth==2.34.0
griffe==0.48.0
h11==0.14.0
httpcore==1.0.5
httpx==0.27.0
huggingface-hub==0.24.6
idna==3.8
importlib_metadata==7.2.1
Jinja2==3.1.4
jiter==0.5.0
joblib==1.4.2
kiwisolver==1.4.5
langdetect==1.0.9
llvmlite==0.43.0
Markdown==3.7
MarkupSafe==2.1.5
matplotlib==3.9.2
mixpanel==4.10.1
mlflow-skinny==2.15.1
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.16
networkx==3.3
num2words==0.5.13
numba==0.60.0
numpy==1.26.4
openai==1.42.0
opentelemetry-api==1.26.0
opentelemetry-sdk==1.26.0
opentelemetry-semantic-conventions==0.47b0
packaging==24.1
pandas==2.2.2
pillow==10.4.0
pip==24.2
protobuf==5.27.3
pyarrow==17.0.0
pyasn1==0.6.0
pyasn1_modules==0.4.0
pydantic==2.8.2
pydantic_core==2.20.1
pynndescent==0.5.13
pyparsing==3.1.2
python-dateutil==2.9.0.post0
pytz==2024.1
PyYAML==6.0.2
regex==2024.7.24
requests==2.32.3
requests-toolbelt==1.0.0
rsa==4.9
safetensors==0.4.4
scikit-learn==1.5.1
scipy==1.11.4
sentry-sdk==2.13.0
setuptools==72.1.0
six==1.16.0
smmap==5.0.1
sniffio==1.3.1
sqlparse==0.5.1
sympy==1.13.2
tenacity==9.0.0
threadpoolctl==3.5.0
tokenizers==0.19.1
torch==2.4.0
tornado==6.4.1
tqdm==4.66.5
transformers==4.44.2
typing_extensions==4.12.2
tzdata==2024.1
umap-learn==0.5.6
urllib3==2.2.2
wheel==0.43.0
wrapt==1.16.0
xxhash==3.5.0
xyzservices==2024.6.0
yarl==1.9.4
zipp==3.20.0
zstandard==0.23.0

Current Behaviour?

Testing the sample code against a Azure OpenAI gpt-3.5 instance - I get this error, after a handful of tests seem to successfully run:

2024-09-18 11:35:09,227 pid:4610 MainThread giskard.scanner.logger ERROR    Detector LLMHarmfulContentDetector failed with error: the JSON object must be str, bytes or bytearray, not NoneType
Traceback (most recent call last):
  File "/opt/anaconda3/envs/giskard-env/lib/python3.11/site-packages/giskard/scanner/scanner.py", line 162, in _run_detectors
    detected_issues = detector.run(model, dataset, features=features)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/giskard-env/lib/python3.11/site-packages/giskard/scanner/llm/base.py", line 77, in run
    eval_dataset = dg.generate_dataset(model, self.num_samples)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/giskard-env/lib/python3.11/site-packages/giskard/llm/generators/base.py", line 72, in generate_dataset
    generated = self._parse_output(out)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/giskard-env/lib/python3.11/site-packages/giskard/llm/generators/base.py", line 85, in _parse_output
    data = json.loads(raw_output.content)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/giskard-env/lib/python3.11/json/__init__.py", line 339, in loads
    raise TypeError(f'the JSON object must be str, bytes or bytearray, '
TypeError: the JSON object must be str, bytes or bytearray, not NoneType
Traceback (most recent call last):
  File "/Users/labuser/giskard_azure_openai.py", line 82, in <module>
    scan_results = giskard.scan(giskard_model, raise_exceptions=True)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/giskard-env/lib/python3.11/site-packages/giskard/scanner/__init__.py", line 67, in scan
    return scanner.analyze(
           ^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/giskard-env/lib/python3.11/site-packages/giskard/scanner/scanner.py", line 126, in analyze
    issues, errors = self._run_detectors(
                     ^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/giskard-env/lib/python3.11/site-packages/giskard/scanner/scanner.py", line 176, in _run_detectors
    raise err
  File "/opt/anaconda3/envs/giskard-env/lib/python3.11/site-packages/giskard/scanner/scanner.py", line 162, in _run_detectors
    detected_issues = detector.run(model, dataset, features=features)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/giskard-env/lib/python3.11/site-packages/giskard/scanner/llm/base.py", line 77, in run
    eval_dataset = dg.generate_dataset(model, self.num_samples)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/giskard-env/lib/python3.11/site-packages/giskard/llm/generators/base.py", line 72, in generate_dataset
    generated = self._parse_output(out)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/giskard-env/lib/python3.11/site-packages/giskard/llm/generators/base.py", line 85, in _parse_output
    data = json.loads(raw_output.content)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/giskard-env/lib/python3.11/json/__init__.py", line 339, in loads
    raise TypeError(f'the JSON object must be str, bytes or bytearray, '
TypeError: the JSON object must be str, bytes or bytearray, not NoneType

Standalone code OR list down the steps to reproduce the issue

I'm trying to do the stand-alone LLM wrapping based on this doc: https://docs.giskard.ai/en/stable/open_source/scan/scan_llm/index.html

and I wrote my own llm_api() since it appears thats not included in the sample/doc, which simply does a POST to the Azure OpenAI endpoint passing the "question" properly to the chat endpoint.

Relevant log output

I'm unsure where the log files are?

Giskard-AI / giskard

LLMHarmfulContentDetector Failure/Crash #2027

Issue Type

Source

Giskard Library Version

OS Platform and Distribution

Python version

Installed python packages

Current Behaviour?

Standalone code OR list down the steps to reproduce the issue

Relevant log output