Giskard-AI / giskard

🐢 Open-Source Evaluation & Testing for ML & LLM systems
https://docs.giskard.ai
Apache License 2.0

[GSK-1384] Internal worker exception during model inspection #1214

Closed: Inokinoki closed this issue 1 year ago

Inokinoki commented 1 year ago

Issue Type

Bug

Source

source

Giskard Library Version

2.0.0b10

Giskard Server Version

2.0.0b10

OS Platform and Distribution

macOS 13.4.1

Python version

3.9.6

Installed python packages

absl-py==1.4.0
aiohttp==3.8.4
aiosignal==1.3.1
alabaster==0.7.13
anyascii==0.3.2
anyio==3.7.0
appnope==0.1.3
argon2-cffi==21.3.0
argon2-cffi-bindings==21.2.0
arrow==1.2.3
astroid==2.15.5
asttokens==2.2.1
astunparse==1.6.3
async-lru==2.0.2
async-timeout==4.0.2
attrs==23.1.0
Babel==2.12.1
backcall==0.2.0
bandit==1.7.5
beautifulsoup4==4.12.2
bert-score==0.3.13
black==23.3.0
bleach==6.0.0
blinker==1.6.2
CacheControl==0.13.1
cachetools==5.3.1
catboost==1.2
certifi==2023.5.7
cffi==1.15.1
cfgv==3.3.1
chardet==5.1.0
charset-normalizer==3.1.0
click==8.1.3
cloudpickle==2.2.1
colorama==0.4.6
comm==0.1.3
contourpy==1.1.0
coverage==7.2.7
cycler==0.11.0
darglint==1.8.1
databricks-cli==0.17.7
dataclasses-json==0.5.8
datasets==2.13.0
debugpy==1.6.7
decorator==5.1.1
defusedxml==0.7.1
deptry==0.11.0
dill==0.3.6
distlib==0.3.6
docker==6.1.3
docutils==0.18.1
dparse==0.6.2
eli5==0.13.0
entrypoints==0.4
evaluate==0.4.0
exceptiongroup==1.1.1
execnet==1.9.0
executing==1.2.0
fastjsonschema==2.17.1
filelock==3.12.2
findpython==0.2.5
flatbuffers==1.12
fonttools==4.40.0
fqdn==1.5.1
frozenlist==1.3.3
fsspec==2023.6.0
furo==2023.5.20
gast==0.4.0
giskard==2.0.0b10
gitdb==4.0.10
GitPython==3.1.31
google-auth==2.20.0
google-auth-oauthlib==0.4.6
google-pasta==0.2.0
googleapis-common-protos==1.59.1
graphviz==0.20.1
grpcio==1.51.1
grpcio-status==1.48.2
grpcio-tools==1.48.2
h5py==3.8.0
httpretty==1.1.4
huggingface-hub==0.15.1
identify==2.5.24
idna==3.4
imagesize==1.4.1
imbalanced-learn==0.10.1
importlib-metadata==6.6.0
importlib-resources==5.12.0
iniconfig==2.0.0
installer==0.7.0
ipykernel==6.23.2
ipython==8.12.2
ipython-genutils==0.2.0
ipywidgets==8.0.6
isoduration==20.11.0
isort==5.12.0
jedi==0.18.2
Jinja2==3.1.2
joblib==1.2.0
json5==0.9.14
jsonpointer==2.4
jsonschema==4.17.3
jupyter==1.0.0
jupyter_client==8.2.0
jupyter-console==6.6.3
jupyter_core==5.3.1
jupyter-events==0.6.3
jupyter-lsp==2.2.0
jupyter_server==2.6.0
jupyter_server_terminals==0.4.4
jupyterlab==4.0.2
jupyterlab-pygments==0.2.2
jupyterlab_server==2.23.0
jupyterlab-widgets==3.0.7
keras==2.9.0
Keras-Preprocessing==1.1.2
kiwisolver==1.4.4
langchain==0.0.202
langchainplus-sdk==0.0.10
langdetect==1.0.9
lazy-object-proxy==1.9.0
libclang==16.0.0
lightgbm==3.3.5
livereload==2.6.3
llvmlite==0.40.1rc1
lockfile==0.12.2
Markdown==3.4.3
markdown-it-py==3.0.0
MarkupSafe==2.1.3
marshmallow==3.19.0
marshmallow-enum==1.5.1
matplotlib==3.7.1
matplotlib-inline==0.1.6
mccabe==0.7.0
mdit-py-plugins==0.4.0
mdurl==0.1.2
mistune==2.0.5
mixpanel==4.10.0
mlflow-skinny==2.4.1
mpmath==1.3.0
msgpack==1.0.5
multidict==6.0.4
multiprocess==0.70.14
mypy==1.3.0
mypy-extensions==1.0.0
mypy-protobuf==3.3.0
myst-parser==2.0.0
nbclassic==1.0.0
nbclient==0.8.0
nbconvert==7.5.0
nbformat==5.9.0
nbsphinx==0.9.2
nest-asyncio==1.5.6
networkx==3.1
nltk==3.8.1
nodeenv==1.8.0
notebook==6.5.4
notebook_shim==0.2.3
numba==0.57.0
numexpr==2.8.4
numpy==1.23.5
oauthlib==3.2.2
openapi-schema-pydantic==1.2.4
opt-einsum==3.3.0
overrides==7.3.1
packaging==23.1
pandas==1.5.3
pandocfilters==1.5.0
parso==0.8.3
pathspec==0.11.1
pbr==5.11.1
pdm==2.7.2
pexpect==4.8.0
pickleshare==0.7.5
Pillow==9.5.0
pip==23.1.2
platformdirs==3.5.3
plotly==5.15.0
pluggy==1.0.0
pockets==0.9.1
portalocker==2.7.0
pre-commit==3.3.3
prometheus-client==0.17.0
prompt-toolkit==3.0.38
protobuf==3.19.6
psutil==5.9.5
ptyprocess==0.7.0
pure-eval==0.2.2
pyarrow==12.0.1
pyasn1==0.5.0
pyasn1-modules==0.3.0
pycparser==2.21
pycryptodome==3.18.0
pydantic==1.10.9
pydocstyle==6.3.0
Pygments==2.15.1
PyJWT==2.7.0
pylint==2.17.4
pyngrok==6.0.0
pyparsing==3.0.9
pyproject_hooks==1.0.0
pyrsistent==0.19.3
pytest==7.3.2
pytest-cov==4.1.0
pytest-xdist==3.3.1
python-daemon==2.3.2
python-dateutil==2.8.2
python-dotenv==1.0.0
python-json-logger==2.0.7
pytz==2023.3
pyupgrade==3.6.0
PyYAML==6.0
pyzmq==25.1.0
qtconsole==5.4.3
QtPy==2.3.1
regex==2023.6.3
requests==2.31.0
requests-mock==1.11.0
requests-oauthlib==1.3.1
requests-toolbelt==1.0.0
resolvelib==1.0.1
responses==0.18.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rich==13.4.2
rsa==4.9
ruamel.yaml==0.17.32
ruamel.yaml.clib==0.2.7
ruff==0.0.272
safety==2.3.4
scikit-learn==1.0.2
scipy==1.8.1
Send2Trash==1.8.2
setuptools==67.8.0
shap==0.41.0
shellingham==1.5.0.post1
six==1.16.0
slicer==0.0.7
smmap==5.0.0
sniffio==1.3.0
snowballstemmer==2.2.0
soupsieve==2.4.1
Sphinx==6.2.1
sphinx-autoapi==2.1.1
sphinx-autobuild==2021.3.14
sphinx-basic-ng==1.0.0b1
sphinx-click==4.4.0
sphinx-copybutton==0.5.2
sphinx_design==0.4.1
sphinx-rtd-theme==1.2.2
sphinx-tabs==3.4.1
sphinxcontrib-applehelp==1.0.4
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-htmlhelp==2.0.1
sphinxcontrib-jquery==4.1
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-napoleon==0.7
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-serializinghtml==1.1.5
SQLAlchemy==2.0.16
sqlparse==0.4.4
stack-data==0.6.2
stevedore==5.1.0
sympy==1.12
tabulate==0.9.0
tenacity==8.2.2
tensorboard==2.9.1
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorflow-estimator==2.9.0
tensorflow-hub==0.13.0
tensorflow-macos==2.9.2
termcolor==2.3.0
terminado==0.17.1
threadpoolctl==3.1.0
tinycss2==1.2.1
tokenize-rt==5.1.0
tokenizers==0.13.3
toml==0.10.2
tomli==2.0.1
tomlkit==0.11.8
torch==2.0.1
torchdata==0.6.1
torchtext==0.15.2
tornado==6.3.2
tqdm==4.65.0
traitlets==5.9.0
transformers==4.29.2
types-protobuf==4.23.0.1
typing_extensions==4.6.3
typing-inspect==0.9.0
unearth==0.9.1
uri-template==1.2.0
urllib3==1.26.16
virtualenv==20.23.1
wcwidth==0.2.6
webcolors==1.13
webencodings==0.5.1
websocket-client==1.6.0
Werkzeug==2.3.6
wheel==0.40.0
widgetsnbextension==4.0.7
wrapt==1.15.0
xgboost==1.7.5
xxhash==3.2.0
yarl==1.9.2
zipp==3.15.0
zstandard==0.21.0

Current Behaviour?

The internal ML worker raises an `AttributeError` when inspecting/debugging a model:

'NoneType' object has no attribute 'client'
AttributeError
Traceback (most recent call last):
  File "/Users/***/Builds/giskard/python-client/giskard/ml_worker/utils/request_interceptor.py", line 50, in wrapper
    res = await loop.run_in_executor(pool, behavior, request, context)
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/***/Builds/giskard/python-client/giskard/ml_worker/server/ml_worker_service.py", line 432, in runModel
    self.ml_worker.tunnel.client.log_artifact(
AttributeError: 'NoneType' object has no attribute 'client'
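
The traceback fails on `self.ml_worker.tunnel.client`, which means `tunnel` is `None` at that point, presumably because an internal worker runs without a tunnel. Below is a minimal sketch of that failure mode and one possible guard. `FakeClient`, `FakeWorker`, and `upload_artifact` are hypothetical stand-ins written for illustration; they are not Giskard's API, and this is not necessarily the fix applied upstream.

```python
from types import SimpleNamespace


class FakeClient:
    """Hypothetical stand-in for a client that uploads artifacts."""

    def __init__(self):
        self.logged = []

    def log_artifact(self, local_path, artifact_path):
        # Record the call instead of performing a real upload.
        self.logged.append((local_path, artifact_path))


class FakeWorker:
    """Hypothetical stand-in for a worker; `tunnel` is None for an internal worker."""

    def __init__(self, tunnel=None):
        self.tunnel = tunnel


def upload_artifact(worker, local_path, artifact_path):
    """Upload through the tunnel's client if a tunnel exists.

    Returns True when the artifact was handed to the client and
    False when the worker has no tunnel (assumed internal-worker case).
    """
    tunnel = getattr(worker, "tunnel", None)
    if tunnel is None:
        # Without this check, `tunnel.client` raises
        # AttributeError: 'NoneType' object has no attribute 'client'.
        return False
    tunnel.client.log_artifact(local_path, artifact_path)
    return True


# Internal worker: no tunnel, so the upload is skipped instead of crashing.
internal = FakeWorker(tunnel=None)
skipped = upload_artifact(internal, "predictions.csv", "inspections/1")

# External worker: the tunnel carries a client, so the upload goes through.
client = FakeClient()
external = FakeWorker(tunnel=SimpleNamespace(client=client))
uploaded = upload_artifact(external, "predictions.csv", "inspections/1")
```

Whether skipping the upload is the right behaviour depends on how the internal worker is expected to persist inspection results; the sketch only shows where the `None` check would sit.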

Standalone code OR list down the steps to reproduce the issue

Worker-related issue:
1. Create a project that is run by the internal worker (`mlWorkerType == MLWorkerType.INTERNAL`).
2. Upload the models and the datasets.
3. Inspect/debug a dataset on a model.
4. The internal worker raises an exception.

Relevant log output

2023-06-30 10:19:49,283 pid:2142 MainThread giskard.commands.cli_worker INFO     Starting ML Worker server
2023-06-30 10:19:49,283 pid:2142 MainThread giskard.commands.cli_worker INFO     Python: /Users/***/Builds/giskard/python-client/.venv/bin/python3.9 (3.9.6)
2023-06-30 10:19:49,283 pid:2142 MainThread giskard.commands.cli_worker INFO     Giskard Home: /Users/***/giskard-home
2023-06-30 10:19:54,135 pid:2142 MainThread giskard.ml_worker.ml_worker INFO     Started ML Worker server on localhost:50051
2023-06-30 10:21:52,661 pid:2142 ml_worker_thread_0 giskard.ml_worker.server.ml_worker_service INFO     Collecting ML Worker info
2023-06-30 10:22:14,166 pid:2142 ml_worker_thread_0 giskard.ml_worker.server.ml_worker_service INFO     Collecting ML Worker info
2023-06-30 10:24:39,087 pid:2142 ml_worker_thread_0 giskard.datasets.base INFO     Casting dataframe columns from {'account_check_status': 'object', 'duration_in_month': 'int64', 'credit_history': 'object', 'purpose': 'object', 'credit_amount': 'int64', 'savings': 'object', 'present_employment_since': 'object', 'installment_as_income_perc': 'int64', 'sex': 'object', 'personal_status': 'object', 'other_debtors': 'object', 'present_residence_since': 'int64', 'property': 'object', 'age': 'int64', 'other_installment_plans': 'object', 'housing': 'object', 'credits_this_bank': 'int64', 'job': 'object', 'people_under_maintenance': 'int64', 'telephone': 'object', 'foreign_worker': 'object', 'default': 'object'} to {'account_check_status': 'object', 'age': 'int64', 'credit_amount': 'int64', 'credit_history': 'object', 'credits_this_bank': 'int64', 'default': 'object', 'duration_in_month': 'int64', 'foreign_worker': 'object', 'housing': 'object', 'installment_as_income_perc': 'int64', 'job': 'object', 'other_debtors': 'object', 'other_installment_plans': 'object', 'people_under_maintenance': 'int64', 'personal_status': 'object', 'present_employment_since': 'object', 'present_residence_since': 'int64', 'property': 'object', 'purpose': 'object', 'savings': 'object', 'sex': 'object', 'telephone': 'object'}
Feature 'people_under_maintenance' is declared as 'numeric' but has 2 (<= category_threshold=2) distinct values. Are you sure it is not a 'category' feature?
2023-06-30 10:24:39,422 pid:2142 ml_worker_thread_0 giskard.datasets.base INFO     Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
2023-06-30 10:24:39,432 pid:2142 ml_worker_thread_0 giskard.datasets.base INFO     Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
2023-06-30 10:27:19,831 pid:2142 ml_worker_thread_1 giskard.datasets.base INFO     Casting dataframe columns from {'account_check_status': 'object', 'duration_in_month': 'int64', 'credit_history': 'object', 'purpose': 'object', 'credit_amount': 'int64', 'savings': 'object', 'present_employment_since': 'object', 'installment_as_income_perc': 'int64', 'sex': 'object', 'personal_status': 'object', 'other_debtors': 'object', 'present_residence_since': 'int64', 'property': 'object', 'age': 'int64', 'other_installment_plans': 'object', 'housing': 'object', 'credits_this_bank': 'int64', 'job': 'object', 'people_under_maintenance': 'int64', 'telephone': 'object', 'foreign_worker': 'object', 'default': 'object'} to {'account_check_status': 'object', 'age': 'int64', 'credit_amount': 'int64', 'credit_history': 'object', 'credits_this_bank': 'int64', 'default': 'object', 'duration_in_month': 'int64', 'foreign_worker': 'object', 'housing': 'object', 'installment_as_income_perc': 'int64', 'job': 'object', 'other_debtors': 'object', 'other_installment_plans': 'object', 'people_under_maintenance': 'int64', 'personal_status': 'object', 'present_employment_since': 'object', 'present_residence_since': 'int64', 'property': 'object', 'purpose': 'object', 'savings': 'object', 'sex': 'object', 'telephone': 'object'}
Feature 'people_under_maintenance' is declared as 'numeric' but has 2 (<= category_threshold=2) distinct values. Are you sure it is not a 'category' feature?
2023-06-30 10:27:19,997 pid:2142 ml_worker_thread_1 giskard.datasets.base INFO     Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
2023-06-30 10:27:19,999 pid:2142 ml_worker_thread_1 giskard.datasets.base INFO     Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
2023-06-30 10:29:55,958 pid:2142 ml_worker_thread_1 giskard.datasets.base INFO     Casting dataframe columns from {'account_check_status': 'object', 'duration_in_month': 'int64', 'credit_history': 'object', 'purpose': 'object', 'credit_amount': 'int64', 'savings': 'object', 'present_employment_since': 'object', 'installment_as_income_perc': 'int64', 'sex': 'object', 'personal_status': 'object', 'other_debtors': 'object', 'present_residence_since': 'int64', 'property': 'object', 'age': 'int64', 'other_installment_plans': 'object', 'housing': 'object', 'credits_this_bank': 'int64', 'job': 'object', 'people_under_maintenance': 'int64', 'telephone': 'object', 'foreign_worker': 'object', 'default': 'object'} to {'account_check_status': 'object', 'age': 'int64', 'credit_amount': 'int64', 'credit_history': 'object', 'credits_this_bank': 'int64', 'default': 'object', 'duration_in_month': 'int64', 'foreign_worker': 'object', 'housing': 'object', 'installment_as_income_perc': 'int64', 'job': 'object', 'other_debtors': 'object', 'other_installment_plans': 'object', 'people_under_maintenance': 'int64', 'personal_status': 'object', 'present_employment_since': 'object', 'present_residence_since': 'int64', 'property': 'object', 'purpose': 'object', 'savings': 'object', 'sex': 'object', 'telephone': 'object'}
Feature 'people_under_maintenance' is declared as 'numeric' but has 2 (<= category_threshold=2) distinct values. Are you sure it is not a 'category' feature?
2023-06-30 10:29:56,057 pid:2142 ml_worker_thread_1 giskard.datasets.base INFO     Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
2023-06-30 10:29:56,119 pid:2142 ml_worker_thread_1 giskard.datasets.base INFO     Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
2023-06-30 10:29:56,122 pid:2142 ml_worker_thread_1 giskard.datasets.base INFO     Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
2023-06-30 10:29:56,127 pid:2142 ml_worker_thread_1 giskard.datasets.base INFO     Casting dataframe columns from {'account_check_status': 'object', 'duration_in_month': 'int64', 'credit_history': 'object', 'purpose': 'object', 'credit_amount': 'int64', 'savings': 'object', 'present_employment_since': 'object', 'installment_as_income_perc': 'int64', 'sex': 'object', 'personal_status': 'object', 'other_debtors': 'object', 'present_residence_since': 'int64', 'property': 'object', 'age': 'int64', 'other_installment_plans': 'object', 'housing': 'object', 'credits_this_bank': 'int64', 'job': 'object', 'people_under_maintenance': 'int64', 'telephone': 'object', 'foreign_worker': 'object'} to {'account_check_status': 'object', 'duration_in_month': 'int64', 'credit_history': 'object', 'purpose': 'object', 'credit_amount': 'int64', 'savings': 'object', 'present_employment_since': 'object', 'installment_as_income_perc': 'int64', 'sex': 'object', 'personal_status': 'object', 'other_debtors': 'object', 'present_residence_since': 'int64', 'property': 'object', 'age': 'int64', 'other_installment_plans': 'object', 'housing': 'object', 'credits_this_bank': 'int64', 'job': 'object', 'people_under_maintenance': 'int64', 'telephone': 'object', 'foreign_worker': 'object'}
2023-06-30 10:29:56,131 pid:2142 ml_worker_thread_1 giskard.ml_worker.utils.logging INFO     Predicted dataset with shape (200, 22) executed in 0:00:00.072331
2023-06-30 10:29:56,169 pid:2142 MainThread giskard.ml_worker.utils.request_interceptor ERROR    'NoneType' object has no attribute 'client'
Traceback (most recent call last):
  File "/Users/***/Builds/giskard/python-client/giskard/ml_worker/utils/request_interceptor.py", line 50, in wrapper
    res = await loop.run_in_executor(pool, behavior, request, context)
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/***/Builds/giskard/python-client/giskard/ml_worker/server/ml_worker_service.py", line 432, in runModel
    self.ml_worker.tunnel.client.log_artifact(
AttributeError: 'NoneType' object has no attribute 'client'

Inokinoki commented 1 year ago

Closed by https://github.com/Giskard-AI/giskard/pull/1212