
Response errors 400 and 500 never reach to `/metrics` endpoint. #2052

Closed. ldynia closed this issue 2 years ago.

ldynia commented 2 years ago

Bug

Improper error handling in a BentoML application: an invalid request causes 400 and 500 error responses that are never reported to the /metrics endpoint.

To Reproduce

The steps below were inspired by BentoML's Hello World example.

1. Install OS dependencies & clone project

$ sudo apt install -y git python3.8-venv
$ git clone https://github.com/bentoml/BentoML.git
$ cd BentoML/guides/quick-start/

2. Setup virtual environment and install dependencies

$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt

3. Create BentoService

$ bentoml --version
bentoml, version 0.13.1

$ bentoml list
BENTO_SERVICE    AGE    APIS    ARTIFACTS    LABELS

$ python main.py
[2021-11-25 10:38:45,729] INFO - BentoService bundle 'IrisClassifier:20211125103844_A5FBDE' saved to: /home/user/bentoml/repository/IrisClassifier/20211125103844_A5FBDE

$ bentoml list
BENTO_SERVICE                         AGE            APIS                                   ARTIFACTS                    LABELS
IrisClassifier:20211125103844_A5FBDE  20.54 seconds  predict<DataframeInput:DefaultOutput>  model<SklearnModelArtifact>
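
For reference, the quickstart's service definition (the file the traceback in step 6 points at) looks roughly like this under BentoML 0.13; the exact contents may vary between versions of the guide. main.py (not shown) trains a scikit-learn classifier on the iris dataset, packs it as the model artifact, and saves the bundle.

# iris_classifier.py - service definition from the quickstart (sketch)
from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import DataframeInput
from bentoml.frameworks.sklearn import SklearnModelArtifact

@env(infer_pip_packages=True)
@artifacts([SklearnModelArtifact('model')])
class IrisClassifier(BentoService):

    @api(input=DataframeInput(), batch=True)
    def predict(self, df):
        # The parsed request DataFrame goes straight to the sklearn model;
        # any conversion error inside predict() produces the traceback
        # shown in step 6 below.
        return self.artifacts.model.predict(df)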

4. Spin up containers

$ saved_path=$(bentoml get IrisClassifier:latest --print-location --quiet)
$ docker build --no-cache -t iris-classifier $saved_path
$ docker run -p 5000:5000 --rm --name bentoml-classifier iris-classifier:latest

5. Test 500 errors that are recorded correctly.

Execute the curl command 3 times.

$ curl -X POST "http://localhost:5000/predict" -H "accept: */*" -H "Content-Type: application/json" -d '{}'

Visit http://localhost:5000/, click /metrics > Try it out > Execute. You will see that the curl request appears 3.0 times in the /metrics output as a 500 error.

BENTOML_IrisClassifier_request_duration_seconds_count{endpoint="/predict",http_response_code="500",service_version="20211125103844_A5FBDE"} 3.0
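
The Swagger UI is not required; the same counter can be read directly from the shell (assuming the container from step 4 is still running):

$ curl -s http://localhost:5000/metrics | grep request_duration_seconds_count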

6. Test 500 errors that are recorded incorrectly.

Execute the curl command 4 times.

$ curl -X POST "http://localhost:5000/predict" -H "accept: */*" -H "Content-Type: application/json" -d '[["5.1", "3.5", "1.4", "0.2xyz"]]'

Visit http://localhost:5000/, click /metrics > Try it out > Execute. You will see that the curl request appears 4.0 times in the /metrics output as a 200 success!!!

BENTOML_IrisClassifier_request_duration_seconds_count{endpoint="/predict",http_response_code="200",service_version="20211125103844_A5FBDE"} 4.0

Inspect the logs of the bentoml-classifier container. You will see an unhandled exception!

$ docker logs bentoml-classifier

[2021-11-25 09:51:24,510] ERROR - Error caught in API function:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/bentoml/service/inference_api.py", line 176, in wrapped_func
    return self._user_func(*args, **kwargs)
  File "/home/bentoml/bundle/./IrisClassifier/iris_classifier.py", line 21, in predict
    return self.artifacts.model.predict(df)
  File "/opt/conda/lib/python3.8/site-packages/sklearn/svm/_base.py", line 791, in predict
    y = super().predict(X)
  File "/opt/conda/lib/python3.8/site-packages/sklearn/svm/_base.py", line 414, in predict
    X = self._validate_for_predict(X)
  File "/opt/conda/lib/python3.8/site-packages/sklearn/svm/_base.py", line 592, in _validate_for_predict
    X = self._validate_data(
  File "/opt/conda/lib/python3.8/site-packages/sklearn/base.py", line 561, in _validate_data
    X = check_array(X, **check_params)
  File "/opt/conda/lib/python3.8/site-packages/sklearn/utils/validation.py", line 738, in check_array
    array = np.asarray(array, order=order, dtype=dtype)
  File "/opt/conda/lib/python3.8/site-packages/pandas/core/generic.py", line 1993, in __array__
    return np.asarray(self._values, dtype=dtype)
ValueError: could not convert string to float: '0.2xyz'
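
The root cause of the exception can be reproduced without BentoML; a minimal sketch, assuming only the pandas, numpy, and scikit-learn packages already pulled in by requirements.txt:

# repro.py - reproduce the conversion error outside of BentoML (sketch)
import numpy as np
import pandas as pd

# DataframeInput parses the request body into a DataFrame of strings:
df = pd.DataFrame([["5.1", "3.5", "1.4", "0.2xyz"]])

# sklearn's check_array casts the input to float before predicting,
# which is where the traceback above originates:
np.asarray(df, dtype=np.float64)
# ValueError: could not convert string to float: '0.2xyz'

As the "Error caught in API function" log line shows, BentoML catches this exception, yet the request is still counted under http_response_code="200".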

Expected behaviour

The /metrics endpoint should report all 7.0 requests (3 + 4) with a 500 error code. That is not the case!
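
Concretely, after the 7 requests above, /metrics should contain a counter like this (same label format as the samples shown in steps 5 and 6):

BENTOML_IrisClassifier_request_duration_seconds_count{endpoint="/predict",http_response_code="500",service_version="20211125103844_A5FBDE"} 7.0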

Environment:

aarnphm commented 2 years ago

Hi, this looks like a problem with your code. Can you share your service definition here?

ldynia commented 2 years ago

@aarnphm I have updated the issue with detailed step-by-step instructions on how to reproduce this bug. Please test it, and you will arrive at the same conclusion.

ldynia commented 2 years ago

@aarnphm can I get an update?

smiraldr commented 2 years ago

Hi! Is there any resolution for this issue? Is this going to be fixed, @parano?

parano commented 2 years ago

Thank you for reporting the issue @ldynia @smiraldr - this is likely due to an issue with the Python Prometheus client's multi-process support. In the upcoming BentoML 1.0, we have revamped that implementation and made sure metrics are collected correctly across multiple workers: https://github.com/bentoml/BentoML/issues/2163. cc @bojiang
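
For anyone unfamiliar with the mechanism referred to here: the Python Prometheus client supports multi-process servers by having each worker write its counters to files in a shared directory, which are then merged at scrape time. A minimal sketch of that mode (the directory name and metric are illustrative only, not BentoML's actual internals):

# prometheus_multiproc_sketch.py - illustrative only, not BentoML code
import os

# The client picks multi-process mode up from this env var at import
# time, and the directory must already exist:
os.makedirs("/tmp/prom_metrics", exist_ok=True)
os.environ["prometheus_multiproc_dir"] = "/tmp/prom_metrics"

from prometheus_client import CollectorRegistry, Counter, generate_latest
from prometheus_client import multiprocess

# Each worker process increments its own on-disk copy of the counter:
requests = Counter("requests_total", "Requests by status", ["http_response_code"])
requests.labels(http_response_code="500").inc()

# At scrape time, a collector merges the per-process files into one view;
# BentoML 1.0 reworked how this is wired up so that counts from all
# workers are reported correctly (see the linked issue #2163).
registry = CollectorRegistry()
multiprocess.MultiProcessCollector(registry)
print(generate_latest(registry).decode())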