pncnmnp closed this issue 4 years ago
Changing this line (in `serving.py`) to `@bentoml.env(infer_pip_packages=True)` did fix the dependency issue. However, the 503 error still persists.
Looking again into Heroku's log files, I found a new issue:
```
2020-09-27T07:30:27.520399+00:00 app[web.1]: File "/bento/PyTorchModel/serving.py", line 14, in <module>
2020-09-27T07:30:27.520399+00:00 app[web.1]: from config import config
2020-09-27T07:30:27.520400+00:00 app[web.1]: File "/bento/PyTorchModel/config.py", line 3, in <module>
2020-09-27T07:30:27.520400+00:00 app[web.1]: class config:
2020-09-27T07:30:27.520401+00:00 app[web.1]: File "/bento/PyTorchModel/config.py", line 13, in config
2020-09-27T07:30:27.520401+00:00 app[web.1]: do_lower_case=True
2020-09-27T07:30:27.520401+00:00 app[web.1]: File "/opt/conda/lib/python3.6/site-packages/transformers/tokenization_utils_base.py", line 1425, in from_pretrained
2020-09-27T07:30:27.520402+00:00 app[web.1]: return cls._from_pretrained(*inputs, **kwargs)
2020-09-27T07:30:27.520402+00:00 app[web.1]: File "/opt/conda/lib/python3.6/site-packages/transformers/tokenization_utils_base.py", line 1531, in _from_pretrained
2020-09-27T07:30:27.520403+00:00 app[web.1]: list(cls.vocab_files_names.values()),
2020-09-27T07:30:27.520404+00:00 app[web.1]: OSError: Model name '../model/' was not found in tokenizers model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased, TurkuNLP/bert-base-finnish-cased-v1, TurkuNLP/bert-base-finnish-uncased-v1, wietsedv/bert-base-dutch-cased). We assumed '../model/' was a path, a model identifier, or url to a directory containing vocabulary files named ['vocab.txt'] but couldn't find such vocabulary files at this path or url.
```
The issue is in the `config.py` file: Heroku is unable to find the `../model` directory and its contents. Checking the local copy of my BentoML model, which got deployed as a Docker image on Heroku, I can see the following file structure:
```
[13:06][20200926223007_B35B21] tree .
.
├── bentoml-init.sh
├── bentoml.yml
├── docker-entrypoint.sh
├── Dockerfile
├── environment.yml
├── MANIFEST.in
├── python_version
├── PyTorchModel
│   ├── artifacts
│   │   ├── __init__.py
│   │   └── ner.pt
│   ├── bentoml.yml
│   ├── config.py
│   ├── dataset.py
│   ├── __init__.py
│   ├── model.py
│   ├── serving.py
│   └── utils.py
├── README.md
├── requirements.txt
└── setup.py

2 directories, 19 files
```
It has not included the `model` directory and its three files (`config.json`, `pytorch_model.bin`, and `vocab.txt`), although the directory was there in my local repository (which was saved by BentoML). I am unsure whether we have to change the file structure or whether BentoML needs a special provision for this.
The above error was fixed by moving the `model` directory to `./src/` and renaming it to `bert-base-uncased`. The final path was `/server/src/bert-base-uncased` instead of `/server/model/`. (The path also has to be changed in `config.py`.)
I am not sure why it worked. My assumption is that the tokenizer was looking the name up in a word-list, and I gave the directory a name from that same list that was relevant to our model.
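A plausible explanation: `from_pretrained` first checks the string against its built-in model names, then falls back to treating it as a local path and looks for the tokenizer's vocabulary files there, which is why the error above lists `['vocab.txt']`. A small stdlib-only sanity check (a hypothetical helper, not part of this project) could confirm a local directory actually contains what the BERT tokenizer expects before deploying:

```python
import os

# The OSError above says the tokenizer looks for vocabulary files named
# ['vocab.txt'] in a local model directory.
REQUIRED_FILES = ("vocab.txt",)

def has_tokenizer_files(model_dir):
    """Return True if model_dir contains every file from_pretrained will look for."""
    return all(os.path.isfile(os.path.join(model_dir, name)) for name in REQUIRED_FILES)
```

Running a check like this inside the built Bento bundle (rather than the source tree) would have caught the missing `../model` directory before the dyno crashed.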
However, fixing this caused another issue:
```
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
    worker.init_process()
  File "/opt/conda/lib/python3.6/site-packages/gunicorn/workers/base.py", line 119, in init_process
    self.load_wsgi()
  File "/opt/conda/lib/python3.6/site-packages/gunicorn/workers/base.py", line 144, in load_wsgi
    self.wsgi = self.app.wsgi()
  File "/opt/conda/lib/python3.6/site-packages/gunicorn/app/base.py", line 67, in wsgi
    self.callable = self.load()
  File "/opt/conda/lib/python3.6/site-packages/bentoml/server/gunicorn_server.py", line 94, in load
    bento_service = load(self.bento_service_bundle_path)
  File "/opt/conda/lib/python3.6/site-packages/bentoml/saved_bundle/loader.py", line 251, in load
    svc_cls = load_bento_service_class(bundle_path)
  File "/opt/conda/lib/python3.6/site-packages/bentoml/saved_bundle/loader.py", line 191, in load_bento_service_class
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/bento/PyTorchModel/serving.py", line 22, in <module>
    meta_data = joblib.load("meta.bin")
  File "/opt/conda/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 597, in load
    with open(filename, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'meta.bin'
```
I looked into the generated (latest) Bento directory and, sure enough, there was no `meta.bin` file, which is strange considering it was there in the codebase (`./server/src/meta.bin`). Adding it to the latest Bento fixed the above issue.
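Part of the trouble is that `joblib.load("meta.bin")` resolves the filename against the process's current working directory, which differs between a local run and the Bento bundle on Heroku. A hedged sketch of a more robust load, resolving the file relative to the serving module instead (shown with stdlib `pickle` for illustration; the project itself uses `joblib`, whose call shape is the same):

```python
import os
import pickle

def load_meta(filename="meta.bin", anchor=__file__):
    """Load a bundled file from the directory of this module, independent of CWD."""
    path = os.path.join(os.path.dirname(os.path.abspath(anchor)), filename)
    with open(path, "rb") as f:
        return pickle.load(f)
```

With this, the file is found whether the service is started from the repository root, the bundle directory, or anywhere else.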
However, one last error popped up (now we are starting to go down a rabbit hole :smile:):
```
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
    worker.init_process()
  File "/opt/conda/lib/python3.6/site-packages/gunicorn/workers/base.py", line 119, in init_process
    self.load_wsgi()
  File "/opt/conda/lib/python3.6/site-packages/gunicorn/workers/base.py", line 144, in load_wsgi
    self.wsgi = self.app.wsgi()
  File "/opt/conda/lib/python3.6/site-packages/gunicorn/app/base.py", line 67, in wsgi
    self.callable = self.load()
  File "/opt/conda/lib/python3.6/site-packages/bentoml/server/gunicorn_server.py", line 94, in load
    bento_service = load(self.bento_service_bundle_path)
  File "/opt/conda/lib/python3.6/site-packages/bentoml/saved_bundle/loader.py", line 251, in load
    svc_cls = load_bento_service_class(bundle_path)
  File "/opt/conda/lib/python3.6/site-packages/bentoml/saved_bundle/loader.py", line 191, in load_bento_service_class
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/bento/PyTorchModel/serving.py", line 22, in <module>
    meta_data = joblib.load("meta.bin")
  File "/opt/conda/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 605, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/opt/conda/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 529, in _unpickle
    obj = unpickler.load()
  File "/opt/conda/lib/python3.6/pickle.py", line 1050, in load
    dispatch[key[0]](self)
  File "/opt/conda/lib/python3.6/pickle.py", line 1338, in load_global
    klass = self.find_class(module, name)
  File "/opt/conda/lib/python3.6/pickle.py", line 1388, in find_class
    __import__(module, level=0)
ModuleNotFoundError: No module named 'sklearn'
[2020-09-27 14:56:28 +0000] [12] [INFO] Worker exiting (pid: 12)
[2020-09-27 14:56:30 +0000] [1] [INFO] Shutting down: Master
```
Adding `scikit-learn==0.22` to the `requirements.txt` file fixed this issue.
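The traceback makes the cause clear: `meta.bin` pickles a scikit-learn object, so unpickling it imports `sklearn`, which the image did not have. The addition looked like this (the pin is the version quoted above; pinning to the version used when the file was pickled also avoids cross-version unpickling surprises):

```
# requirements.txt -- pin scikit-learn to the version meta.bin was pickled with;
# unpickling under a different release can warn or break.
scikit-learn==0.22
```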
Thankfully, no other issues appeared, and the Docker image worked fine on my local machine. Finally, no 503s!
The deployment on Heroku was a bumpy ride. Initially I encountered the following error:
I noticed a similar error for `torch==1.6.0+cpu`. As I was not using CUDA (#2), the following Stack Overflow thread fixed the issue: Install PyTorch from requirements.txt.

At this stage, BentoML's `requirements.txt` for my latest model was in the following form:

The deployment went fine without any issues. However, after making a prediction request:
I was getting a 503 error. From Heroku's log:

```
2020-09-26T15:37:11.745901+00:00 heroku[router]: at=error code=H10 desc="App crashed" method=POST path="/predict" host=bentoml-her0ku-mtywmteynzm1mgo.herokuapp.com request_id=a971ef08-e573-487a-9e23-2f0f5ba0e87a fwd="49.32.63.210" dyno= connect= service= status=503 bytes= protocol=https
```
The Heroku docs say the following about the H10 error: "A crashed web dyno or a boot timeout on the web dyno will present this error." Inspecting the log file, I found the issue:
The `transformers` module was not installed. I believe this is due to the following line:

`@bentoml.env(pip_packages=['torch', 'numpy', 'torchvision', 'scikit-learn'])`

BentoML is being explicitly asked to install only these four packages, and not `transformers`.
`@bentoml.env(infer_pip_packages=True)` should fix the issue.
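For intuition: an explicit `pip_packages=[...]` list silently omits anything you forget (here, `transformers`), whereas import inference derives the dependency list from what the service code actually imports. A rough stdlib-only sketch of the idea (my illustration of the concept, not BentoML's actual implementation):

```python
import ast

def inferred_top_level_imports(source):
    """Collect the top-level package names a module imports, e.g. to turn
    them into pip requirements. Only absolute imports are considered."""
    packages = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            packages.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            packages.add(node.module.split(".")[0])
    return packages
```

Applied to a `serving.py` that does `from transformers import BertTokenizer`, this approach would have picked up `transformers` automatically, avoiding the crash-on-boot H10 above.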