Describe the bug
When running bentoml serve after building my bento, I get the following error:
Traceback (most recent call last):
  File "/home/tom/Desktop/ml-reconciliation/venv/bin/bentoml", line 8, in <module>
    sys.exit(cli())
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/bentoml_cli/utils.py", line 362, in wrapper
    return func(*args, **kwargs)
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/bentoml_cli/utils.py", line 333, in wrapper
    return_value = func(*args, **kwargs)
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/bentoml_cli/utils.py", line 290, in wrapper
    return func(*args, **kwargs)
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/bentoml_cli/env_manager.py", line 122, in wrapper
    return func(*args, **kwargs)
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/bentoml_cli/serve.py", line 260, in serve
    serve_http_production(
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/simple_di/__init__.py", line 139, in _
    return func(*_inject_args(bind.args), **_inject_kwargs(bind.kwargs))
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/bentoml/serve.py", line 327, in serve_http_production
    json.dumps(runner.scheduled_worker_env_map),
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 356, in scheduled_worker_env_map
    for worker_id in range(self.scheduled_worker_count)
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 341, in scheduled_worker_count
    return self.scheduling_strategy.get_worker_count(
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/bentoml/_internal/runner/strategy.py", line 68, in get_worker_count
    resource_request = system_resources()
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/bentoml/_internal/resource.py", line 46, in system_resources
    res[resource_kind] = resource.from_system()
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/bentoml/_internal/resource.py", line 248, in from_system
    pynvml.nvmlInit()
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/pynvml/nvml.py", line 1770, in nvmlInit
    nvmlInitWithFlags(0)
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/pynvml/nvml.py", line 1760, in nvmlInitWithFlags
    _nvmlCheckReturn(ret)
  File "/home/tom/Desktop/ml-reconciliation/venv/lib/python3.10/site-packages/pynvml/nvml.py", line 833, in _nvmlCheckReturn
    raise NVMLError(ret)
pynvml.nvml.NVMLError_DriverNotLoaded: Driver Not Loaded
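The failure reproduces outside BentoML. A minimal sketch, assuming only that pynvml is importable in the same venv (the exception names are the ones from the traceback above):

import pynvml

try:
    pynvml.nvmlInit()  # raises NVMLError_DriverNotLoaded on hosts without the NVIDIA kernel driver
    print("GPUs visible:", pynvml.nvmlDeviceGetCount())
    pynvml.nvmlShutdown()
except pynvml.NVMLError as err:
    # NVMLError_DriverNotLoaded is a subclass of NVMLError, so this
    # catches exactly the error shown in the traceback.
    print("NVML unavailable, treating the host as GPU-less:", err)

On a CPU-only host, nvmlInit() raises instead of reporting zero devices, and per the traceback bentoml/_internal/resource.py lets that exception propagate rather than falling back to CPU-only scheduling.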
To reproduce
bentoml serve
Expected behavior
bentoml serve running without error
Environment
BentoML: 1.1.11
Python: 3.10
torch: 2.2.1
Ubuntu: 22.04
No NVIDIA GPU
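For what it's worth, this is the shape of fallback I would expect from the resource probe: treat any NVML initialization failure as "zero GPUs". Illustrative only, not the actual bentoml/_internal/resource.py code, and the helper name is hypothetical:

import pynvml

def visible_gpu_count() -> int:
    # Hypothetical helper: degrade to CPU-only instead of letting
    # NVMLError_DriverNotLoaded abort `bentoml serve`.
    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        return 0
    try:
        return pynvml.nvmlDeviceGetCount()
    finally:
        pynvml.nvmlShutdown()

Setting CUDA_VISIBLE_DEVICES=-1 before serving might also sidestep the probe, but whether resource.py consults that variable before calling nvmlInit() in 1.1.11 is an assumption I have not verified.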