Closed Toukenize closed 2 years ago
Thanks for reporting this, @Toukenize. This happens because BentoML currently uses Python's own modulefinder module to find local dependencies, and it only recognizes a folder as an importable module if it contains an __init__.py file. But I think it makes sense to support bundling the import even if the folder does not contain an __init__.py file; I will look into it.
Hello,
I think this is related. This is using bentoml==0.11.0 and bentoml serve[-gunicorn] (i.e. a direct run, not the Docker wrapper).
I have my package (let's call it foo) pip-installed locally (pip install -e .).
One class in this package is the service:
# foo/foo_service.py
from bentoml import BentoService

class FooService(BentoService):
    ...  # service body omitted in the original comment
This class uses lots of imports from this package.
When I call service.save_to_dir(path), under path I get:
bentoml.yml
FooService/bentoml.yml
These two .yml files are identical and contain a reference:
metadata:
  module_name: foo.foo_service
  module_file: foo/foo_service.py
But the only part of foo copied is foo_service.py (__init__.py files are added along the path).
So I think I'm seeing what @Toukenize sees: that only the file containing the service definition is copied, and not the entire package.
This happened to me on bentoml==0.9.1 as well, both when foo was and wasn't pip-installed (I uninstalled it and set my PYTHONPATH to make it work without being installed).
Two workarounds:
1. Edit the bentoml.yml files to point module_file to the exact path to the service module;
2. Edit saved_bundle/loader.py to not give precedence to the packed module:
# sys.path.insert(0, bundle_path)
# sys.path.insert(0, os.path.join(bundle_path, metadata["service_name"]))
sys.path.append(bundle_path)
sys.path.append(os.path.join(bundle_path, metadata["service_name"]))
and export PYTHONSTARTUP=/path/to/my/package; then the package that the service is a part of loads properly.
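The second workaround relies on search order: an entry placed at the front of sys.path wins over later entries, while an appended entry only wins if nothing earlier provides the module. Here is a minimal sketch of that effect, not BentoML code. The module name foo_demo and the directories bundle and installed are hypothetical stand-ins, and the search path is passed explicitly to PathFinder.find_spec so that the real sys.path is left untouched.

```python
import importlib
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def first_match(module_name, search_path):
    """Return the file the import system would pick for module_name,
    given an ordered list of directories (earlier entries win)."""
    importlib.invalidate_caches()
    spec = PathFinder.find_spec(module_name, [str(p) for p in search_path])
    return None if spec is None else Path(spec.origin)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        bundle = Path(tmp) / "bundle"        # stands in for the saved bundle
        installed = Path(tmp) / "installed"  # stands in for the installed package
        for root in (bundle, installed):
            root.mkdir()
            (root / "foo_demo.py").write_text("WHERE = %r\n" % root.name)

        # Bundle listed first: comparable to sys.path.insert(0, bundle_path).
        bundle_first = first_match("foo_demo", [bundle, installed])
        # Bundle listed last: comparable to sys.path.append(bundle_path).
        bundle_last = first_match("foo_demo", [installed, bundle])
        return bundle_first.parent.name, bundle_last.parent.name


if __name__ == "__main__":
    print(demo())
```

With the bundle listed first the bundled copy is chosen; with the bundle listed last the installed copy is chosen, which is why appending lets an installed foo take over.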
@guy4261 two identical bentoml.yml files are expected behavior - it is necessary for making BentoService bundle "pip installable".
Did you add an __init__.py file to the directory containing your Python code? If so, they should be copied to the bundle when calling save or save_to_dir. BentoML does not copy the "entire package", but only bundles the Python modules that are imported and used in your BentoService class. Without the __init__.py file, Python does not recognize it as a module.
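As a side note on the last point: since Python 3.3 the interpreter itself can import a directory without __init__.py as a namespace package (PEP 420), so the limitation discussed here most likely sits in how the bundling step discovers packages rather than in the interpreter. The sketch below is only an illustration of that distinction; the directory names with_init and without_init are hypothetical, and it does not reproduce what modulefinder or BentoML do.

```python
import importlib
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def classify(name, root):
    """Classify a top-level name under root as 'regular' package,
    'namespace' package, plain 'module', or 'missing'."""
    importlib.invalidate_caches()
    spec = PathFinder.find_spec(name, [str(root)])
    if spec is None:
        return "missing"
    if spec.submodule_search_locations is None:
        return "module"  # a single .py file, not a package
    if spec.origin and spec.origin.endswith("__init__.py"):
        return "regular"  # directory with __init__.py
    return "namespace"    # directory without __init__.py


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "with_init").mkdir()
        (root / "with_init" / "__init__.py").write_text("")
        (root / "without_init").mkdir()
        (root / "without_init" / "helper.py").write_text("X = 1\n")
        names = ("with_init", "without_init", "absent")
        return {name: classify(name, root) for name in names}


if __name__ == "__main__":
    print(demo())
```

A tool that only accepts the "regular" case will skip the "namespace" one, which matches the behavior reported in this thread.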
(note: I edited my previous reply so filenames will match this comment.)
two identical bentoml.yml files are expected behavior
I assumed that's OK, just wanted to note (in case others will take a look).
Did you add an __init__.py file to the directory containing your python code?
So I have my git repo for this package, with a setup.py at the repo root.
The package name is foo, so there's a directory foo.
The directory structure is that of a Python package:
(git_root)
  .git/
  setup.py
  foo/
    __init__.py
    service/
      __init__.py
      foo_service.py
    pack/
      __init__.py
      pack_script.py
As I said, only foo/service/foo_service.py is copied. It has imports such as from foo import ... (there are other subpackages there), but none of them is copied.
That's sad because eventually I run from an environment where foo is installed; the service could have loaded it using the metadata.module_name from the bentoml.yml (if it didn't try to find the Python module using its path).
Does the foo_service.py file contain your BentoService class definition? And does the foo_service.py file import from pack_script.py? If that's the case, pack_script.py is expected to be copied to the saved bundle, with your folder structure maintained. All imports should work when loading the saved bundle from another environment, and we have tests covering this behavior here.
And BentoML does load foo_service.py based on the metadata.module_name; here's the related code: https://github.com/bentoml/BentoML/blob/v0.11.0/bentoml/saved_bundle/loader.py#L204
Does the foo_service.py file contain your BentoService class definition?
Yep:
# foo/foo_service.py
from bentoml import BentoService

class FooService(BentoService):
    ...  # service body omitted in the original comment
Does the foo_service.py file import from the pack_script.py
No, vice versa: pack_script.py imports the FooService class and packs it. Only foo_service.py is copied (along with the directory structure, i.e. it is not placed in the root of the bento but under foo/foo_service.py). But the rest of the package is not copied :(
I will look into those links - thanks 🙏
Np!
pack_script.py imports the FooService class and packs it.
If that's the case, FooService itself does not rely on pack_script.py, so why should it be copied? By default, BentoML only packs the files and modules that are necessary to run model inference with the BentoService class, and thus only copies modules that are imported by the foo_service.py file.
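To make "modules that are imported by the foo_service.py file" concrete, here is a rough sketch of static import discovery using the standard ast module. It is only an illustration: per the first reply in this thread BentoML relies on modulefinder, not on this code, and the source string and the foo.utils subpackage in it are hypothetical.

```python
import ast


def imported_modules(source):
    """Return the sorted absolute module names that `source` imports.

    Relative imports (from . import x) are ignored in this sketch.
    """
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names.add(node.module)
    return sorted(names)


# Hypothetical contents of a service module.
EXAMPLE_SOURCE = """
import os
from bentoml import BentoService
from foo.utils import helper
"""

if __name__ == "__main__":
    print(imported_modules(EXAMPLE_SOURCE))
```

A bundler following this idea would then resolve each discovered name to a local file and copy it, which is why a module that nothing in the service imports (such as pack_script.py) would be left out.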
Indeed, I don't need pack_script.py. But I do need foo. And out of the entire foo package, only foo/foo_service.py is copied :( even though it imports from the rest of the package.
@guy4261 that sounds like unexpected behavior. Do you mean only foo/foo_service.py is copied but the foo/__init__.py file is not? Does your foo service load properly after it has been saved? If not, what's the error message? And could you share a bit more about your project structure and source code if possible?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Thanks again for all the discussion, which really inspired us to redesign the Bento packaging API in BentoML version 1.0, making it a lot easier to bundle local dependencies.
In BentoML 1.0, a project root must be explicitly defined by placing a bentofile.yaml file in the directory. This file specifies all the Bento build configs, including which files to include in the final built Bento. The project root should also be seen as the CWD in your service's Python environment, as well as part of the import path in sys.path. A more detailed explanation can be found here: https://github.com/bentoml/BentoML/tree/main/bentoml/_internal/bento
Describe the bug
BentoService does not package local dependencies with more than 1 level of directory without adding __init__.py in all sub-directories.
This is my project structure
BentoML only bundled these (and classifier.py is moved out):
These are what my classifier.py and serve.py do:
And I basically define the BentoService in classifier.py, then I bundle it in serve.py
To Reproduce
Set up a directory structure like the one I described, then run python serve.py to bundle the BentoService.
Expected behavior
The complete src folder should be bundled together as local dependencies, like this:
Screenshots/Logs
Environment:
Additional context
The only workaround I have is to add __init__.py to every sub-folder of dependencies that needs to be bundled by BentoML, but this is quite cumbersome, especially when we have many sub-folders of local dependencies.
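For anyone applying this workaround to a large tree, a small helper can create the missing files in one pass. This is a sketch under assumptions, not part of BentoML: the function name add_missing_init_files is made up for illustration, the src/utils/text layout in the demo is hypothetical, and hidden directories and __pycache__ are skipped on the assumption that they never hold importable code.

```python
import os
import tempfile
from pathlib import Path

SKIPPED_DIRS = {"__pycache__"}


def add_missing_init_files(root):
    """Create an empty __init__.py in root and in every sub-directory
    that lacks one; return the list of files that were created."""
    created = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune hidden and cache directories in place so os.walk skips them.
        dirnames[:] = sorted(
            d for d in dirnames
            if d not in SKIPPED_DIRS and not d.startswith(".")
        )
        if "__init__.py" not in filenames:
            init_file = Path(dirpath) / "__init__.py"
            init_file.touch()
            created.append(init_file)
    return created


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        (src / "utils" / "text").mkdir(parents=True)
        (src / "utils" / "__init__.py").touch()  # already a regular package
        created = add_missing_init_files(src)
        return [p.relative_to(src).as_posix() for p in created]


if __name__ == "__main__":
    print(demo())
```

Running it once against the dependency folder (for example the src folder above) before bundling should leave every level importable as a regular package; existing __init__.py files are not touched.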