opendatahub-io / ai-edge

ODH integration with AI at the Edge use cases
Apache License 2.0

The bike-rentals-auto-ml image has 1.6 GiB #95

Open adelton opened 10 months ago

adelton commented 10 months ago

The tensorflow-housing image is 120.6 MiB (per https://quay.io/repository/rhoai-edge/tensorflow-housing?tab=tags&tag=latest, and reproduced in my setup). In contrast, the bike-rentals-auto-ml image is reported as 1.6 GiB at https://quay.io/repository/rhoai-edge/bike-rentals-auto-ml?tab=tags, and I see the same size in my setup; see https://github.com/opendatahub-io/ai-edge/issues/93#issuecomment-1730827828 and https://github.com/opendatahub-io/ai-edge/issues/93#issuecomment-1730835827.

Assuming both just ship a runtime without bundling the AI model, there seems to be some serious bloat in bike-rentals-auto-ml.

The pipelines/README.md should likely explain why those container images differ by an order of magnitude in size, and what is special about the second one that warrants the extra bits.
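For scale, the gap between the two reported sizes can be worked out with a quick calculation. This is only a sketch using the figures quoted in this issue; the variable names are illustrative.

```python
# Sizes as reported on quay.io and quoted in this issue.
MIB_PER_GIB = 1024

tensorflow_housing_mib = 120.6
bike_rentals_mib = 1.6 * MIB_PER_GIB  # 1.6 GiB expressed in MiB

# How many times larger the bike-rentals-auto-ml image is.
ratio = bike_rentals_mib / tensorflow_housing_mib
print(f"bike-rentals-auto-ml is about {ratio:.1f}x the size of tensorflow-housing")
```

With these figures the ratio comes out at roughly 13.6, which matches the "order of magnitude" description.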

piotrpdev commented 10 months ago

> Assuming both just ship a runtime without bundling the AI model, there seems to be some serious bloat in bike-rentals-auto-ml.

Can you elaborate? Both include the serving runtime and model.

The serving runtime image for bike-rentals, seldonio/mlserver:1.3.5-slim, is almost 900 MB compressed on its own. It is large, but it supports a wide range of models: https://github.com/SeldonIO/MLServer#inference-runtimes

https://github.com/opendatahub-io/ai-edge/blob/d2972aba80644e750a28595524e20f86943f3912/pipelines/containerfiles/Containerfile.seldonio.mlserver.mlflow#L28-L30

The bike-rentals image also contains all of the dependencies for the model:

https://github.com/opendatahub-io/ai-edge/blob/d2972aba80644e750a28595524e20f86943f3912/pipelines/containerfiles/Containerfile.seldonio.mlserver.mlflow#L16-L30

See #27 for seldonio/mlserver vs openvino/mlserver.

adelton commented 9 months ago

Our discussions typically revolved around the model being huge. In this case though, the model is small (54 MB) and the runtime is huge.
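As a rough illustration of that split, the numbers mentioned in this thread can be put side by side. This is a back-of-the-envelope sketch, not a measurement: it assumes "MB" means decimal megabytes, uses 900 MB as an upper bound for the "almost 900MB" runtime, and ignores the fact that some figures may be compressed sizes and others not.

```python
# Figures quoted in this thread; units and compression are assumptions (see above).
BYTES_PER_MB = 10**6
BYTES_PER_GIB = 2**30

image_bytes = 1.6 * BYTES_PER_GIB    # reported bike-rentals-auto-ml image size
model_bytes = 54 * BYTES_PER_MB      # model size
runtime_bytes = 900 * BYTES_PER_MB   # upper bound for the serving runtime image

# Fraction of the reported image size attributable to each part.
model_share = model_bytes / image_bytes
runtime_share = runtime_bytes / image_bytes

print(f"model:   {model_share:.1%} of the image")
print(f"runtime: {runtime_share:.1%} of the image")
```

Under these assumptions the model accounts for only a few percent of the image, while the runtime alone accounts for about half.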

I wonder whether we could demonstrate the behaviour with a less generic runtime whose size is more palatable.

piotrpdev commented 9 months ago

seldonio/mlserver and openvino/mlserver are popular and cover most of the model frameworks and types/flavors. If we're looking for the best compatibility, using these along with MLflow is probably the best option.

We can take the KServe approach and write our own runtimes if we want many smaller runtimes instead of a couple of large ones ¯\_(ツ)_/¯.
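To give a feel for what a small, single-purpose runtime might involve, here is a minimal sketch of just the request-handling core. It uses only the Python standard library and does not use any KServe or MLServer API. The model (fixed linear weights), the names `MODEL_NAME`, `predict_row`, and `infer`, and the output tensor name are all hypothetical. The request and response dictionaries are shaped like the JSON bodies of the Open Inference Protocol (V2), assuming a single input tensor with flat, row-major data. No HTTP server is included.

```python
from typing import Any

# Hypothetical stand-in for a trained model: fixed linear weights and a bias.
WEIGHTS = [0.5, -1.0, 2.0]
BIAS = 0.25
MODEL_NAME = "tiny-linear"  # illustrative name, not a model from this repository


def predict_row(row: list[float]) -> float:
    """Score one feature row with the hypothetical linear model."""
    return sum(w * x for w, x in zip(WEIGHTS, row)) + BIAS


def infer(request: dict[str, Any]) -> dict[str, Any]:
    """Handle one V2-style inference request and return a V2-style response."""
    tensor = request["inputs"][0]
    rows, cols = tensor["shape"]
    if cols != len(WEIGHTS):
        raise ValueError(f"expected {len(WEIGHTS)} features per row, got {cols}")

    # Assumption: data arrives flattened in row-major order.
    data = tensor["data"]
    batch = [data[i * cols:(i + 1) * cols] for i in range(rows)]
    predictions = [predict_row(row) for row in batch]

    return {
        "model_name": MODEL_NAME,
        "outputs": [
            {
                "name": "predict",
                "shape": [rows, 1],
                "datatype": "FP64",
                "data": predictions,
            }
        ],
    }


example_request = {
    "inputs": [
        {
            "name": "features",
            "shape": [2, 3],
            "datatype": "FP64",
            "data": [1.0, 2.0, 3.0, 0.0, 0.0, 0.0],
        }
    ]
}
example_response = infer(example_request)
print(example_response["outputs"][0]["data"])
```

A runtime like this would be tied to one model type, which is the trade-off against the broad compatibility of the larger generic runtimes.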