nod-ai / SHARK-Studio

SHARK Studio -- Web UI for SHARK+IREE High Performance Machine Learning Distribution
Apache License 2.0

Make batch size configurable #636

Open mariecwhite opened 1 year ago

mariecwhite commented 1 year ago

The SHARK tank hardcodes the batch size to 1. It would be great if this were configurable, since many server workloads are batched.
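A minimal sketch of what a configurable batch size could look like as a command-line flag. The flag name and the example NHWC input shape are hypothetical illustrations, not SHARK's actual CLI:

```python
import argparse

def build_input_shape(argv):
    # Hypothetical flag; the SHARK tank currently hardcodes batch size 1.
    parser = argparse.ArgumentParser()
    parser.add_argument("--batch_size", type=int, default=1)
    args = parser.parse_args(argv)
    # Example input shape for a 224x224 RGB image model (NHWC layout).
    return (args.batch_size, 224, 224, 3)
```

The default keeps the current behavior (batch size 1), so existing invocations would be unaffected.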

RIPBon commented 1 year ago

Yes, that would be very helpful for this work.

sogartar commented 1 year ago

Some models have a dynamic input (see https://github.com/nod-ai/SHARK/blob/main/tank/model_metadata.csv). You can download the dynamic version of a model via https://github.com/nod-ai/SHARK/blob/a14a47af121b07b4882231f5907d34ca986c58e0/shark/shark_downloader.py#L129. By convention, the first dimension of the input (index 0) is the batch size, and it is dynamic. This may not give the best performance, though: ideally you would set the batch size at compile time to enable better optimization.
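When a model is compiled for one fixed batch size, a common server-side workaround is to pad an incoming batch up to the compiled size and slice the outputs back down afterwards. A small NumPy sketch of that idea (assumed shapes and helper name, not SHARK code):

```python
import numpy as np

def pad_to_batch(x, batch_size):
    """Zero-pad dim 0 (the batch dimension, by convention) up to batch_size.

    Returns the padded array and the count of real rows, so the caller can
    slice the model outputs back to the original batch after inference.
    """
    n = x.shape[0]
    if n > batch_size:
        raise ValueError(f"batch {n} exceeds compiled batch size {batch_size}")
    pad = np.zeros((batch_size - n,) + x.shape[1:], dtype=x.dtype)
    return np.concatenate([x, pad], axis=0), n
```

This trades wasted compute on the padding rows for not having to recompile per batch size, which is why compiling with the real batch size is still preferable when it is known ahead of time.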

mariecwhite commented 1 year ago

For models that only work with static shapes (TensorFlow), I've created a fork with the necessary changes and instructions for regenerating the model artifacts: https://gist.github.com/mariecwhite/7127c73415d5a61f0927781ad3a2e572

powderluv commented 1 year ago

@monorimet @dan-garvey FYI. Thank you @mariecwhite