deepjavalibrary/djl-serving
A universal scalable machine learning model deployment solution
Apache License 2.0
[python] refactor rolling batch inference method
#2090
Closed
sindhuvahinis closed 2 weeks ago
sindhuvahinis commented 2 weeks ago
Description
Our handlers call rolling_batch.inference and construct the output, but they all share the same code, so there is some duplication.
The duplicated code has been extracted to utils.py.
In huggingface.py, the inference method was too long, so it has been split into smaller methods.
Further refactoring is planned for next week to clean up these method signatures.
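To illustrate the kind of extraction described here, the following is a minimal sketch only. It does not reproduce the actual djl-serving code: `FakeRollingBatch`, `run_rolling_batch`, `handler_a`, `handler_b`, and the result keys `data` and `last` are all hypothetical names chosen for the example.

```python
from typing import Dict, List


class FakeRollingBatch:
    """Hypothetical stand-in for a rolling batch object (not the real class)."""

    def inference(self, prompts: List[str], parameters: List[dict]) -> List[dict]:
        # Pretend every request produces one chunk of text and then finishes.
        return [{"data": p.upper(), "last": True} for p in prompts]


def run_rolling_batch(rolling_batch, prompts: List[str],
                      parameters: List[dict]) -> Dict[int, dict]:
    """Hypothetical shared helper, the sort of code that could live in utils.py.

    Calls inference once and builds the per-request output in one place,
    so individual handlers no longer repeat this logic.
    """
    results = rolling_batch.inference(prompts, parameters)
    return {i: {"data": r["data"], "last": r["last"]}
            for i, r in enumerate(results)}


# Two hypothetical handlers that would otherwise each duplicate the code above.
def handler_a(prompts: List[str]) -> Dict[int, dict]:
    return run_rolling_batch(FakeRollingBatch(), prompts, [{} for _ in prompts])


def handler_b(prompts: List[str]) -> Dict[int, dict]:
    params = [{"max_new_tokens": 8} for _ in prompts]
    return run_rolling_batch(FakeRollingBatch(), prompts, params)
```

With this shape, a change to how the output is constructed only has to be made in the shared helper rather than in every handler.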