deepjavalibrary / djl-serving

A universal scalable machine learning model deployment solution
Apache License 2.0

[fix] align default neuron behavior between model server and handler #2116

Closed tosterberg closed 1 week ago

tosterberg commented 1 week ago

Description

Updates LmiConfigRecommender so that its defaults for NeuronX containers match the default properties applied when using a serving.properties file. The recommender currently defaults all models to rolling batch, which is not the desired behavior for LMI NeuronX because only a limited number of models support rolling batch.

This undoes a change that altered the default as a side effect of the recommender's implementation. That change left the defaults inconsistent between launching with environment variables and launching with serving.properties for LMI NeuronX.
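
To illustrate the intended alignment, here is a minimal Python sketch. It is not the actual LmiConfigRecommender code: the function name `recommend_rolling_batch`, the `is_neuronx` flag, and the fallback values `"disable"` and `"auto"` are assumptions made for illustration only. The idea it shows is that an explicit `option.rolling_batch` setting always wins, and that NeuronX gets the same non-rolling-batch fallback regardless of whether the configuration came from environment variables or from serving.properties.

```python
from typing import Mapping


def recommend_rolling_batch(properties: Mapping[str, str], is_neuronx: bool) -> str:
    """Hypothetical helper: choose a rolling batch value for a model.

    `properties` holds the already-merged configuration, whether it came
    from environment variables or from a serving.properties file.
    """
    explicit = properties.get("option.rolling_batch")
    if explicit is not None:
        # A user-supplied value is never overridden by the recommender.
        return explicit
    # Assumed fallbacks: NeuronX does not default to rolling batch because
    # only a limited set of models supports it there.
    return "disable" if is_neuronx else "auto"


if __name__ == "__main__":
    # Same result for NeuronX no matter how the (empty) config was supplied.
    print(recommend_rolling_batch({}, is_neuronx=True))
    print(recommend_rolling_batch({"option.rolling_batch": "auto"}, is_neuronx=True))
```

Keeping the decision in one place, applied after both configuration sources are merged, is one way to avoid the two launch paths drifting apart again.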