Open adriangonz opened 1 year ago
Heterogeneous workers would also be beneficial for the recently added support for online drift detectors (https://github.com/SeldonIO/MLServer/pull/1108), since these detectors must currently be run with parallel_workers = 0.
Hello, is there any estimate of when this issue will be addressed? Is there any intention to resolve it for version 1.4? Thanks.
It is unlikely that this will be addressed in the next release as it stands. Do you have a particular use case that requires it that you could share?
Hello, the main problem with the current parallel workers is the memory consumption when loading all models. If we could redistribute the models across the workers, this could be improved.
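To make the memory argument concrete, here is a rough sketch of the difference. It is not MLServer code: the function names, model names, sizes and replica counts are all hypothetical, and it assumes that in the current setup each worker holds a copy of every model. The placement is a simple greedy heuristic (largest model first, onto the least-loaded workers), chosen only for illustration.

```python
from typing import Dict, List


def homogeneous_memory(model_sizes_mb: Dict[str, int], n_workers: int) -> int:
    """Total memory if every worker loads every model (assumed current behaviour)."""
    return n_workers * sum(model_sizes_mb.values())


def place_replicas(
    model_sizes_mb: Dict[str, int],
    replicas: Dict[str, int],
    n_workers: int,
) -> List[List[str]]:
    """Greedily assign each model's replicas to the least-loaded workers.

    Hypothetical helper: models are placed largest first, and each replica
    of a model goes to a different worker.
    """
    workers: List[List[str]] = [[] for _ in range(n_workers)]
    load = [0] * n_workers  # memory already assigned to each worker, in MB

    for name in sorted(model_sizes_mb, key=model_sizes_mb.get, reverse=True):
        count = replicas.get(name, 1)
        if not 1 <= count <= n_workers:
            raise ValueError(f"replicas for {name!r} must be in 1..{n_workers}")
        # Pick the `count` workers that currently hold the least memory.
        targets = sorted(range(n_workers), key=lambda i: load[i])[:count]
        for i in targets:
            workers[i].append(name)
            load[i] += model_sizes_mb[name]
    return workers


def total_memory(workers: List[List[str]], model_sizes_mb: Dict[str, int]) -> int:
    """Total memory used by a given placement."""
    return sum(model_sizes_mb[m] for w in workers for m in w)


# Hypothetical models and replica counts, for illustration only.
sizes = {"large": 4000, "medium": 1000, "small": 200}
wanted = {"large": 1, "medium": 2, "small": 4}

placement = place_replicas(sizes, wanted, n_workers=4)
print(homogeneous_memory(sizes, 4))      # 20800
print(total_memory(placement, sizes))    # 6800
```

With these made-up numbers, four identical workers would need 20800 MB in total, while a heterogeneous pool with one, two and four replicas would need 6800 MB.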
Support a Heterogeneous pool of workers with variable number of model replicas per worker