If there are multiple models in the model-repository, how does FT launch the model instances? For example, with 4 GPUs in total, if I launch the BERT and GPT models with 1 instance each, will both instances be placed on the first GPU? Can I control the instance placement policy?