Open gvijqb opened 1 year ago
@gvijqb I believe you are seeing this problem when changing batch size because Stable Diffusion uses CUDA graphs in the MII deployment (see https://github.com/microsoft/DeepSpeed-MII/blob/4040daecc4185e591d25cffc07b9cfd6de9f4fb7/mii/models/load_models.py#L75). DeepSpeed-inference captures the graph only once per model, using the batch size of the first request. Replaying that graph with a different batch size produces the shape-mismatch error you are seeing. We are working on a solution that captures a separate graph for each batch size.
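To illustrate the idea of keeping one captured graph per batch size, here is a minimal pure-Python sketch. It is not DeepSpeed's implementation: `ShapeKeyedGraphCache` and `fake_capture` are hypothetical names, and `fake_capture` only stands in for a real CUDA graph capture by returning a callable that rejects any batch size other than the one it was "captured" with.

```python
from typing import Callable, Dict, List, Tuple

Shape = Tuple[int, ...]
Replay = Callable[[List[str]], List[str]]


class ShapeKeyedGraphCache:
    """Hypothetical helper: keeps one captured callable per input shape."""

    def __init__(self, capture: Callable[[Shape], Replay]) -> None:
        self._capture = capture
        self._graphs: Dict[Shape, Replay] = {}
        self.capture_count = 0

    def run(self, shape: Shape, batch: List[str]) -> List[str]:
        graph = self._graphs.get(shape)
        if graph is None:
            # First time this shape is seen: capture once and remember it.
            graph = self._capture(shape)
            self._graphs[shape] = graph
            self.capture_count += 1
        # Later calls with the same shape reuse the stored capture.
        return graph(batch)


def fake_capture(shape: Shape) -> Replay:
    """Stand-in for a graph capture; fixed to the batch size in shape[0]."""
    expected = shape[0]

    def replay(batch: List[str]) -> List[str]:
        if len(batch) != expected:
            raise ValueError(
                f"captured for batch size {expected}, got {len(batch)}"
            )
        return [f"image for {prompt}" for prompt in batch]

    return replay


cache = ShapeKeyedGraphCache(fake_capture)
cache.run((1, 77), ["a cat"])
cache.run((4, 77), ["a", "b", "c", "d"])
cache.run((1, 77), ["a dog"])  # reuses the batch-size-1 capture
```

With a single shared capture, the second call would fail the way the issue describes. Keying by shape trades extra memory (one capture per distinct shape) for the ability to vary the batch size.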
Understood, thanks @mrwyattii. Any idea when I can expect the solution to be live?
@cmikeh2 started work on this some time ago: https://github.com/microsoft/DeepSpeed/pull/2458
I'll work on this and try to get it merged sometime next week!
Hi team,
I am having trouble changing the batch size in queries to an MII deployment.
For example, if I send batch size 1 I get an image, but if I change the batch size to 4 or any other number I get the following exception:
Exception calling application: output with shape [1, 77] doesn't match the broadcast shape [4, 77]
I get a similar exception when using DeepSpeed directly with a Stable Diffusion model and changing parameter values such as width, height, and batch size. In direct DeepSpeed usage, changing these parameters gives:
The size of tensor a (4) must match the size of tensor b (2) at non-singleton dimension 0
Without being able to change these parameters on the fly, I cannot deploy this, since in practice I need to use a variety of values for each of them.
I am looking for a solution to this.
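Until variable batch sizes are supported, one possible stopgap is to always call the deployment with the batch size the graph was captured at and to split larger requests into chunks of that size. The sketch below assumes this approach. `run_in_fixed_batches` and `run_fixed_batch` are hypothetical names, where `run_fixed_batch` stands for whatever function sends one fixed-size batch to the model. This only addresses batch size, not width or height.

```python
from typing import Callable, List, Sequence


def run_in_fixed_batches(
    prompts: Sequence[str],
    captured_batch_size: int,
    run_fixed_batch: Callable[[List[str]], List[str]],
) -> List[str]:
    """Send prompts in chunks that always match the captured batch size."""
    if captured_batch_size < 1:
        raise ValueError("captured_batch_size must be at least 1")

    results: List[str] = []
    for start in range(0, len(prompts), captured_batch_size):
        chunk = list(prompts[start:start + captured_batch_size])
        real = len(chunk)
        # Pad a short final chunk by repeating its last prompt so the
        # batch size never changes between calls.
        chunk += [chunk[-1]] * (captured_batch_size - real)
        outputs = run_fixed_batch(chunk)
        # Drop the outputs that correspond to padding.
        results.extend(outputs[:real])
    return results


# Usage with a stand-in for the real query function (hypothetical).
def echo_batch(chunk: List[str]) -> List[str]:
    return [f"image for {prompt}" for prompt in chunk]


images = run_in_fixed_batches(["a", "b", "c"], 2, echo_batch)
```

Padding wastes some compute on the final chunk, but it keeps every call at the shape the graph expects.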