lucidrains / DALLE2-pytorch

Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch
MIT License

Fix potential dtype conversion bug #254

Closed. JiaHeng-DLUT closed this 2 years ago

JiaHeng-DLUT commented 2 years ago

The return value is converted to 0 when x has an integer dtype.
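To illustrate the failure mode, here is a minimal sketch. It assumes the embedding computes sine and cosine values and then casts them back to the input's dtype; the function name `cast_back_demo` and the 0.5 frequency are made up for illustration and are not taken from the repository.

```python
import torch

def cast_back_demo(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in: compute sin/cos features, then cast to x.dtype."""
    angles = x.float() * 0.5                      # arbitrary frequency, for illustration only
    emb = torch.cat((angles.sin(), angles.cos()))
    # Casting float values in (-1, 1) to an integer dtype truncates them toward 0.
    return emb.to(x.dtype)

int_steps = torch.tensor([1, 2, 3])               # int64 timesteps
float_steps = torch.tensor([1.0, 2.0, 3.0])       # float32 timesteps

int_result = cast_back_demo(int_steps)            # every entry is truncated to 0
float_result = cast_back_demo(float_steps)        # sin/cos values are preserved
```

With integer input the embedding carries no information at all, which is why the cast back to the input dtype is the problem rather than the integer input itself.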

lucidrains commented 2 years ago

@JiaHeng-DLUT ohh, is there a place where it would be an int type? i think it is type converted to be float here https://github.com/lucidrains/DALLE2-pytorch/blob/main/dalle2_pytorch/dalle2_pytorch.py#L2155

JiaHeng-DLUT commented 2 years ago

Yes. Here is the traceback:
https://github.com/lucidrains/DALLE2-pytorch/blob/HEAD/dalle2_pytorch/dalle2_pytorch.py#L1098
https://github.com/lucidrains/DALLE2-pytorch/blob/HEAD/dalle2_pytorch/dalle2_pytorch.py#L1355
https://github.com/lucidrains/DALLE2-pytorch/blob/HEAD/dalle2_pytorch/dalle2_pytorch.py#L1463
https://github.com/lucidrains/DALLE2-pytorch/blob/HEAD/dalle2_pytorch/dalle2_pytorch.py#L1455

There is a conflict at https://github.com/lucidrains/DALLE2-pytorch/blob/HEAD/dalle2_pytorch/dalle2_pytorch.py#L972: nn.Embedding requires integer input (int or long), while SinusoidalPosEmb requires float input. I recommend removing the explicit dtype conversion so that integer input is supported. Float input still works without it, because the type is converted implicitly.
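A rough sketch of the conflict and of the suggested direction follows. The class `SinusoidalPosEmbSketch` is a hypothetical stand-in written for this example, not the repository's SinusoidalPosEmb, and the sizes (1000 timesteps, dimension 8) are arbitrary. It relies on PyTorch type promotion: multiplying an integer tensor by a float32 tensor yields float32, so no explicit cast is needed.

```python
import math
import torch
from torch import nn

class SinusoidalPosEmbSketch(nn.Module):
    """Illustrative stand-in that accepts both integer and float timesteps."""

    def __init__(self, dim: int):
        super().__init__()
        self.dim = dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        half_dim = self.dim // 2
        scale = math.log(10000) / (half_dim - 1)
        # Frequencies are always float32, independent of x.dtype.
        freqs = torch.exp(
            torch.arange(half_dim, device=x.device, dtype=torch.float32) * -scale
        )
        # int * float32 promotes to float32, so the output is never truncated.
        angles = x[:, None] * freqs[None, :]
        return torch.cat((angles.sin(), angles.cos()), dim=-1)

timesteps = torch.tensor([0, 5, 10])               # int64, valid nn.Embedding indices

learned = nn.Embedding(1000, 8)(timesteps)         # learned table lookup needs int/long
sinusoidal = SinusoidalPosEmbSketch(8)(timesteps)  # same integer tensor, float output

# Float indices are rejected by nn.Embedding.
try:
    nn.Embedding(1000, 8)(timesteps.float())
    float_lookup_failed = False
except Exception:
    float_lookup_failed = True
```

Under these assumptions a single integer timestep tensor can feed either embedding type, which removes the need to pick one dtype up front.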

lucidrains commented 2 years ago

@JiaHeng-DLUT yes you are right! :pray: this was a big bug actually, and should be resolved in https://github.com/lucidrains/DALLE2-pytorch/commit/41fabf29220d0469af9c2681068e2ef99caa0085 thank you for uncovering this!