Open Atry opened 5 months ago
Note that this bug is about DeepSpeed HE (Hybrid Engine), not DeepSpeed Chat. I filed it under the deepspeed-chat label because there is no deepspeed-he label.
Also note that this bug is only visible once #5398 is fixed, so I applied #5624 as a monkey patch to reproduce it.
Describe the bug
I got the error
RuntimeError: The expanded size of the tensor (2048) must match the existing size (1179648) at non-singleton dimension 1. Target sizes: [2048, 2048]. Tensor sizes: [1179648]
when trying to run deepspeed_hybrid_engine.generate with a DeepSpeedHybridEngine that was initialized with 4-bit quantization.
Log output
See https://gist.github.com/Atry/4ebf4e6208a2a3628f65c85a40f9c49d
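The shapes quoted in the error message can be sanity-checked without DeepSpeed. The sketch below is plain Python and only mirrors the usual expand/broadcast rule (trailing dimensions must be equal, or the source dimension must be 1). The helper name `can_expand` is hypothetical and is not part of DeepSpeed or PyTorch. It shows that a flat tensor of 1179648 elements cannot be expanded to [2048, 2048], and that the element counts differ. It does not establish what the flat tensor actually contains.

```python
from math import prod


def can_expand(src_shape, target_shape):
    """Hypothetical helper: True if src_shape can be expanded to target_shape.

    Mirrors the usual broadcasting rule: shapes are aligned on their trailing
    dimensions, and each source dimension must equal the target dimension or be 1.
    """
    if len(src_shape) > len(target_shape):
        return False
    # Missing leading dimensions of the source are treated as size 1.
    padded = (1,) * (len(target_shape) - len(src_shape)) + tuple(src_shape)
    return all(s == t or s == 1 for s, t in zip(padded, target_shape))


target = (2048, 2048)   # "Target sizes" from the error message
source = (1179648,)     # "Tensor sizes" from the error message

print(can_expand(source, target))   # False: 1179648 != 2048 and 1179648 != 1
print(prod(target))                 # 4194304 elements expected by the target
print(prod(source))                 # 1179648 elements in the flat tensor
print(divmod(prod(source), 2048))   # (576, 0): the flat tensor is 576 rows of 2048
```

The flat tensor holds fewer elements than the [2048, 2048] target needs, so the mismatch is not just a missing reshape: the two buffers differ in size.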
To Reproduce
Steps to reproduce the behavior: Run the following Python script:
Expected behavior
No error
ds_report output
Screenshots
Not applicable
System info (please complete the following information):
Docker context
Not using Docker
Additional context