astachowiczhabana opened 13 hours ago
Use torch.matmul instead of torch.baddbmm in GPTBigCodeAttention._attn on devices other than CPU. This allows significantly larger batch sizes during text generation with BigCode-related models.
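A minimal sketch of the kind of dispatch this change introduces, assuming the simplified shapes and function name below (the real change lives inside `GPTBigCodeAttention._attn` in transformers' `gpt_bigcode` modeling code; `scaled_attn_scores` and `device_type` are illustrative, not the actual signature):

```python
import torch

def scaled_attn_scores(query, key_t, scale, device_type):
    # query: (batch, q_len, head_dim); key_t: (batch, head_dim, k_len)
    if device_type == "cpu":
        # baddbmm fuses the scaling factor into the batched matmul,
        # but requires allocating an (unused, beta=0) input tensor
        empty = torch.empty(
            query.shape[0], query.shape[1], key_t.shape[-1],
            dtype=query.dtype, device=query.device,
        )
        return torch.baddbmm(empty, query, key_t, beta=0, alpha=scale)
    # On accelerators, a plain matmul skips that extra allocation,
    # which is what permits the larger generation batch sizes
    return torch.matmul(query, key_t) * scale
```

Both branches compute the same `scale * (query @ key_t)`; only the memory behavior differs, so the swap is numerically safe.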
Hi @libinta, this commit is also required for the next OH release.