Open cornzz opened 3 months ago
Fixes https://github.com/mistralai/mistral-inference/issues/215
The attention bias was always created on `cuda:0`, regardless of the selected CUDA device, because the correct device was not passed to `from_seqlens()` in `BufferCache.get_input_metadata()`.
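To illustrate the bug pattern without needing a GPU, here is a minimal, dependency-free sketch. Everything in it is a hypothetical stand-in rather than the actual mistral-inference or xformers code: `AttnBias`, the simplified `from_seqlens()` signature, the `DEFAULT_DEVICE` fallback, and the two `get_input_metadata_*` methods exist only to show what happens when the cache's device is not forwarded.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: the device the bias falls back to when none is given.
DEFAULT_DEVICE = "cuda:0"


@dataclass
class AttnBias:
    """Hypothetical stand-in for the attention bias object."""
    seqlens: List[int]
    device: str


def from_seqlens(seqlens: List[int], device: Optional[str] = None) -> AttnBias:
    """Stand-in for from_seqlens(): uses the fallback unless a device is passed."""
    return AttnBias(seqlens=list(seqlens), device=device or DEFAULT_DEVICE)


@dataclass
class BufferCache:
    """Hypothetical, heavily simplified cache that only remembers its device."""
    device: str

    def get_input_metadata_buggy(self, seqlens: List[int]) -> AttnBias:
        # Bug pattern: the cache's own device is never forwarded.
        return from_seqlens(seqlens)

    def get_input_metadata_fixed(self, seqlens: List[int]) -> AttnBias:
        # Fix pattern: forward the cache's device explicitly.
        return from_seqlens(seqlens, device=self.device)


cache = BufferCache(device="cuda:1")
buggy = cache.get_input_metadata_buggy([3, 5])
fixed = cache.get_input_metadata_fixed([3, 5])
print(buggy.device, fixed.device)
```

In this sketch the buggy path ends up on the fallback device even though the cache was created for `cuda:1`, while the fixed path follows the cache's device. That mismatch is what this PR addresses.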