tenstorrent / tt-tvm

TVM for Tenstorrent ASICs
Apache License 2.0

Add support for diagonal parameter in tril function #17

Closed by kamalrajkannan78 1 month ago

kamalrajkannan78 commented 1 month ago
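# Sliding-window masking: torch.tril is called with a non-default `diagonal`.
if self.sliding_window is not None:
    # Offset of the diagonal on or below which positions are selected for masking.
    diagonal = past_key_values_length - self.sliding_window - 1
    context_mask = torch.tril(torch.ones_like(mask, dtype=torch.bool), diagonal=diagonal)
    # Fill the selected positions with the most negative value of `dtype`.
    mask.masked_fill_(context_mask, torch.finfo(dtype).min)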
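
For reference, the `diagonal` semantics that the snippet above relies on can be sketched in plain Python. This is only an illustration of how `torch.tril` treats the offset (element `(i, j)` is kept when `j <= i + diagonal`), not the tt-tvm implementation; the helper names `tril_reference` and `sliding_window_context_mask` are hypothetical.

```python
def tril_reference(matrix, diagonal=0):
    """Plain-Python sketch of tril with an offset (hypothetical helper).

    Keeps element (i, j) when j <= i + diagonal and zeroes everything else.
    diagonal=0 is the main diagonal, negative values shift it downward,
    positive values shift it upward.
    """
    return [
        [value if j <= i + diagonal else 0 for j, value in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


def sliding_window_context_mask(size, past_key_values_length, sliding_window):
    """Boolean mask built the same way as in the snippet (hypothetical helper).

    True marks the positions that the snippet fills with the minimum value.
    """
    diagonal = past_key_values_length - sliding_window - 1
    ones = [[1] * size for _ in range(size)]
    return [[bool(v) for v in row] for row in tril_reference(ones, diagonal)]


# Small example: a 4x4 matrix of ones with the diagonal shifted down by one.
ones_4x4 = [[1] * 4 for _ in range(4)]
shifted = tril_reference(ones_4x4, diagonal=-1)
```

With `diagonal=-1` the main diagonal itself is excluded, so the first row of `shifted` is all zeros. Any lowering of `tril` that ignores the offset would instead produce the `diagonal=0` result and mask the wrong positions.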