Closed houghtonweihu closed 7 months ago
In your file: https://github.com/johnma2006/mamba-minimal/blob/master/model.py, you have:
A = repeat(torch.arange(1, args.d_state + 1), 'n -> d n', d=args.d_inner)
self.A_log = nn.Parameter(torch.log(A))
A = -torch.exp(self.A_log.float())  # shape (d_in, n)
Does matrix A need to be diagonal?
Thanks!
I got an answer from Dr. Dao: d_in is actually the batch dim.
What is the batch dim?
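One way to read that reply, as a rough sketch and not the repo's actual code: if d_in is treated as a batch dimension, each of the d_in rows of A can be seen as the diagonal of its own n x n state matrix, so the (d_in, n) tensor stands for d_in independent diagonal matrices. The example below uses plain Python lists in place of tensors. The sizes, the helpers `diag` and `matvec`, and the state vector `h` are assumptions made up for illustration.

```python
import math

d_inner, d_state = 3, 4  # small hypothetical sizes, for illustration only

# Mirrors the initialisation quoted in the issue, with nested lists standing
# in for tensors: every row of A_log is [log 1, log 2, ..., log d_state].
A_log = [[math.log(k) for k in range(1, d_state + 1)] for _ in range(d_inner)]
A = [[-math.exp(v) for v in row] for row in A_log]  # "shape" (d_inner, d_state)


def diag(values):
    """Build a full square matrix with `values` on its diagonal."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def matvec(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


h = [1.0, 2.0, 3.0, 4.0]  # hypothetical state vector for a single channel

for d in range(d_inner):
    # Treat row d of A as the diagonal of an explicit n x n matrix ...
    full = matvec(diag(A[d]), h)
    # ... and compare with the elementwise product used when only the
    # diagonal is stored.
    elementwise = [a * x for a, x in zip(A[d], h)]
    assert all(math.isclose(p, q) for p, q in zip(full, elementwise))
```

Under this reading, storing only the n diagonal entries per channel gives the same result as multiplying by a full diagonal matrix, and the leading d_in axis just indexes independent channels.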