Closed oshindow closed 4 years ago
Is the input to DecoderLayer a batch of sequences or a single sequence? The code in pytorch/mem_transformer.py uses `qlen = h.size(0)`, but if the input is a batch, wouldn't h.size(0) be the batch size rather than qlen?
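To make the question concrete, here is a minimal sketch of how the meaning of `h.size(0)` depends on the tensor layout. The sizes and the variable names `h_seq_first` and `h_batch_first` are hypothetical, and the sequence-first layout `(qlen, bsz, d_model)` is an assumption for illustration, not something confirmed in this post.

```python
import torch

# Hypothetical sizes, chosen only for illustration.
qlen, bsz, d_model = 5, 3, 8

# Assumed sequence-first layout: (qlen, bsz, d_model).
h_seq_first = torch.zeros(qlen, bsz, d_model)

# The same data viewed batch-first: (bsz, qlen, d_model).
h_batch_first = h_seq_first.transpose(0, 1)

# With a sequence-first batch, size(0) is the query length.
print(h_seq_first.size(0))    # 5

# With a batch-first batch, size(0) is the batch size instead.
print(h_batch_first.size(0))  # 3
```

If the layer receives a sequence-first batch, `h.size(0)` would still be qlen even though the input is batched; with a batch-first tensor it would not.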