Closed · cjw414 closed this 3 years ago
This is a bug caused by a PyTorch compatibility issue.
You can quickly fix it with causal_mask = causal_mask.byte().
I'll fix it later.
@hirofumi0810 Thanks for the quick reply!
By the way, I think there was a slight mistake in the comment: you suggested
causal_mask = causal_mask.byte()
but causal_mask was already a uint8 tensor, as follows:
(Pdb) causal_mask
tensor([[[1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0],
...
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1]]], device='cuda:0', dtype=torch.uint8)
So I temporarily fixed it with causal_mask = causal_mask.bool() instead!
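For anyone who hits the same error, here is a minimal standalone sketch (not the repo's actual code; the shapes and variable names are made up) of why the dtype matters: recent PyTorch versions expect boolean masks, so calling .byte() on a tensor that is already uint8 changes nothing, while .bool() gives the dtype PyTorch wants.

```python
import torch

# Minimal sketch (assumed shapes, not neural_sp's actual code): build a
# lower-triangular causal mask as uint8, like the one shown in the Pdb dump.
seq_len = 30
causal_mask = torch.tril(torch.ones(1, seq_len, seq_len, dtype=torch.uint8))

# .byte() would be a no-op here (the tensor is already uint8);
# casting to bool is what newer PyTorch expects for masking.
causal_mask = causal_mask.bool()

# Typical use: mask out future positions in attention scores.
scores = torch.randn(1, seq_len, seq_len)
scores = scores.masked_fill(causal_mask == 0, float('-inf'))
```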
@Jungwon-Chang Thank you! I'll fix it soon.
Fixed by #232.
Hello hiro. First of all, thank you for your awesome project! It's helping me a lot!
Anyway, I was trying to train an MMA decoder for LibriSpeech with the asr_conf
However, just as the model started to train, an error occurred. The error log was as follows:
There was no problem when training with MoCha attention, but this error occurred for streaming MMA. As I mentioned in the previous issue #201, my environment settings are as follows
Is this an environment issue?
Also, do you have any recommendations among the streaming MMA configurations for the sake of training speed (fastest to train)?