I am reading your code and noticed that batch_speaker_tokens is not moved to CUDA. (The relevant code is below:)
batch_input_tokens, batch_labels, batch_speaker_tokens = data
batch_input_tokens, batch_labels = batch_input_tokens.cuda(), batch_labels.cuda()
However, I found that batch_speaker_tokens is a list containing only one Tensor. This confuses me: should we remove the list and use the Tensor directly as batch_speaker_tokens? That way, the speaker tokens could also be accelerated on the GPU.
By the way, I am wondering why you wrapped the Tensor in a list for batch_speaker_tokens. I suspect there may be a reason for doing so.
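If the list wrapper is intentional, one way to keep it while still moving everything to the GPU is to transfer each Tensor in the list element-wise, since a plain Python list has no .cuda() method. A minimal sketch (the helper name move_batch_to_device is my own, not from your code):

```python
import torch

def move_batch_to_device(data, device="cuda"):
    # Unpack the batch exactly as in the original loop.
    batch_input_tokens, batch_labels, batch_speaker_tokens = data
    batch_input_tokens = batch_input_tokens.to(device)
    batch_labels = batch_labels.to(device)
    # batch_speaker_tokens is a list of Tensors, so move each one
    # individually; the surrounding list structure is preserved.
    batch_speaker_tokens = [t.to(device) for t in batch_speaker_tokens]
    return batch_input_tokens, batch_labels, batch_speaker_tokens
```

This preserves whatever purpose the list serves while ensuring all tensors end up on the same device.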
Many thanks.