lifeiteng / vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
Apache License 2.0

Question about loss calculation in AR model without handling mask #185

Open hertz-pj opened 3 months ago

hertz-pj commented 3 months ago

Regarding the loss calculation part of the AR model, why isn't the mask being handled?

total_loss = F.cross_entropy(logits, targets, reduction=reduction)

Normally, shouldn't it be:

total_loss = F.cross_entropy(logits.mask_selected(y_mask), targets.mask_selected(y_mask), reduction=reduction)

What's the reason for not considering the mask?
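For reference, the masked variant being asked about can be sketched in two equivalent ways. This is only an illustration under assumptions the thread does not confirm: `logits` is `(N, C, T)`, `targets` is `(N, T)`, and `y_mask` is a boolean `(N, T)` tensor that is `True` at padded positions. The helper names are hypothetical and are not part of the repository. It also does not answer whether the repository handles padding in some other way.

```python
import torch
import torch.nn.functional as F


def masked_cross_entropy(logits, targets, pad_mask, reduction="sum"):
    """Cross-entropy over non-padded positions only (hypothetical helper).

    Assumed shapes:
      logits:   (N, C, T) float, class dimension second
      targets:  (N, T) int64
      pad_mask: (N, T) bool, True where the position is padding
    """
    keep = ~pad_mask  # (N, T), True at real (non-padded) positions
    # Move the class dimension last so boolean indexing picks whole logit rows.
    kept_logits = logits.permute(0, 2, 1)[keep]  # (num_kept, C)
    kept_targets = targets[keep]                 # (num_kept,)
    return F.cross_entropy(kept_logits, kept_targets, reduction=reduction)


def masked_cross_entropy_ignore_index(logits, targets, pad_mask,
                                      reduction="sum", ignore_index=-100):
    """Same result, using the ignore_index argument of F.cross_entropy."""
    # Overwrite padded targets with a label that cross_entropy skips.
    masked_targets = targets.masked_fill(pad_mask, ignore_index)
    return F.cross_entropy(logits, masked_targets,
                           reduction=reduction, ignore_index=ignore_index)


# Toy batch: two sequences of lengths 4 and 2, padded to T = 4.
torch.manual_seed(0)
N, C, T = 2, 5, 4
logits = torch.randn(N, C, T)
targets = torch.randint(0, C, (N, T))
lengths = torch.tensor([4, 2])
y_mask = torch.arange(T)[None, :] >= lengths[:, None]  # True at padding

loss_indexed = masked_cross_entropy(logits, targets, y_mask)
loss_ignored = masked_cross_entropy_ignore_index(logits, targets, y_mask)
loss_unmasked = F.cross_entropy(logits, targets, reduction="sum")
```

With `reduction="sum"`, the unmasked loss equals the masked loss plus the terms contributed by the padded positions, so the two differ whenever the batch contains padding. With `reduction="mean"`, the denominator changes as well.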

oush7 commented 1 month ago

Hi, did you understand why there is no mask in the AR loss?