Closed: wilson97 closed this issue 11 months ago.

I was looking through the aligner code. In this paper, https://arxiv.org/pdf/2108.10447.pdf, the NAR aligner loss has two terms: a forward-sum loss and a KL divergence between the soft and hard alignments. However, in the code I only see the forward-sum loss. Is there a reason for this, or am I missing something? Thanks.
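(For context, the forward-sum term from the paper is commonly implemented as a CTC loss over the unnormalized log-attention. Below is a minimal sketch of that idea; the tensor shapes and names are assumptions for illustration, not the repo's actual code.)

```python
import torch
import torch.nn.functional as F

class ForwardSumLoss(torch.nn.Module):
    """CTC-based forward-sum loss over the soft alignment (sketch).

    attn_logprob: (batch, 1, max_mel_len, max_text_len) unnormalized
    log-attention; text_lens / mel_lens: 1-D LongTensors of true lengths.
    """
    def __init__(self, blank_logprob=-1.0):
        super().__init__()
        self.blank_logprob = blank_logprob
        self.ctc = torch.nn.CTCLoss(blank=0, zero_infinity=True)

    def forward(self, attn_logprob, text_lens, mel_lens):
        # prepend a "blank" column at text index 0, as CTC requires
        attn_logprob = F.pad(attn_logprob, (1, 0), value=self.blank_logprob)
        losses = []
        for b in range(attn_logprob.size(0)):
            t_len, m_len = int(text_lens[b]), int(mel_lens[b])
            # the CTC target is simply the text positions 1..t_len in order,
            # so CTC's forward algorithm sums over all monotonic alignments
            target = torch.arange(1, t_len + 1).unsqueeze(0)
            logp = attn_logprob[b, 0, :m_len, : t_len + 1]
            logp = logp.log_softmax(dim=-1).unsqueeze(1)  # (m_len, 1, t_len + 1)
            losses.append(self.ctc(
                logp, target,
                input_lengths=torch.tensor([m_len]),
                target_lengths=torch.tensor([t_len]),
            ))
        return torch.stack(losses).mean()
```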
@wilson97 @manmay-nakhashi Manmay actually pull requested that implementation, he would know better!
@wilson97 I tried using the L_bin loss mentioned in the paper, but with it enabled convergence is slow. We can turn it on after some steps, since it regularizes alignment training.
@manmay-nakhashi maybe the loss can be added for completeness' sake and defaulted to off (loss weight of 0)?
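(For reference, the L_bin term in the paper is the KL divergence between the soft attention and its binarized hard alignment; since the hard alignment is one-hot per frame, it reduces to a masked negative log-likelihood. A minimal sketch with the weight defaulted to 0 and ramped in after a warmup, as discussed above; all names here are illustrative, not the repo's actual API.)

```python
import torch

def bin_loss(hard_attn, soft_attn, eps=1e-12):
    """Binarization loss L_bin = -sum(A_hard * log A_soft) (sketch).

    hard_attn: (batch, text_len, mel_len) 0/1 alignment from the hard (Viterbi) pass
    soft_attn: (batch, text_len, mel_len) soft attention probabilities
    """
    log_soft = soft_attn.clamp(min=eps).log()
    # only the positions selected by the hard alignment contribute
    return -(hard_attn * log_soft).sum() / hard_attn.sum().clamp(min=1)

# defaulted to off, then enabled after some steps, per the discussion above
BIN_LOSS_WARMUP_STEPS = 10_000  # hypothetical hyperparameter

def bin_loss_weight(step):
    return 0.0 if step < BIN_LOSS_WARMUP_STEPS else 1.0
```

Keeping the weight at 0 early lets the forward-sum term find a reasonable soft alignment first; switching the binarization term on afterwards then sharpens that alignment rather than fighting it.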
@lucidrains sure I'll add it.
@wilson97 @manmay-nakhashi hey, I had some time this morning and knocked out the loss.
Do you want to do a quick code review and see if it lines up with what you'd expect from reading the paper?
sure I can take a look
I think this should be fixed now.