I am having an issue with high loss values. I was able to start training, but the loss is very high (3000+ with alpha = 1 and 1000+ with alpha = 0.5), as opposed to the 10+ mentioned in the paper. Initially, I thought the high loss values would only occur in the first few epochs. However, that is not the case: the loss remains high in later epochs (at least compared to the 15+ mentioned in the paper).
I suspect that something is wrong with the input or with the loss calculation. I would really appreciate it if anyone could help me with this.