`slogdet` is exactly the right function, but if the det of the weight is always positive, `logdet` is also okay.
I initialized the det of the weight to be positive as follows: https://github.com/jaywalnut310/glow-tts/blob/00a482d06ebbffbd3518a43480cd79e7b47ebbe2/modules.py#L200-L203
Because the training process (maximizing the log-likelihood of the data) encourages the determinant to increase, the det of the weight tends to stay positive.
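For reference, here is a minimal sketch of that kind of initialization (the function name is illustrative; the linked code is the authoritative version):

```python
import torch

def init_positive_det_weight(channels):
    # Orthogonal initialization via QR: det(W) is exactly +1 or -1.
    W = torch.linalg.qr(torch.randn(channels, channels))[0]
    # If det(W) = -1, flip the sign of one column so det(W) = +1.
    if torch.det(W) < 0:
        W[:, 0] = -W[:, 0]
    return W
```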
The main reason I prefer `logdet` is that it's easier and more stable to calculate than `slogdet`.
Actually, I followed the implementation of WaveGlow:
https://github.com/NVIDIA/waveglow/blob/master/glow.py#L76-L80
https://github.com/NVIDIA/waveglow/blob/d18e0f3cc2ff6bdd41244d7391140accdc41142b/glow.py#L100
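For context, a minimal sketch of how the `logdet` term enters a WaveGlow-style invertible 1x1 convolution (shapes and names are illustrative, not the exact library code):

```python
import torch

def invconv_forward(z, W):
    # z: [batch, channels, time], W: [channels, channels]
    batch, channels, time = z.shape
    # The Jacobian log-determinant of a 1x1 conv is time * log(det W)
    # per example; summed over the batch it is batch * time * log(det W).
    log_det_W = batch * time * torch.logdet(W)  # assumes det(W) > 0
    z = torch.matmul(W, z)  # apply the 1x1 conv as a matrix multiply
    return z, log_det_W
```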
Thanks for the information! I didn't know about positive weight initialization because I hadn't seen WaveGlow's implementation.
Can you show me a source or reference for `logdet` being easier and more stable to calculate than `slogdet`?
> The main reason I prefer `logdet` is that it's easier and more stable to calculate than `slogdet`.
I misunderstood; I thought `logdet` was easier to calculate than `slogdet` because `logdet` of a positive-definite matrix has a simple formulation. For arbitrary matrices, the two functions are implemented similarly except for the sign info. Sorry for my misunderstanding, @MokkeMeguru.
However, for checking training stability, `logdet` can be a good tool.
Although the determinants of the weights are initialized to be positive and the training process encourages them to increase, the determinant can still cross zero and flip sign under bad training configurations.
In that case, `logdet` would produce NaNs that alert you something went wrong, while `slogdet` wouldn't.
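A quick PyTorch check of the two failure modes, using a toy matrix with a negative determinant:

```python
import torch

W = -torch.eye(3)                  # det(W) = (-1)**3 = -1
print(torch.logdet(W))             # tensor(nan) -> the loss blows up loudly
sign, logabsdet = torch.slogdet(W)
print(sign, logabsdet)             # tensor(-1.) tensor(0.) -> the sign flip is silent
```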
rafaelvalle's comments in the WaveGlow issues would be helpful: https://github.com/NVIDIA/waveglow/issues/49#issuecomment-442522418 https://github.com/NVIDIA/waveglow/issues/35#issuecomment-442523005
**slogdet vs logdet** Oh... PyTorch uses the same formulation for both. In TensorFlow, `logdet` uses the Cholesky method, but `slogdet` uses an LU decomposition. It's a complicated issue...
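For example, in TensorFlow (assuming the current `tf.linalg` APIs):

```python
import tensorflow as tf

A = tf.constant([[2.0, 0.0], [0.0, 3.0]])  # symmetric positive-definite
# logdet factorizes A with Cholesky, so A must be positive-definite.
print(tf.linalg.logdet(A))
# slogdet uses an LU decomposition and works for any square matrix.
sign, logabsdet = tf.linalg.slogdet(A)
print(sign, logabsdet)
```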
**logdet's flipping** We know the `log_det_jacobian` term does not follow the same learning curve as the negative log-likelihood (here is my Glow training run on the Oxford 102 Flowers dataset). So I wonder whether it is really correct that `log_det_jacobian` always increases... (but this is not your code and problem setting; I will monitor it in your problem.)
Anyway, thanks for your help and reply.
So, this log-det-Jacobian is `torch.slogdet(self.weight)`:
https://github.com/jaywalnut310/glow-tts/blob/00a482d06ebbffbd3518a43480cd79e7b47ebbe2/modules.py#L228
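A hedged sketch of how `slogdet`'s two outputs could be consumed, with an illustrative weight (not the repository's exact code):

```python
import torch

weight = torch.linalg.qr(torch.randn(4, 4))[0]  # illustrative orthogonal weight
sign, logabsdet = torch.slogdet(weight)
logdet = logabsdet  # contributes log|det W| to the log-likelihood
assert sign > 0, "det(W) flipped sign during training"  # optional sanity check
```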