Hi, I believe this is due to the bias not being initialized in both the feature and temporal attention layers. `torch.empty()` returns uninitialized memory, so its contents are arbitrary garbage values and can include NaNs.
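A quick way to see this (the contents of uninitialized memory are nondeterministic, so a given run may or may not show NaNs):

```python
import torch

# torch.empty() allocates memory without initializing it, so the tensor
# holds whatever bytes were already there -- possibly NaN or inf.
b = torch.empty(3, 3)
print(b)                     # arbitrary values, different on every run
print(torch.isnan(b).any())  # may be tensor(True) on some runs
```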
Try adding zero initialization for the bias in both modules; this is at least how it is done in PyTorch Geometric (see the `reset_parameters()` method).
```python
import torch
import torch.nn as nn

# in the attention module's __init__:
if self.use_bias:
    self.bias = nn.Parameter(torch.empty(window_size, window_size))
    nn.init.zeros_(self.bias)  # overwrite the garbage values with zeros

# or equivalently, allocate the zeros directly:
if self.use_bias:
    self.bias = nn.Parameter(torch.zeros(window_size, window_size))
```
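As a side note on the symptom you describe: once a single NaN reaches the attention logits, softmax spreads it across the whole normalization axis, which is why entire columns come out as `nan`. A minimal illustration:

```python
import torch

# one NaN logit poisons the softmax denominator, so every entry becomes NaN
e = torch.tensor([0.5, float("nan"), 1.0])
print(torch.softmax(e, dim=0))  # tensor([nan, nan, nan])
```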
Hope this works!
It works for me now! Thank you!
Dear authors,
Thank you for uploading this code. I am a beginner in multivariate time series anomaly detection, and this has been very helpful in my research. I have read and understood your code, but the output is always NaN during training, and I can confirm that the input data is normal.
Therefore, I printed the result of each step in `forward()` in mtad_gat.py. The problem appears after the `feature_gat()` layer, so I stepped into `feature_gat()`. After `e = torch.matmul(a_input, self.a).squeeze(3)`, some `nan` values appear, as shown in the figure. Then, after `softmax`, there are even more `nan`s; usually an entire column is `nan`. How can I solve this problem? I also tried adjusting `batch_size` and `look_back`, but nothing works.

Environment: