Open yjucho1 opened 2 months ago
Hello,
Thank you for your valuable research.
When I try to reproduce your results, I get the error below. I changed the random seed to 99, and the dataset is weather.
Could you help me resolve this error?
Thanks.
```
>>>>>>>start training : 96_96_PathFormer_ftweather_slM_pl96_96>>>>>>>>>>>>>>>>>>>>>>>>>>
train 36696
val 5175
test 10444
Traceback (most recent call last):
  File "/data2/yunjucho/pathformer/run.py", line 123, in
    exp.train(setting)
  File "/data2/yunjucho/pathformer/exp/exp_main.py", line 148, in train
    outputs, balance_loss = self.model(batch_x)
  File "/data2/yunjucho/.conda/envs/lisa_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data2/yunjucho/pathformer/models/PathFormer.py", line 56, in forward
    out, aux_loss = layer(out)
  File "/data2/yunjucho/.conda/envs/lisa_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data2/yunjucho/pathformer/layers/AMS.py", line 104, in forward
    gates, load = self.noisy_top_k_gating(new_x, self.training)
  File "/data2/yunjucho/pathformer/layers/AMS.py", line 95, in noisy_top_k_gating
    load = (self._prob_in_top_k(clean_logits, noisy_logits, noise_stddev, top_logits)).sum(0)
  File "/data2/yunjucho/pathformer/layers/AMS.py", line 62, in _prob_in_top_k
    prob_if_in = normal.cdf((clean_values - threshold_if_in) / noise_stddev)
  File "/data2/yunjucho/.conda/envs/lisa_env/lib/python3.10/site-packages/torch/distributions/normal.py", line 87, in cdf
    self._validate_sample(value)
  File "/data2/yunjucho/.conda/envs/lisa_env/lib/python3.10/site-packages/torch/distributions/distribution.py", line 300, in _validate_sample
    raise ValueError(
ValueError: Expected value argument (Tensor of shape (256, 4)) to be within the support (Real()) of the distribution Normal(loc: tensor([0.], device='cuda:1'), scale: tensor([1.], device='cuda:1')), but found invalid values:
tensor([[nan, nan, nan, nan],
        [nan, nan, nan, nan],
        [nan, nan, nan, nan],
        ...,
        [nan, nan, nan, nan],
        [nan, nan, nan, nan],
        [nan, nan, nan, nan]], device='cuda:1', grad_fn=)
```
Have you solved this problem?
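
Not an official fix, just a sketch for narrowing this down. The traceback shows `normal.cdf` receiving `(clean_values - threshold_if_in) / noise_stddev`, so one possibility is that `noise_stddev` reached zero, which makes `0 / 0` evaluate to NaN. Because every entry in the printed tensor is NaN, it is also possible that the NaN originates earlier (in the inputs, or in the weights after a diverging step), in which case clamping would not help. The helper names `first_non_finite` and `safe_standardize` and the value `eps=1e-6` are assumptions for illustration and are not part of the PathFormer code.

```python
import torch
from torch.distributions.normal import Normal


def first_non_finite(named_tensors):
    """Return the name of the first tensor holding NaN/Inf, or None if all are finite."""
    for name, tensor in named_tensors.items():
        if not torch.isfinite(tensor).all():
            return name
    return None


def safe_standardize(clean_values, threshold, noise_stddev, eps=1e-6):
    """Compute (clean - threshold) / stddev with stddev kept at or above eps.

    eps is an assumed value, not one taken from the repository.
    """
    return (clean_values - threshold) / torch.clamp(noise_stddev, min=eps)


# Toy tensors standing in for the gating quantities (hypothetical values).
normal = Normal(torch.tensor([0.0]), torch.tensor([1.0]))
clean = torch.zeros(2, 4)
threshold = torch.zeros(2, 4)
stddev = torch.zeros(2, 4)  # a noise scale that has collapsed to zero

raw = (clean - threshold) / stddev          # 0 / 0 yields NaN
guarded = safe_standardize(clean, threshold, stddev)
prob = normal.cdf(guarded)                  # finite input passes validation

# Report which quantity is the first to go non-finite.
culprit = first_non_finite({"clean": clean, "stddev": stddev, "raw": raw})
print("first non-finite tensor:", culprit)
```

If `first_non_finite` reports an upstream tensor such as the layer input when applied to the real model, the gating code is probably not the root cause. `torch.autograd.set_detect_anomaly(True)` may help trace a NaN that first appears during the backward pass.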