[Open] Chris-Pedersen opened this issue 2 years ago
Just to add a bit more on this: when running https://github.com/Chris-Pedersen/Wavelets/blob/main/playground/scattering_conv_playground.ipynb on JupyterHub on rusty, printing the wavelet parameters I get:
[tensor([4.3760, 1.7979, 1.4253, 3.4640, 4.5206, 2.6585, 6.1623, 4.3029, 3.0218,
2.4637, 2.1563, 4.5808, 2.7556, 0.3750, 2.5010, 4.6370]),
tensor([0.7154, 0.7468, 0.7129, 0.6561, 0.7132, 0.9467, 0.9721, 0.7509, 0.8120,
0.5578, 0.6586, 0.7074, 0.9332, 0.6252, 0.7415, 0.9928]),
tensor([4.3619, 4.5219, 3.0103, 4.8131, 4.5062, 4.4083, 3.9638, 3.8507, 4.1507,
4.6248, 4.8696, 4.3449, 4.6075, 4.4783, 4.5408, 4.6153]),
tensor([0.6825, 0.6755, 1.0316, 1.0318, 1.1344, 1.3494, 1.2245, 1.1110, 1.2224,
0.8230, 0.8618, 0.7283, 0.7937, 1.1310, 0.5921, 0.9337])]
and printing scatteringBase.psi[0][0] gives:
tensor([[[-1.2261e-08],
[ nan],
[-3.5012e-03],
...,
[ 1.1419e-02],
[ 6.5788e-03],
[ 2.8291e-03]],
[[-7.9257e-04],
[ nan],
[-3.9038e-03],
...,
Doing the same thing in the terminal, using the same conda environment, I get:
>>> scatteringBase.params_filters
[tensor([4.3760, 1.7979, 1.4253, 3.4640, 4.5206, 2.6585, 6.1623, 4.3029, 3.0218,
2.4637, 2.1563, 4.5808, 2.7556, 0.3750, 2.5010, 4.6370]), tensor([0.7154, 0.7468, 0.7129, 0.6561, 0.7132, 0.9467, 0.9721, 0.7509, 0.8120,
0.5578, 0.6586, 0.7074, 0.9332, 0.6252, 0.7415, 0.9928]), tensor([4.3619, 4.5219, 3.0103, 4.8131, 4.5062, 4.4083, 3.9638, 3.8507, 4.1507,
4.6248, 4.8696, 4.3449, 4.6075, 4.4783, 4.5408, 4.6153]), tensor([0.6825, 0.6755, 1.0316, 1.0318, 1.1344, 1.3494, 1.2245, 1.1110, 1.2224,
0.8230, 0.8618, 0.7283, 0.7937, 1.1310, 0.5921, 0.9337])]
>>> scatteringBase.scattering.psi[0][0]
tensor([[[-1.2644e-08],
[-2.0639e-03],
[-3.5013e-03],
...,
[ 1.1419e-02],
[ 6.5788e-03],
[ 2.8291e-03]],
[[-7.9257e-04],
[-2.6254e-03],
[-3.9038e-03],
...,
So somehow the filters the code is producing are different, even though the parameters are the same.
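A quick way to localise a discrepancy like this (a hypothetical helper, not part of the repo; shown with numpy, though torch.isnan and torch.allclose behave the same way) is to confirm the parameter lists agree elementwise and then count NaNs in each filter bank:

```python
import numpy as np

def compare_filter_banks(params_a, params_b, psi_a, psi_b, atol=1e-6):
    """Verify the parameters agree, then count NaNs in each filter bank."""
    for i, (a, b) in enumerate(zip(params_a, params_b)):
        if not np.allclose(a, b, atol=atol):
            raise ValueError(f"parameter tensor {i} differs")
    # If the parameters match but only one bank contains NaNs, the bug
    # is in the filter construction, not the random initialisation.
    return int(np.isnan(psi_a).sum()), int(np.isnan(psi_b).sum())

# Toy example: identical parameters, one bank with a single corrupted entry.
params = [np.array([4.3760, 1.7979]), np.array([0.7154, 0.7468])]
psi_terminal = np.random.default_rng(0).standard_normal((4, 4))
psi_jupyter = psi_terminal.copy()
psi_jupyter[1, 0] = np.nan
print(compare_filter_banks(params, params, psi_terminal, psi_jupyter))  # (0, 1)
```

This at least separates "the seed produced different parameters" from "the same parameters produced corrupted filters", which are the two failure modes being ruled in or out here.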
OK, this is somehow fixed by the latest PR #26, so closing; perhaps it was some bug with the kernel.
Reopening this, as it's becoming a bit of an obstacle to running some tests. It occurs when using randomly initialised filters together with a conda environment. When running the same script:
[cpedersen@workergpu115 scripts]$ python3 sn_debug.py
CUDA Available
/mnt/home/cpedersen/.local/lib/python3.6/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2157.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
tensor([[[ 8.8521e-10+0.0000e+00j, 6.7210e-05+1.2745e-11j,
1.3836e-04+1.0620e-12j, ...,
-1.6191e-04+1.6484e-11j, -1.1581e-04+1.2596e-11j,
-6.1489e-05+1.3602e-11j],
[ 4.1143e-04-1.4684e-09j, 5.1359e-04-1.1640e-09j,
6.1729e-04-9.0628e-10j, ...,
1.3937e-04-2.3496e-09j, 2.2221e-04-2.0354e-09j,
3.1352e-04-1.7559e-09j],
[ 1.0348e-03-2.9451e-09j, 1.1884e-03-2.9773e-09j,
1.3399e-03-3.0218e-09j, ...,
5.9933e-04-2.6364e-09j, 7.3684e-04-2.7482e-09j,
8.8304e-04-2.8732e-09j],
...,
[-5.0493e-04-4.1458e-08j, -4.8694e-04-4.4042e-08j,
-4.6242e-04-4.6318e-08j, ...,
-5.1632e-04-3.2544e-08j, -5.1970e-04-3.5603e-08j,
-5.1593e-04-3.8625e-08j],
[-4.1962e-04-4.5405e-08j, -3.9140e-04-4.8363e-08j,
-3.5698e-04-5.0979e-08j, ...,
-4.6110e-04-3.5220e-08j, -4.5478e-04-3.8732e-08j,
-4.4090e-04-4.2151e-08j],
[-2.6202e-04-4.3522e-08j, -2.1824e-04-4.5917e-08j,
-1.6907e-04-4.8030e-08j, ...,
-3.5093e-04-3.5202e-08j, -3.2904e-04-3.8081e-08j,
-2.9923e-04-4.0869e-08j]],
(wavelet) [cpedersen@workergpu115 scripts]$ python3 sn_debug.py
CUDA Available
/mnt/home/cpedersen/miniconda3/envs/wavelet/lib/python3.9/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1640811757271/work/aten/src/ATen/native/TensorShape.cpp:2157.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
tensor([[[nan+nanj, nan+nanj, nan+nanj, ..., nan+nanj, nan+nanj, nan+nanj],
[nan+nanj, nan+nanj, nan+nanj, ..., nan+nanj, nan+nanj, nan+nanj],
[nan+nanj, nan+nanj, nan+nanj, ..., nan+nanj, nan+nanj, nan+nanj],
So this is due to the numpy version, and to do with the random seed generation: the working setup was using numpy 1.19.5, whereas my conda environment was using 1.21.2. Will fix this properly after the ICML submission; a workaround for now is to pin numpy=1.19.5. Leaving this open as a reminder to fix the random initialisation code so it is not so strongly dependent on the numpy version.
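One way to decouple the initialisation from the installed numpy version (a sketch with made-up parameter ranges, not the repo's actual fix) is to draw the random parameters from an explicit np.random.RandomState, whose bit stream NumPy documents as frozen across releases, rather than from code paths whose output may vary between versions:

```python
import numpy as np

def init_wavelet_params(n_filters, seed=0):
    """Draw wavelet parameters from a frozen-stream RNG.

    np.random.RandomState's output is guaranteed bit-for-bit identical
    across NumPy releases, so the same seed yields the same parameters
    on 1.19.5 and 1.21.2 alike. The parameter names and ranges below
    are purely illustrative.
    """
    rng = np.random.RandomState(seed)
    orientations = rng.uniform(0.0, 2 * np.pi, size=n_filters)
    scales = rng.uniform(0.5, 1.5, size=n_filters)
    return orientations, scales

theta_a, _ = init_wavelet_params(16, seed=123)
theta_b, _ = init_wavelet_params(16, seed=123)
assert np.allclose(theta_a, theta_b)  # reproducible for a fixed seed
```

Alternatively, np.random.Generator with a pinned bit generator (e.g. PCG64) gives a modern API with the same reproducibility-per-seed property; the key point is to avoid any initialisation path whose stream is not explicitly version-stable.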
Very odd - when running my notebooks such as scattering_conv_playground.ipynb on JupyterHub on rusty, the wavelets stored in scattering.psi have NaNs dotted around them, which obviously breaks all the convolutions. When running the exact same notebook on my desktop or laptop, I get no NaNs (same random seed, and I have verified that the wavelet parameters are the same). I have checked that the kymatio, python, numpy and torch versions are the same. I am running JupyterHub using a kernel constructed from my conda environment on rusty. When I rerun the same code block from the terminal in this environment, I also get no NaNs, so I don't think this is an environment issue. So just to be clear:
- JupyterHub on rusty (conda-environment kernel): NaNs in scattering.psi
- Terminal on rusty, same conda environment: no NaNs
- Desktop/laptop, same seed and versions: no NaNs
A bit at a loss on this one at the moment. Perhaps something unusual is being done when Jupyter imports the conda environment kernel; maybe the best option is to contact scc. @eickenberg have you ever encountered anything like this?
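Since the same code behaves differently under the JupyterHub kernel and the terminal, one cheap diagnostic (a generic sketch, not specific to this repo) is to print exactly which interpreter and which package installs each session is actually using; a Jupyter kernelspec can silently resolve to a different Python or a shadowed package install than the shell does:

```python
import sys
import platform
import numpy as np

def environment_fingerprint():
    """Collect the facts that distinguish two supposedly identical environments."""
    return {
        "executable": sys.executable,        # which python binary the kernel runs
        "python": platform.python_version(),
        "numpy": np.__version__,
        "numpy_path": np.__file__,           # catches a shadowed/second install
    }

for key, value in environment_fingerprint().items():
    print(f"{key}: {value}")
```

Running this in both the JupyterHub kernel and the terminal and diffing the output would confirm or rule out the kernel picking up a different environment (note the two tracebacks above already show different site-packages paths: ~/.local/.../python3.6 versus miniconda3/envs/wavelet/.../python3.9).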