puria-izady opened 5 years ago
I don't think you made any mistake.
For the warning: please include `torch/extension.h`.
For the error, this has been asked a few times: https://github.com/pytorch/extension-cpp/issues?utf8=%E2%9C%93&q=is%3Aissue+fmax
I think the consensus was that this is an environment error, and the best solution is to build PyTorch from source.
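For reference, a minimal sketch of what the include fix looks like (the function shown here is made up, not this repo's actual code): `torch/extension.h` is the one header meant for C++ extensions, and including it instead of older headers silences the deprecation warning.

```cpp
// Minimal extension sketch; sigmoid_alias is an illustrative example.
#include <torch/extension.h>  // pulls in ATen and the pybind11 binding macros

torch::Tensor sigmoid_alias(torch::Tensor x) {
  return torch::sigmoid(x);  // any tensor op; just here to have a body
}

PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
  m.def("sigmoid", &sigmoid_alias, "sigmoid alias (example)");
}
```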
No, it is because of the CUDA API; it has no relevance to PyTorch. Just cast the second argument to `(double)`. That's the best solution.
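One reading of that suggestion, as a hedged sketch (the actual kernel line in this repo may differ, and the helper name is invented): with `scalar_t = float`, `fmax(z, 0.0)` mixes `float` and `double` arguments, and on some toolchains overload resolution lands on a host-only function. Casting so both arguments are `double` forces the unambiguous device overload:

```cpp
// Illustrative device helper, not the repo's exact code.
template <typename scalar_t>
__device__ __forceinline__ scalar_t relu_like(scalar_t z) {
  // before: fmax(z, 0.0)  // mixed float/double on the float instantiation
  return fmax((double)z, 0.0);  // both args double -> device fmax(double, double)
}
```

Note this promotes the math to double, which is what the comments further down push back on.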
Got the same error here.
Ubuntu 16.04
CUDA 10.0
PyTorch 1.1.0a0+7e73783 (built from source)
Python 3.7
The solution from #21 seems to work, though. Discussion from #15 also hints that casting to `scalar_t` might actually be the thing to do if numbers are implicitly cast to `double`.
Normally I would add the `(scalar_t)` cast and move on, but I wanted to submit a PR (see #31) and cannot build on a clean workspace.
Any hints on what to do? I actually could build before (last summer), but since then I updated my Python version, along with CUDA (and of course PyTorch). I might try a Docker build to get a perfectly clean install, but if the problem is common enough, maybe we can add this cast on `fmax` (and `fmin`; casting everything to `scalar_t` is better than casting everything to `double`). See the sketch below.
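A sketch of the `scalar_t` variant being proposed (the helper name is illustrative): both arguments share the template type, so each instantiation picks the matching device overload and nothing is promoted to `double`.

```cpp
// Illustrative device helper; fmin would be handled the same way.
template <typename scalar_t>
__device__ __forceinline__ scalar_t relu_like(scalar_t z) {
  return fmax(z, (scalar_t)0.0);  // float stays float, double stays double
}
```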
After some investigation, it seems related to the gcc version. I originally tested with gcc-7 and it didn't work; changing to gcc-5 with a simple update-alternatives made it work. PyTorch itself was compiled from source with gcc-7.
Any idea what might have changed from gcc-5 to gcc-7?
I reproduced this on Docker today, and fixed the issue with this commit: https://github.com/pytorch/extension-cpp/commit/1031028f3b048fdea84372f3b81498db53d64d98
Hi, thanks for the commit! Unfortunately, I believe `fminf` and `fmaxf` implicitly cast everything to float32. As a consequence, `check.py` and `grad_check.py` are now broken with CUDA, because the precision is not sufficient for float64 tensors.
Example output:
```
python check.py forward -c
Forward: Baseline (Python) vs. C++ ... Ok
Forward: Baseline (Python) vs. CUDA ... Traceback (most recent call last):
  File "check.py", line 104, in <module>
    check_forward(variables, options.cuda, options.verbose)
  File "check.py", line 45, in check_forward
    check_equal(baseline_values, cuda_values, verbose)
  File "check.py", line 22, in check_equal
    np.testing.assert_allclose(x, y, err_msg="Index: {}".format(i))
  File "/home/cpinard/anaconda3/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 1452, in assert_allclose
    verbose=verbose, header=header, equal_nan=equal_nan)
  File "/home/cpinard/anaconda3/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 789, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=1e-07, atol=0
Index: 0
(mismatch 13.333333333333329%)
 x: array([-1.206306e-04,  9.878260e-01, -2.557970e-01,  3.771263e-01,
       -1.863440e-01,  5.914125e-02,  6.415094e-01,  3.132478e-04,
        1.672588e-03, -4.412979e-03, -1.300380e-01, -7.609038e-01,
        5.438342e-01,  6.241342e-02, -3.342839e-01])
 y: array([-1.206305e-04,  9.878260e-01, -2.557970e-01,  3.771263e-01,
       -1.863440e-01,  5.914125e-02,  6.415094e-01,  3.132469e-04,
        1.672588e-03, -4.412979e-03, -1.300380e-01, -7.609038e-01,
        5.438342e-01,  6.241342e-02, -3.342839e-01])
```
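To make the failure mode concrete (a host-side sketch, not the repo's code): `fmaxf` takes `float` arguments, so a `double` input is silently rounded to ~7 significant digits before the comparison, which matches the size of the mismatch above (3.132478e-04 vs. 3.132469e-04).

```cpp
#include <cmath>
#include <cstdio>

int main() {
  double z = 3.132478e-04;                 // the mismatching value from the log
  double via_fmaxf = std::fmaxf(z, 0.0f);  // double -> float -> double round trip
  double via_fmax  = std::fmax(z, 0.0);    // stays in double precision
  std::printf("fmaxf: %.10e\nfmax:  %.10e\n", via_fmaxf, via_fmax);
  return 0;
}
```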
Whoops, this is my bad. Let me set up the environment again and see what I can do about this.
@soumith Hi Soumith, did you find a solution for this precision problem? I ran into it in my C++ extension, too.
I also encountered a similar problem. After deleting some paths from the PATH variable that I suspected might cause conflicts, I was able to solve it.
Hello,
the compilation of the setup.py in `cpp` is successful, but for `cuda/setup.py` I get the compile error below. Therefore I would like to ask whether you have an idea of what my mistake could be.
Best regards
System:
Error log: