unw9527 closed this issue 3 years ago
Hi @unw9527 ,
Could you please upgrade Jittor and clear the cache? The cache can be removed with: rm -r ~/.cache/jittor
If there are still problems, please let me know.
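For anyone who prefers to do the cache cleanup from Python, here is a minimal sketch of the same step. The function name clear_jittor_cache and its cache_dir parameter are made up for illustration and are not part of Jittor; the default path is the one given in the command above.

```python
import shutil
from pathlib import Path


def clear_jittor_cache(cache_dir=None):
    """Delete the Jittor compile cache; return True if a directory was removed.

    Hypothetical helper, not a Jittor API. The default location is the
    ~/.cache/jittor path mentioned in this thread.
    """
    if cache_dir is not None:
        path = Path(cache_dir)
    else:
        path = Path.home() / ".cache" / "jittor"
    if not path.is_dir():
        # Nothing to clean (already removed, or the cache lives elsewhere).
        return False
    shutil.rmtree(path)
    return True
```

Calling clear_jittor_cache() with no argument should be equivalent to rm -r ~/.cache/jittor. Jittor recompiles its core on the next import, so the first run after cleaning is expected to be slower.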
Hi @lzhengning, thanks for your reply. I did what you suggested, and Jittor's version is now 1.2.3.73 (originally 1.2.3.71). I cleared the cache as well. But now it gives me a different error message:
nvcc fatal : Value 'c++14' is not defined for option 'std'
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor_utils/__init__.py", line 152, in do_compile
return cc.cache_compile(cmd, cache_path, jittor_path)
RuntimeError: [f 0721 21:17:00.171152 12 log.cc:387] Check failed ret(256) == 0(0) Run cmd failed: cd /home/xxx/.cache/jittor/default/g++ && /usr/local/cuda/bin/nvcc /home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/src/misc/nan_checker.cu -std=c++14 -Xcompiler -fPIC -Xcompiler -march=native -Xcompiler -fdiagnostics-color=always -I/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/src -I/home/xxx/anaconda3/envs/subdivnet/include/python3.7m -I/home/xxx/anaconda3/envs/subdivnet/include/python3.7m -DHAS_CUDA -I'/usr/local/cuda/include' -I'/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/extern/cuda/inc' -I/home/xxx/.cache/jittor/default/g++ -O2 -x cu --cudart=shared -ccbin='/usr/bin/g++' -w -I'/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/extern/cuda/inc' -c -o /home/xxx/.cache/jittor/default/g++/obj_files/nan_checker.cu.o
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "train_seg.py", line 10, in <module>
import jittor as jt
File "/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/__init__.py", line 18, in <module>
from . import compiler
File "/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/compiler.py", line 1106, in <module>
compile(cc_path, cc_flags+opt_flags, files, 'jittor_core'+extension_suffix)
File "/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/compiler.py", line 93, in compile
jit_utils.run_cmds(cmds, cache_path, jittor_path, "Compiling "+base_output)
File "/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor_utils/__init__.py", line 193, in run_cmds
for i,_ in enumerate(p.imap_unordered(do_compile, cmds)):
File "/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/multiprocessing/pool.py", line 748, in next
raise value
RuntimeError: [f 0721 21:17:00.171152 12 log.cc:387] Check failed ret(256) == 0(0) Run cmd failed: cd /home/xxx/.cache/jittor/default/g++ && /usr/local/cuda/bin/nvcc /home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/src/misc/nan_checker.cu -std=c++14 -Xcompiler -fPIC -Xcompiler -march=native -Xcompiler -fdiagnostics-color=always -I/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/src -I/home/xxx/anaconda3/envs/subdivnet/include/python3.7m -I/home/xxx/anaconda3/envs/subdivnet/include/python3.7m -DHAS_CUDA -I'/usr/local/cuda/include' -I'/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/extern/cuda/inc' -I/home/xxx/.cache/jittor/default/g++ -O2 -x cu --cudart=shared -ccbin='/usr/bin/g++' -w -I'/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/extern/cuda/inc' -c -o /home/xxx/.cache/jittor/default/g++/obj_files/nan_checker.cu.o
I found that this might be caused by an outdated CUDA version. After I ran the command python3 -m jittor_utils.install_cuda, as suggested by the Jittor team, this error message went away, but I got an error message like the earlier one:
[i 0722 13:02:24.137653 08 compiler.py:869] Jittor(1.2.3.73) src: /home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor
[i 0722 13:02:24.143181 08 compiler.py:870] g++ at /usr/bin/g++(5.4.0)
[i 0722 13:02:24.143247 08 compiler.py:871] cache_path: /home/xxx/.cache/jittor/default/g++
[i 0722 13:02:24.155516 08 install_cuda.py:37] cuda_driver_version: [11, 2]
[i 0722 13:02:24.161784 08 __init__.py:286] Found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/bin/nvcc(11.2.152) at /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/bin/nvcc.
[i 0722 13:02:24.214301 08 __init__.py:286] Found gdb(7.11.1) at /usr/bin/gdb.
[i 0722 13:02:24.221481 08 __init__.py:286] Found addr2line(2.26.1) at /usr/bin/addr2line.
[i 0722 13:02:24.239643 08 compiler.py:958] py_include: -I/home/xxx/anaconda3/envs/subdivnet/include/python3.7m -I/home/xxx/anaconda3/envs/subdivnet/include/python3.7m
[i 0722 13:02:24.258129 08 compiler.py:960] extension_suffix: .cpython-37m-x86_64-linux-gnu.so
[i 0722 13:02:24.422251 08 compiler.py:1098] OS type:ubuntu OS key:ubuntu
[i 0722 13:02:24.423282 08 __init__.py:178] Total mem: 62.83GB, using 16 procs for compiling.
[i 0722 13:02:24.563519 08 jit_compiler.cc:22] Load cc_path: /usr/bin/g++
[i 0722 13:02:24.652271 08 init.cc:55] Found cuda archs: [61,]
[i 0722 13:02:24.666418 08 __init__.py:286] Found mpicc(1.10.2) at /usr/bin/mpicc.
[i 0722 13:02:24.704353 08 compiler.py:667] handle pyjt_include/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/extern/mpi/inc/mpi_warper.h
[i 0722 13:02:24.724936 08 compile_extern.py:347] Downloading nccl...
[i 0722 13:02:24.785298 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/include/cublas.h
[i 0722 13:02:24.797011 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcublas.so
[i 0722 13:02:24.797106 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcublasLt.so.11
[i 0722 13:02:25.036328 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/include/cudnn.h
[i 0722 13:02:25.056544 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcudnn.so.8
[i 0722 13:02:25.056619 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcudnn_ops_infer.so.8
[i 0722 13:02:25.059087 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcudnn_ops_train.so.8
[i 0722 13:02:25.059688 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcudnn_cnn_infer.so.8
[i 0722 13:02:25.083144 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcudnn_cnn_train.so.8
[i 0722 13:02:25.096104 08 compiler.py:667] handle pyjt_include/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/extern/cuda/cudnn/inc/cudnn_warper.h
[i 0722 13:02:25.351712 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/include/curand.h
[i 0722 13:02:25.374880 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcurand.so
[i 0722 13:02:25.400630 08 cuda_flags.cc:26] CUDA enabled.
name: coseg-alien
0: 0%| | 0/37 [00:00<?, ?it/s]/home/xxx/.cache/jittor/default/g++/jit/_opkey0:array_T:int32__JIT:1__JIT_cuda:1__index_t:int32___opkey1:broadcast_to_Tx:int32__DI...hash:bc2f95c82b48131a_op.cc(40): error: calling a constexpr __host__ function("floor") from a __global__ function("func_bc2f95c82b48131a_0") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
1 error detected in the compilation of "/home/xxx/.cache/jittor/default/g++/jit/_opkey0:array_T:int32__JIT:1__JIT_cuda:1__index_t:int32___opkey1:broadcast_to_Tx:int32__DI...hash:bc2f95c82b48131a_op.cc".
[e 0722 13:02:28.692090 60:C8 parallel_compiler.cc:261] [Error] source file location: /home/xxx/.cache/jittor/default/g++/jit/_opkey0:array_T:int32__JIT:1__JIT_cuda:1__index_t:int32___opkey1:broadcast_to_Tx:int32__DI...hash:bc2f95c82b48131a_op.cc
[e 0722 13:02:28.692354 60:C8 parallel_compiler.cc:264] Compile fused operator(18/56) failed: [Op(0x55ab6354f100:0:0:1:i0:o1:s0,array->0x55ab6354e9a0),Op(0x55ab6354dcf0:0:0:1:i1:o1:s0,broadcast_to->0x55ab6354d5c0),Op(0x55ab6354e300:0:0:1:i2:o1:s0,binary.mod->0x55ab6354e390),]
Reason: [f 0722 13:02:28.691857 60:C8 log.cc:387] Check failed ret(256) == 0(0) Run cmd failed: cd /home/xxx/.cache/jittor/default/g++ && /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/bin/nvcc '/home/xxx/.cache/jittor/default/g++/jit/_opkey0:array_T:int32__JIT:1__JIT_cuda:1__index_t:int32___opkey1:broadcast_to_Tx:int32__DI...hash:bc2f95c82b48131a_op.cc' -std=c++14 -Xcompiler -fPIC -Xcompiler -march=native -Xcompiler -fdiagnostics-color=always -I/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/src -I/home/xxx/anaconda3/envs/subdivnet/include/python3.7m -I/home/xxx/anaconda3/envs/subdivnet/include/python3.7m -DHAS_CUDA -I'/home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/include' -I'/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/extern/cuda/inc' -lstdc++ -ldl -shared -x cu --cudart=shared -ccbin='/usr/bin/g++' --use_fast_math -w -I'/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/extern/cuda/inc' -arch=compute_61 -code=sm_61 -o '/home/xxx/.cache/jittor/default/g++/jit/_opkey0:array_T:int32__JIT:1__JIT_cuda:1__index_t:int32___opkey1:broadcast_to_Tx:int32__DI...hash:bc2f95c82b48131a_op.so'
0: 0%| | 0/37 [00:08<?, ?it/s]
Traceback (most recent call last):
File "train_seg.py", line 162, in <module>
test(net, test_dataset, writer, 0, args)
File "/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/__init__.py", line 257, in inner
ret = func(*args, **kw)
File "train_seg.py", line 64, in test
preds = np.argmax(outputs.data, axis=1)
RuntimeError: Wrong inputs arguments, Please refer to examples(help(jt.data)).
Types of your inputs are:
self = Var,
The function declarations are:
inline DataView data()
Failed reason:[f 0722 13:02:34.479193 08 parallel_compiler.cc:316] Error happend during compilation, see error above.
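Regarding the earlier "Value 'c++14' is not defined for option 'std'" failure: a quick way to check whether a given nvcc is too old is to parse the release number it prints. The sketch below is only illustrative. The helper names are hypothetical, and the 9.0 threshold is my understanding of when nvcc started accepting -std=c++14, so treat it as an assumption.

```python
import re


def nvcc_release(version_output):
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def supports_cxx14(version_output, minimum=(9, 0)):
    """Return True if the reported release is at least `minimum`.

    Assumption: nvcc accepts -std=c++14 from CUDA 9.0 onward.
    """
    return nvcc_release(version_output) >= minimum


# Sample line in the format nvcc prints; 11.2 matches the toolkit in the log above.
sample = "Cuda compilation tools, release 11.2, V11.2.152"
print(nvcc_release(sample), supports_cxx14(sample))  # (11, 2) True
```

The real text can be captured with subprocess.run(["/usr/local/cuda/bin/nvcc", "--version"], capture_output=True, text=True).stdout, using whichever nvcc path appears in the failing command.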
I reproduced this error in the latest jittor. This seems to be a bug that was introduced recently, and will be fixed soon.
Can you try installing an earlier version with: python3.7 -m pip install jittor==1.2.3.48
I have tested this version and it works.
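To confirm which version ended up installed after pinning, the version string can be compared against the ones mentioned in this thread. This is a rough sketch: the helper names are made up, the "failing" set only reflects what was reported here, and reading the string from jt.__version__ is an assumption about the installed package.

```python
def parse_version(text):
    """Turn '1.2.3.73' into (1, 2, 3, 73) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


# Versions mentioned in this thread only; not an official compatibility list.
KNOWN_GOOD = parse_version("1.2.3.48")
REPORTED_FAILING = {parse_version("1.2.3.71"), parse_version("1.2.3.73")}


def status(installed):
    """Classify a version string against the reports in this thread."""
    version = parse_version(installed)
    if version == KNOWN_GOOD:
        return "known good"
    if version in REPORTED_FAILING:
        return "reported failing"
    return "unknown"


print(status("1.2.3.48"))  # known good
print(status("1.2.3.73"))  # reported failing
```

Parsing into integer tuples avoids the string-comparison trap where "1.2.3.100" would sort before "1.2.3.48".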
It works. Thanks.
Closed because the latest jittor has fixed the bugs.
No! They have not fixed this bug.
Hello. Thanks for your work. However, when I try to run the test script of coseg-alien, it gives me an error message like this:
Any ideas on why this happens? I have downloaded the data of coseg-alien via the shell script provided.