Open jufenghao opened 1 year ago
I'm running into the same problem.
+1
I searched around some more and found https://github.com/NVIDIA/cutlass/issues/4#issuecomment-1054058714
This error also appears if you set a GPU architecture that does not support fast FP16 (e.g. sm_35, sm_52). Setting it to sm_70 (for example) allows it to compile, but you cannot actually use it if your GPU does not support that architecture.
My card happens to be sm_35, so it seems the GPU is simply too old. No workaround for now.
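In case it helps anyone checking their own setup: native __half arithmetic requires compute capability 5.3 or higher, which is why sm_35 and sm_52 cards hit this. Below is a small standalone check (my own sketch, not part of JittorLLMs) that queries the compute capability through the CUDA runtime API:

```cuda
// check_fp16.cu -- hypothetical standalone check, not from this repo.
// Prints the GPU's compute capability via the CUDA runtime API.
// Native __half arithmetic needs compute capability >= 5.3,
// which sm_35 / sm_52 cards do not reach.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        std::printf("no usable CUDA device found\n");
        return 1;
    }
    std::printf("GPU: %s, compute capability %d.%d\n", prop.name, prop.major, prop.minor);
    bool fast_fp16 = prop.major > 5 || (prop.major == 5 && prop.minor >= 3);
    std::printf("native FP16 arithmetic: %s\n", fast_fp16 ? "supported" : "not supported");
    return 0;
}
```

Compile with nvcc check_fp16.cu -o check_fp16 (the file name is arbitrary) and run it on the target machine; if it reports a capability below 5.3, the fp16 kernels Jittor generates will not compile for that card.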
/root/.cache/jittor/jt1.3.7/g++10.2.1/py3.9.12/Linux-3.10.0-1x61/IntelRXeonRCPUx09/default/cu12.1.105_sm_52/jit/opkey0_broadcast_toTx_float16DIM_3__BCAST_3opkey1_binaryTx_float16Ty_float16___hash_f3d27a3882838ef3_op.cc(52): error: more than one conversion function from "jittor::float16" to a built-in type applies:
            function "__half::operator float() const" (declared at line 217 of /usr/local/cuda/include/cuda_fp16.hpp)
            function "__half::operator short() const" (declared at line 235 of /usr/local/cuda/include/cuda_fp16.hpp)
            function "__half::operator unsigned short() const" (declared at line 238 of /usr/local/cuda/include/cuda_fp16.hpp)
            function "__half::operator int() const" (declared at line 241 of /usr/local/cuda/include/cuda_fp16.hpp)
            function "__half::operator unsigned int() const" (declared at line 244 of /usr/local/cuda/include/cuda_fp16.hpp)
            function "__half::operator long long() const" (declared at line 247 of /usr/local/cuda/include/cuda_fp16.hpp)
            function "__half::operator unsigned long long() const" (declared at line 250 of /usr/local/cuda/include/cuda_fp16.hpp)
            function "__half::operator __nv_bool() const" (declared at line 254 of /usr/local/cuda/include/cuda_fp16.hpp)
          op1_zp[op1_i] = ((op1_xp[op1_i])+(op0_zd ));
                                          ^
2 errors detected in the compilation of "/root/.cache/jittor/jt1.3.7/g++10.2.1/py3.9.12/Linux-3.10.0-1x61/IntelRXeonRCPUx09/default/cu12.1.105_sm_52/jit/opkey0_broadcast_toTx_float16DIM_3__BCAST_3opkey1_binaryTx_float16Ty_float16___hash_f3d27a3882838ef3_op.cc".

Traceback (most recent call last):
  File "/home/JittorLLMs/lib/python3.9/site-packages/gradio/routes.py", line 408, in run_predict
    output = await app.get_blocks().process_api(
  File "/home/JittorLLMs/lib/python3.9/site-packages/gradio/blocks.py", line 1315, in process_api
    result = await self.call_function(
  File "/home/JittorLLMs/lib/python3.9/site-packages/gradio/blocks.py", line 1059, in call_function
    prediction = await utils.async_iteration(iterator)
  File "/home/JittorLLMs/lib/python3.9/site-packages/gradio/utils.py", line 514, in async_iteration
    return await iterator.__anext__()
  File "/home/JittorLLMs/lib/python3.9/site-packages/gradio/utils.py", line 507, in __anext__
    return await anyio.to_thread.run_sync(
  File "/home/JittorLLMs/lib/python3.9/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/JittorLLMs/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/home/JittorLLMs/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/home/JittorLLMs/lib/python3.9/site-packages/gradio/utils.py", line 490, in run_sync_iterator_async
    return next(iterator)
  File "/home/JittorLLMs/JittorLLMs/web_demo.py", line 12, in predict
    for response, history in model.run_web_demo(input, history):
  File "/home/JittorLLMs/JittorLLMs/models/chatglm/__init__.py", line 42, in run_web_demo
    yield self.run(input_text, history=history)
  File "/home/JittorLLMs/JittorLLMs/models/chatglm/__init__.py", line 45, in run
    return self.model.chat(self.tokenizer, text, history=history)
  File "/home/JittorLLMs/lib/python3.9/site-packages/jittor/__init__.py", line 118, in inner
    ret = func(*args, **kw)
  File "/root/.cache/huggingface/modules/transformers_modules/local/modeling_chatglm.py", line 1233, in chat
    outputs = self.generate(input_ids, **gen_kwargs)
  File "/home/JittorLLMs/lib/python3.9/site-packages/jittor/__init__.py", line 118, in inner
    ret = func(*args, **kw)
  File "/home/JittorLLMs/lib/python3.9/site-packages/transformers/generation/utils.py", line 1437, in generate
    return self.sample(
  File "/home/JittorLLMs/lib/python3.9/site-packages/transformers/generation/utils.py", line 2443, in sample
    outputs = self(
  File "/home/JittorLLMs/lib/python3.9/site-packages/jtorch/nn/__init__.py", line 16, in __call__
    return self.forward(*args, **kw)
  File "/root/.cache/huggingface/modules/transformers_modules/local/modeling_chatglm.py", line 1138, in forward
    transformer_outputs = self.transformer(
  File "/home/JittorLLMs/lib/python3.9/site-packages/jtorch/nn/__init__.py", line 16, in __call__
    return self.forward(*args, **kw)
  File "/root/.cache/huggingface/modules/transformers_modules/local/modeling_chatglm.py", line 973, in forward
    layer_ret = layer(
  File "/home/JittorLLMs/lib/python3.9/site-packages/jtorch/nn/__init__.py", line 16, in __call__
    return self.forward(*args, **kw)
  File "/root/.cache/huggingface/modules/transformers_modules/local/modeling_chatglm.py", line 614, in forward
    attention_outputs = self.attention(
  File "/home/JittorLLMs/lib/python3.9/site-packages/jtorch/nn/__init__.py", line 16, in __call__
    return self.forward(*args, **kw)
  File "/root/.cache/huggingface/modules/transformers_modules/local/modeling_chatglm.py", line 454, in forward
    cos, sin = self.rotary_emb(q1, seq_len=position_ids.max() + 1)
  File "/home/JittorLLMs/lib/python3.9/site-packages/jtorch/nn/__init__.py", line 16, in __call__
    return self.forward(*args, **kw)
  File "/root/.cache/huggingface/modules/transformers_modules/local/modeling_chatglm.py", line 200, in forward
    if self.max_seq_len_cached is None or (seq_len > self.max_seq_len_cached):
  File "/home/JittorLLMs/lib/python3.9/site-packages/jittor/__init__.py", line 2013, in to_bool
    return ori_bool(v.item())
RuntimeError: Wrong inputs arguments, Please refer to examples(help(jt.item)).
Types of your inputs are: self = Var, args = (),
The function declarations are: ItemData item()
Failed reason:[f 0510 13:24:40.710840 08 parallel_compiler.cc:330] Error happend during compilation: [Error] source file location:/root/.cache/jittor/jt1.3.7/g++10.2.1/py3.9.12/Linux-3.10.0-1x61/IntelRXeonRCPUx09/default/cu12.1.105_sm_52/jit/opkey0_broadcast_toTx_float16DIM_3__BCAST_3opkey1_binary__Tx_float16Ty_float16___hash_f3d27a3882838ef3_op.cc Compile fused operator(16/19)failed:[Op(12871:0:1:1:i1:o1:s0,broadcast_to->12872),Op(12873:0:1:1:i2:o1:s0,binary.add->12874),]
Reason: [f 0510 13:24:39.889697 04:C1 log.cc:608] Check failed ret(256) == 0(0) Run cmd failed: "/usr/local/cuda/bin/nvcc" "/root/.cache/jittor/jt1.3.7/g++10.2.1/py3.9.12/Linux-3.10.0-1x61/IntelRXeonRCPUx09/default/cu12.1.105_sm_52/jit/opkey0_broadcast_toTx_float16DIM_3__BCAST_3opkey1_binaryTx_float16Ty_float16___hash_f3d27a3882838ef3_op.cc" -std=c++14 -Xcompiler -fPIC -Xcompiler -march=native -Xcompiler -fdiagnostics-color=always -lstdc++ -ldl -shared -I"/home/JittorLLMs/lib/python3.9/site-packages/jittor/src" -I/usr/local/include/python3.9 -I/usr/local/include/python3.9 -DHAS_CUDA -DIS_CUDA -I"/usr/local/cuda/include" -I"/home/JittorLLMs/lib/python3.9/site-packages/jittor/extern/cuda/inc" -lcudart -L"/usr/local/cuda/lib64" -Xlinker -rpath="/usr/local/cuda/lib64" -I"/root/.cache/jittor/jt1.3.7/g++10.2.1/py3.9.12/Linux-3.10.0-1x61/IntelRXeonRCPUx09/default/cu12.1.105_sm_52" -L"/root/.cache/jittor/jt1.3.7/g++10.2.1/py3.9.12/Linux-3.10.0-1x61/IntelRXeonRCPUx09/default/cu12.1.105_sm_52" -Xlinker -rpath="/root/.cache/jittor/jt1.3.7/g++10.2.1/py3.9.12/Linux-3.10.0-1x61/IntelRXeonRCPUx09/default/cu12.1.105_sm_52" -L"/root/.cache/jittor/jt1.3.7/g++10.2.1/py3.9.12/Linux-3.10.0-1x61/IntelRXeonRCPUx09/default" -Xlinker -rpath="/root/.cache/jittor/jt1.3.7/g++10.2.1/py3.9.12/Linux-3.10.0-1x61/IntelRXeonRCPUx09/default" -l:"jit_utils_core.cpython-39-x86_64-linux-gnu".so -l:"jittor_core.cpython-39-x86_64-linux-gnu".so -x cu --cudart=shared -ccbin="/opt/rh/devtoolset-10/root/usr/bin/g++" --use_fast_math -w -I"/home/JittorLLMs/lib/python3.9/site-packages/jittor/extern/cuda/inc" -arch=compute_52 -code=sm_52 -o "/root/.cache/jittor/jt1.3.7/g++10.2.1/py3.9.12/Linux-3.10.0-1x61/IntelRXeonRCPUx09/default/cu12.1.105_sm_52/jit/opkey0_broadcast_toTx_float16__DIM_3BCAST_3opkey1_binary__Tx_float16Ty_float16_____hash_f3d27a3882838ef3_op.so"
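For what it's worth, the nvcc failure above looks like the known cuda_fp16 limitation: the __half arithmetic operators in cuda_fp16.hpp are only defined for device code with __CUDA_ARCH__ >= 530, so when the generated op is built with -arch=compute_52 -code=sm_52 the compiler falls back to __half's many conversion operators and reports the ambiguity. A minimal sketch of the same failure (my own reconstruction, not the actual Jittor-generated kernel), plus the explicit-conversion form that does compile on older architectures:

```cuda
// repro.cu -- hypothetical reduction of the error above, not Jittor's real op.
// Build with: nvcc -arch=compute_52 -code=sm_52 -c repro.cu
#include <cuda_fp16.h>

__global__ void add_half(const __half* x, const __half* y, __half* z, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // With sm_52 there is no device-side __half operator+, so this line
        // triggers "more than one conversion function from ... to a built-in
        // type applies":
        // z[i] = x[i] + y[i];

        // Converting explicitly through float compiles on any architecture
        // (the addition itself is simply done in fp32):
        z[i] = __float2half(__half2float(x[i]) + __half2float(y[i]));
    }
}
```

That matches the conclusion above: on sm_35 / sm_52 the fp16 path is simply unavailable, so the generated half-precision ops cannot be compiled as-is.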