Open zjjott opened 6 months ago
Running ptxas manually seems to succeed.
Could you turn on logging (with TF_CPP_VMODULE=asm_compiler=5 TF_CPP_MIN_LOG_LEVEL=0) to find out the exact invocation? Your manual invocation is probably missing the optimization flag (-O, iirc?).
If ptxas is hanging, it should hang for the manual invocation as well.
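For anyone reproducing this, here is a minimal Python sketch of the logging setup suggested above. The two environment variables are the ones quoted in the comment; the helper names `asm_compiler_logging_env` and `asm_compiler_lines` are made up for illustration, and filtering on `asm_compiler.cc` assumes the log lines carry that file name, as the one quoted later in this thread does.

```python
import os


def asm_compiler_logging_env(base=None):
    """Return a copy of the environment with asm_compiler logging enabled."""
    env = dict(os.environ if base is None else base)
    env["TF_CPP_VMODULE"] = "asm_compiler=5"
    env["TF_CPP_MIN_LOG_LEVEL"] = "0"
    return env


def asm_compiler_lines(log_text):
    """Keep only the log lines emitted from asm_compiler.cc."""
    return [line for line in log_text.splitlines() if "asm_compiler.cc" in line]


# Hypothetical usage: pass the environment to the training process and
# filter its captured stderr, e.g.
#   subprocess.run([sys.executable, "train.py"], env=asm_compiler_logging_env(),
#                  stderr=subprocess.PIPE, text=True)
# where "train.py" stands in for your own script.
```

The filtered lines should contain the ptxas path and, at this verbosity, ideally the arguments it was launched with, which is what you want to copy into the manual run.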
Yes, I turned on logging (with TF_CPP_VMODULE=asm_compiler=5 TF_CPP_MIN_LOG_LEVEL=0). ptxas is executed on a file under some /tmp path, but the temp file content looks fine @cheshire
I'm running Llama-2-1.7b-hf + fsdp + xla, but the process list shows:
523777 517263 0 80 0 - 0 - 10:22 ? 00:00:00 [ptxas] <defunct>
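A `<defunct>` entry is a zombie: the child has already exited and is waiting for its parent to reap it. As a rough sketch (the function names are hypothetical, and the column layout assumes Linux `ps -eo pid,ppid,stat,comm`), such entries and their parent PIDs can be listed like this:

```python
import subprocess


def find_zombies(ps_text, name="ptxas"):
    """Parse `ps -eo pid,ppid,stat,comm` output; return (pid, ppid) of zombies."""
    zombies = []
    for line in ps_text.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue
        pid, ppid, stat, comm = fields
        # A process state starting with "Z" marks a zombie (defunct) process.
        if stat.startswith("Z") and name in comm:
            zombies.append((int(pid), int(ppid)))
    return zombies


def list_zombies(name="ptxas"):
    """Run ps and return zombie processes whose command contains `name`."""
    out = subprocess.run(
        ["ps", "-eo", "pid,ppid,stat,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return find_zombies(out, name)
```

The parent PID returned here is the process worth attaching a debugger to, since a zombie child by itself is no longer running anything.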
I used gdb to debug process 517263, which shows this backtrace:
Using script:
Running script:
Commit versions (I upgraded from the January version to the latest, but the issue persists):
pytorch: 7cd7a7aa8e0942da627095b23b94dc89f5a54943
torchxla: 58a412c
openxla: 1acf05e
asm_compiler.cc:234] Using /usr/local/cuda/bin/ptxas with version 11.8.89
cuda: 11.8
Driver Version: 470.82.01
device: A100
ptx content:
Running ptxas manually seems to succeed.
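To make the manual check repeatable, one option is to run ptxas under a timeout so that a hang is reported instead of blocking the shell. This is only a sketch: `build_ptxas_cmd` and `run_with_timeout` are hypothetical helpers, and the flags (`-arch=sm_80` for an A100, `-O3`) are assumptions. The actual arguments should be copied from the asm_compiler log rather than from here.

```python
import subprocess


def build_ptxas_cmd(ptx_path, out_path, arch="sm_80", opt_level=3,
                    ptxas="/usr/local/cuda/bin/ptxas"):
    """Assemble a ptxas command line; the flag values are placeholders."""
    return [ptxas, f"-arch={arch}", f"-O{opt_level}", ptx_path, "-o", out_path]


def run_with_timeout(cmd, timeout_s):
    """Return the exit code, or the string "timeout" if cmd did not finish."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True,
                              timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing lingers.
        return "timeout"
    return proc.returncode


# Hypothetical usage with a PTX file dumped from the temp path in the log:
#   run_with_timeout(build_ptxas_cmd("kernel.ptx", "kernel.cubin"), 600)
```

If this returns 0 with the same flags XLA uses, the compile itself is fine and the hang is more likely in how the parent process waits on ptxas.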