yinengy opened this issue 3 years ago
You could try this to override the nvdisasm location:
NVDISASM=/usr/local/cuda-10.2/bin/nvdisasm LD_PRELOAD=/home/username/nvbit_release/tools/mem_printf/mem_printf.so python3 two_layer_net_tensor.py
but it should work without it.
What do you see if you type `env`?
Running with `NVDISASM=/usr/local/cuda-10.2/bin/nvdisasm` set gives:
~$ NVDISASM=/usr/local/cuda-10.2/bin/nvdisasm LD_PRELOAD=/home/username/nvbit_release/tools/mem_printf/mem_printf.so python3 two_layer_net_tensor.py
------------- NVBit (NVidia Binary Instrumentation Tool v1.4) Loaded --------------
NVBit core environment variables (mostly for nvbit-devs):
NVDISASM = /usr/local/cuda-10.2/bin/nvdisasm - override default nvdisasm found in PATH
NOBANNER = 0 - if set, does not print this banner
---------------------------------------------------------------------------------
INSTR_BEGIN = 0 - Beginning of the instruction interval where to apply instrumentation
INSTR_END = 4294967295 - End of the instruction interval where to apply instrumentation
TOOL_VERBOSE = 0 - Enable verbosity inside the tool
----------------------------------------------------------------------------------------------------
ERROR: /usr/local/cuda-10.2/bin/nvdisasm not found on PATH!!!
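One way a "not found on PATH" error can appear even though the interactive shell resolves the tool fine is when a launcher spawns a child process with a rebuilt or stripped environment. This is purely an illustration of that mechanism, not a claim about what PyTorch or NVBit actually does here:

```python
import subprocess

# A child process sees only the environment it is handed: with PATH emptied,
# even a ubiquitous tool like `env` cannot be resolved by name, regardless
# of what the parent shell's PATH contains.
result = subprocess.run(
    ["/bin/sh", "-c", "command -v env || echo MISSING"],
    env={"PATH": ""},
    capture_output=True,
    text=True,
)
print(result.stdout.strip())  # -> MISSING
```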
Typing `env` gives me:
XDG_SESSION_ID=1
TERM_PROGRAM=vscode
TERM=xterm-256color
SHELL=/bin/bash
AMD_ENTRYPOINT=vs/server/remoteExtensionHostProcess
SSH_CLIENT=73.144.154.30 58777 22
TERM_PROGRAM_VERSION=1.48.2
USER=yinengy
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
LD_LIBRARY_PATH=/home/yinengy/distro/install/lib:/usr/local/cuda-10.2/lib64:/home/yinengy/distro/install/lib:/home/yinengy/distro/install/lib:/home/yinengy/distro/install/lib:/usr/local/cuda-10.2/lib64
PATH=/home/yinengy/distro/install/bin:/usr/local/cuda-10.2/bin:/home/yinengy/.vscode-server/bin/a0479759d6e9ea56afa657e454193f72aef85bd0/bin:/home/yinengy/distro/install/bin:/home/yinengy/distro/install/bin:/home/yinengy/bin:/home/yinengy/.local/bin:/home/yinengy/distro/install/bin:/usr/local/cuda-10.2/bin:/home/yinengy/.vscode-server/bin/a0479759d6e9ea56afa657e454193f72aef85bd0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
MAIL=/var/mail/yinengy
PWD=/home/yinengy
LUA_PATH=/home/yinengy/.luarocks/share/lua/5.1/?.lua;/home/yinengy/.luarocks/share/lua/5.1/?/init.lua;/home/yinengy/distro/install/share/lua/5.1/?.lua;/home/yinengy/distro/install/share/lua/5.1/?/init.lua;./?.lua;/home/yinengy/distro/install/share/luajit-2.1.0-beta1/?.lua;/usr/local/share/lua/5.1/?.lua;/usr/local/share/lua/5.1/?/init.lua
LANG=en_US.UTF-8
LUA_CPATH=/home/yinengy/distro/install/lib/?.so;/home/yinengy/.luarocks/lib/lua/5.1/?.so;/home/yinengy/distro/install/lib/lua/5.1/?.so;/home/yinengy/distro/install/lib/?.so;./?.so;/usr/local/lib/lua/5.1/?.so;/usr/local/lib/lua/5.1/loadall.so
HOME=/home/yinengy
SHLVL=4
VSCODE_GIT_ASKPASS_MAIN=/home/yinengy/.vscode-server/bin/a0479759d6e9ea56afa657e454193f72aef85bd0/extensions/git/dist/askpass-main.js
PIPE_LOGGING=true
DYLD_LIBRARY_PATH=/home/yinengy/distro/install/lib:/home/yinengy/distro/install/lib:/home/yinengy/distro/install/lib:/home/yinengy/distro/install/lib:
LOGNAME=yinengy
VSCODE_GIT_IPC_HANDLE=/run/user/1001/vscode-git-0ff09c2746.sock
XDG_DATA_DIRS=/usr/local/share:/usr/share:/var/lib/snapd/desktop
SSH_CONNECTION=73.144.154.30 58777 10.128.0.3 22
VSCODE_IPC_HOOK_CLI=/tmp/vscode-ipc-f3bae490-b524-46f8-85ad-dd9184ba7d9d.sock
LESSOPEN=| /usr/bin/lesspipe %s
VSCODE_GIT_ASKPASS_NODE=/home/yinengy/.vscode-server/bin/a0479759d6e9ea56afa657e454193f72aef85bd0/node
GIT_ASKPASS=/home/yinengy/.vscode-server/bin/a0479759d6e9ea56afa657e454193f72aef85bd0/extensions/git/dist/askpass.sh
XDG_RUNTIME_DIR=/run/user/1001
VERBOSE_LOGGING=true
LESSCLOSE=/usr/bin/lesspipe %s %s
COLORTERM=truecolor
_=/usr/bin/env
`echo $PATH` gives:
/home/yinengy/distro/install/bin:/usr/local/cuda-10.2/bin:/home/yinengy/.vscode-server/bin/a0479759d6e9ea56afa657e454193f72aef85bd0/bin:/home/yinengy/distro/install/bin:/home/yinengy/distro/install/bin:/home/yinengy/bin:/home/yinengy/.local/bin:/home/yinengy/distro/install/bin:/usr/local/cuda-10.2/bin:/home/yinengy/.vscode-server/bin/a0479759d6e9ea56afa657e454193f72aef85bd0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
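As a sanity check, the same lookup a shell performs over this PATH can be replicated in Python. The helper names `find_on_path` and `all_candidates` are hypothetical; `shutil.which` does the standard PATH search, and listing every matching entry can reveal whether the duplicated CUDA directories above shadow one another:

```python
import os
import shutil

def find_on_path(name):
    """First executable called `name` on $PATH, or None (same search a shell does)."""
    return shutil.which(name)

def all_candidates(name):
    """Every PATH directory providing an executable `name` (spots duplicate dirs)."""
    hits = []
    for d in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(d, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            hits.append(candidate)
    return hits

print(find_on_path("nvdisasm"))    # full path if resolvable, else None
print(all_candidates("nvdisasm"))  # shows if several PATH entries provide it
```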
BTW, do you have the `which` program on your system? Maybe the error happens because it is missing.
I have the `which` program in my system. I used it in my first comment to show the path of `nvdisasm`.
Have you tried running a non-Python application, for instance the simple test-app that comes with NVBit? This is the first time we have seen this error; I suspect something very peculiar about your machine configuration.
I have used NVBit on normal .cu programs for several months without any problems. It works fine with native CUDA benchmarks like Rodinia 3.1, but fails for Caffe, Torch 7, and PyTorch. For Caffe, a core dump happens when I try to instrument it. For Torch 7, instrumentation never works: no error is reported and nothing happens; the instrumentation doesn't take effect and only the banner is printed.
The environment is a server on Google Cloud Platform with Tesla P4.
Can you try to use `CUDA_INJECTION64_PATH` instead of `LD_PRELOAD`?
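For reference, the invocation would look like this (a sketch reusing the tool path from the earlier commands; the commented-out last line is the actual workload run):

```shell
# Inject the NVBit tool via the CUDA driver's injection variable instead of
# the dynamic linker, then confirm the variable is visible to child processes.
export CUDA_INJECTION64_PATH=/home/username/nvbit_release/tools/mem_printf/mem_printf.so
env | grep CUDA_INJECTION64_PATH
# python3 two_layer_net_tensor.py   # launch the workload with the variable set
```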
Also, can you please send exact instructions on how to reproduce the core dump situation?
As usual, no promises we will be able to fix it, but we will try to take a look when we find some time.
Thanks!
Thank you for your help. Replacing `LD_PRELOAD` with `CUDA_INJECTION64_PATH` gives the same error message. Nothing changed.
For Caffe, compile and install it, then `cd` to the root directory of Caffe.
~/caffe$ LD_PRELOAD=/home/yinengy/nvbit_release/tools/mem_printf/mem_printf.so .build_release/tools/caffe
------------- NVBit (NVidia Binary Instrumentation Tool v1.4) Loaded --------------
NVBit core environment variables (mostly for nvbit-devs):
NVDISASM = nvdisasm - override default nvdisasm found in PATH
NOBANNER = 0 - if set, does not print this banner
---------------------------------------------------------------------------------
INSTR_BEGIN = 0 - Beginning of the instruction interval where to apply instrumentation
INSTR_END = 4294967295 - End of the instruction interval where to apply instrumentation
TOOL_VERBOSE = 0 - Enable verbosity inside the tool
----------------------------------------------------------------------------------------------------
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0901 18:22:27.429435 3651 parallel.cpp:48] P2PManager::Init @ <my server name>
I0901 18:22:27.559656 3651 gpu_memory.cpp:82] GPUMemory::Manager initialized
I0901 18:22:27.560271 3651 gpu_memory.cpp:84] Total memory: 7981694976, Free: 7852195840, dev_info[0]: total=7981694976 free=7852195840
I0901 18:22:28.731178 3651 caffe.cpp:702] This is NVCaffe 0.17.3 started at Tue Sep 1 18:22:27 2020
I0901 18:22:28.731402 3651 caffe.cpp:704] CuDNN version: USE_CUDNN is not defined
I0901 18:22:28.731448 3651 caffe.cpp:705] CuBLAS version: 10202
I0901 18:22:28.731489 3651 caffe.cpp:706] CUDA version: 10020
I0901 18:22:28.731526 3651 caffe.cpp:707] CUDA driver version: 10020
I0901 18:22:28.731570 3651 caffe.cpp:708] Arguments:
[0]: .build_release/tools/caffe
caffe: command line brew
usage: caffe <command> <args>
commands:
train train or finetune a model
test score a model
device_query show GPU diagnostic information
time benchmark model execution time
Flags from tools/caffe.cpp:
-ap_version (Average Precision type for object detection) type: string
default: "11point"
-gpu (Optional; run in GPU mode on given device IDs separated by ', '.Use
'-gpu all' to run on all available GPUs. The effective training batch
size is multiplied by the number of devices.) type: string default: ""
-iterations (The number of iterations to run.) type: int32 default: 50
-level (Optional; network level.) type: int32 default: 0
-model (The model definition protocol buffer text file.) type: string
default: ""
-phase (Optional; network phase (TRAIN or TEST). Only used for 'time'.)
type: string default: ""
-show_per_class_result (Show per class result for object detection)
type: bool default: true
-sighup_effect (Optional; action to take when a SIGHUP signal is received:
snapshot, stop or none.) type: string default: "snapshot"
-sigint_effect (Optional; action to take when a SIGINT signal is received:
snapshot, stop or none.) type: string default: "stop"
-snapshot (Optional; the snapshot solver state to resume training.)
type: string default: ""
-solver (The solver definition protocol buffer text file.) type: string
default: ""
-stage (Optional; network stages (not to be confused with phase), separated
by ','.) type: string default: ""
-weights (Optional; the pretrained weights to initialize finetuning,
separated by ', '. Cannot be set simultaneously with snapshot.)
type: string default: ""
*** Aborted at 1598984548 (unix time) try "date -d @1598984548" if you are using GNU date ***
PC: @ 0x7f63d037e259 Nvbit::module_unloading()
*** SIGSEGV (@0x0) received by PID 3651 (TID 0x7f63d080e880) from PID 0; stack trace: ***
@ 0x7f63ccff94c0 (unknown)
@ 0x7f63d037e259 Nvbit::module_unloading()
@ 0x7f63d03863a8 nvbitToolsCallbackFunc()
@ 0x7f63cc16ab23 (unknown)
@ 0x7f63cbfa8745 (unknown)
@ 0x7f63cbeb39fa (unknown)
@ 0x7f63cc03d77a cuModuleUnload
@ 0x7f63ce327e4f (unknown)
@ 0x7f63ce329190 (unknown)
@ 0x7f63ce32e077 (unknown)
@ 0x7f63ce32e492 (unknown)
@ 0x7f63ce31fb0c (unknown)
@ 0x7f63ce320355 (unknown)
@ 0x7f63ccffe37a __cxa_finalize
@ 0x7f63ce3077e6 (unknown)
@ 0x7f63ce35d611 (unknown)
@ 0x7f63ccffe008 (unknown)
@ 0x7f63d088f558 (unknown)
Segmentation fault (core dumped)
This error is not triggered when running without NVBit. Note: an empty NVBit tool (one that does nothing at all) also triggers this fault.
And I am wondering how to instrument Torch 7. Just `LD_PRELOAD=xxxxx th <file to execute>`? It doesn't seem to work for me. Thanks.
For more context: NVBit works on cuDNN perfectly. I get the error message when I try to instrument a PyTorch program with mem_printf.so, even though nvdisasm is actually on PATH. Using CUDA 9.2 with NVBit 1.1 also leads to a similar error message.
Thanks!