When I use the tracer from the dev branch to generate traces for polybench-atax, the benchmark reports non-matching CPU-GPU outputs. When I run it directly on the same GPU, it reports none.
Log from the tracing run:
------------- NVBit (NVidia Binary Instrumentation Tool v1.5.5) Loaded --------------
NVBit core environment variables (mostly for nvbit-devs):
NVDISASM = nvdisasm - override default nvdisasm found in PATH
NOBANNER = 0 - if set, does not print this banner
---------------------------------------------------------------------------------
INSTR_BEGIN = 0 - Beginning of the instruction interval where to apply instrumentation
INSTR_END = 4294967295 - End of the instruction interval where to apply instrumentation
EXCLUDE_PRED_OFF = 1 - Exclude predicated off instruction from count
TRACE_LINEINFO = 0 - Include source code line info at the start of each traced line. The target binary must be compiled with -lineinfo or --generate-line-info
DYNAMIC_KERNEL_LIMIT_END = 0 - Limit of the number kernel to be printed, 0 means no limit
DYNAMIC_KERNEL_LIMIT_START = 0 - start to report kernel from this kernel id, 0 means starts from the beginning, i.e. first kernel
ACTIVE_FROM_START = 1 - Start instruction tracing from start or wait for cuProfilerStart and cuProfilerStop. If set to 0, DYNAMIC_KERNEL_LIMIT options have no effect
TOOL_VERBOSE = 0 - Enable verbosity inside the tool
TOOL_COMPRESS = 1 - Enable traces compression
TOOL_TRACE_CORE = 0 - write the core id in the traces
TERMINATE_UPON_LIMIT = 0 - Stop the process once the current kernel > DYNAMIC_KERNEL_LIMIT_END
USER_DEFINED_FOLDERS = 0 - Uses the user defined folder TRACES_FOLDER path environment
----------------------------------------------------------------------------------------------------
setting device 0 with name NVIDIA RTX A6000
Writing results to /gpu_perf_model/accel-sim-framework/hw_run/traces/device-0/11.2/polybench-atax/NO_ARGS/traces//kernel-1.trace
Writing results to /gpu_perf_model/accel-sim-framework/hw_run/traces/device-0/11.2/polybench-atax/NO_ARGS/traces//kernel-2.trace
GPU Runtime: 5.762122s
CPU Runtime: 0.092334s
Non-Matching CPU-GPU Outputs Beyond Error Threshold of 0.50 Percent: 4095
Processing file /gpu_perf_model/accel-sim-framework/hw_run/traces/device-0/11.2/polybench-atax/NO_ARGS/traces/kernel-1.trace
Processing file /gpu_perf_model/accel-sim-framework/hw_run/traces/device-0/11.2/polybench-atax/NO_ARGS/traces/kernel-2.trace
Log from the direct run:
root@d0e87f6eed3d:/# /gpu_perf_model/accel-sim-framework/gpu-app-collection/src/..//bin/11.2/release/polybench-atax
setting device 0 with name NVIDIA RTX A6000
GPU Runtime: 0.002029s
CPU Runtime: 0.026450s
Non-Matching CPU-GPU Outputs Beyond Error Threshold of 0.50 Percent: 0
A similar mismatch also occurs for gesummv.
Has anyone else run into this problem?
(Accel-Sim: dev branch, NVBit tracer: v1.5.5, CUDA: 11.2, GPU: RTX A6000)
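For context on the "Non-Matching CPU-GPU Outputs" line, here is a minimal sketch of how I understand the check behind it. This is an assumption on my part and not code copied from the benchmark: the names `percent_diff` and `count_mismatches`, the near-zero cutoff of 0.01, and the `SMALL_FLOAT_VAL` guard are hypothetical. Only the 0.5 percent threshold is taken from the log.

```python
# Sketch of a per-element CPU vs. GPU comparison (assumed, not the benchmark's code).
SMALL_FLOAT_VAL = 1e-8   # hypothetical guard against division by zero
THRESHOLD_PERCENT = 0.5  # from the log: "Error Threshold of 0.50 Percent"


def percent_diff(ref, val):
    """Relative difference in percent; two values near zero count as equal."""
    if abs(ref) < 0.01 and abs(val) < 0.01:
        return 0.0
    return 100.0 * abs(ref - val) / abs(ref + SMALL_FLOAT_VAL)


def count_mismatches(cpu_out, gpu_out, threshold=THRESHOLD_PERCENT):
    """Count elements whose relative difference exceeds the threshold."""
    return sum(1 for c, g in zip(cpu_out, gpu_out)
               if percent_diff(c, g) > threshold)


if __name__ == "__main__":
    cpu = [1.0, 2.0, 0.001]
    gpu = [1.001, 2.5, 0.002]
    print(count_mismatches(cpu, gpu))
```

If the real check works like this, a count of 4095 would mean that many output elements differ from the CPU reference by more than 0.5 percent in the traced run, while the direct run matches the reference everywhere.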