Closed mahmoodn closed 4 years ago
There is no need to change the arch number to 70 in tools/*/Makefile, but if you want to do that, then change the lines in opcode_hist.cu that use __ballot so that they use __ballot_sync instead.
But again, there is no need to change the arch number to 70 in tools/*/Makefile to have the tool working on Volta, since the tools are compiled with embedded PTX and the driver will JIT-compile it for Volta on the fly.
Excuse me, these
/* all the active threads will compute the active mask */
const int active_mask = __ballot_sync(1);
/* compute the predicate mask */
const int predicate_mask = __ballot_sync(predicate);
have the following errors:
opcode_hist.cu(86): error: too few arguments in function call
opcode_hist.cu(88): error: too few arguments in function call
I accept that changing the sm number is not necessary. Maybe this should be considered for future releases. Thanks for the effort. It is a great tool.
Future releases of NVBit will automatically use the correct __ballot or __ballot_sync depending on the version of nvcc used (the reason for this "issue" is that __ballot is going to be deprecated soon in favor of __ballot_sync).
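For anyone who wants something similar locally before that release, a version switch along these lines might work. This is only a sketch and not NVBit's actual code: the macro name WARP_BALLOT is made up for illustration, and it assumes that CUDART_VERSION >= 9000 (CUDA 9.0, where the *_sync warp intrinsics were introduced) is the right cutoff for your toolkit.

```cuda
#include <cuda_runtime.h>  // provides CUDART_VERSION

// Hypothetical helper (not part of NVBit): select the warp-vote intrinsic
// that matches the CUDA toolkit used to compile the tool.
#if defined(CUDART_VERSION) && CUDART_VERSION >= 9000
// CUDA 9.0 and later: the first argument is the mask of participating lanes.
#define WARP_BALLOT(pred) __ballot_sync(__activemask(), (pred))
#else
// Older toolkits only provide the legacy single-argument intrinsic.
#define WARP_BALLOT(pred) __ballot(pred)
#endif
```

With a macro like this, the two lines in opcode_hist.cu could call WARP_BALLOT(1) and WARP_BALLOT(predicate) regardless of the toolkit version.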
In the meantime, you can use
__ballot_sync(__activemask(), 1)
and
__ballot_sync(__activemask(), predicate)
respectively.
The function __ballot_sync takes 2 arguments, see a nice explanation of them at https://devblogs.nvidia.com/using-cuda-warp-level-primitives/
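To make the two-argument form concrete, here is a small standalone program that uses the same two calls. It is a minimal sketch for experimenting, not the code from opcode_hist.cu: the kernel name vote_demo, the even-lane predicate, and the output buffer are all made up for illustration, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative kernel (hypothetical, not from NVBit): each lane votes,
// and lane 0 stores the resulting masks.
__global__ void vote_demo(unsigned int *out) {
    // Example predicate: true on even lanes only.
    const int predicate = (threadIdx.x % 2 == 0);

    // First argument: mask of lanes taking part in the vote.
    // Second argument: the value each lane contributes.
    const unsigned int active_mask = __ballot_sync(__activemask(), 1);
    const unsigned int predicate_mask = __ballot_sync(__activemask(), predicate);

    if (threadIdx.x == 0) {
        out[0] = active_mask;
        out[1] = predicate_mask;
    }
}

int main() {
    unsigned int *d_out = nullptr;
    unsigned int h_out[2] = {0, 0};

    cudaMalloc(&d_out, sizeof(h_out));
    vote_demo<<<1, 32>>>(d_out);  // launch a single warp
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);

    printf("active mask:    0x%08x\n", h_out[0]);
    printf("predicate mask: 0x%08x\n", h_out[1]);

    cudaFree(d_out);
    return 0;
}
```

Note that __ballot_sync returns an unsigned int, so the sketch stores the masks as unsigned rather than int. It should build with something like nvcc -arch=sm_70 on a CUDA 9.0 or newer toolkit.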
I have changed the arch number to 70 in tools/*/Makefile and here is the error that I see