Closed martin-belanger closed 3 weeks ago
The recommended way of writing tuner plugins is to duplicate nccl_tuner.h and other nccl_common.h definitions from the NCCL source tree to nccl/tuner.h and nccl/common.h in your tuner plugin source tree. Please take a look at the tuner plugin example: https://github.com/NVIDIA/nccl/tree/master/ext-tuner/example. The process for writing external plugins is described here (Headers Management section) for the network plugin, but also applies to all other plugins, tuner included.
Hi @gcongiu - Thanks for the pointers. I was already familiar with the example code, but I had not noticed the "Headers Management" section in the documentation.
I'm just wondering, however, whether by duplicating definitions we risk having them change in upstream NCCL without noticing in an out-of-tree plugin project. We almost need the equivalent of a "nccl-dev" package so that one can install development headers w/o having to duplicate stuff here and there. Just a thought... :wink:
Hi @martin-belanger, NCCL plugin APIs are all versioned and backward compatible. Even in the eventuality NCCL internal versions get bumped up, your older tuner plugin remains compatible and functional.
This is guaranteed by a compatibility layer in the tuner code: https://github.com/NVIDIA/nccl/blob/master/src/misc/tuner.cc. Look for functions named ncclTuner_vX_as_vY_* (where X < Y). Tuner API version 3 is backward compatible with version 2. Tuner API version 1 has been deprecated though and is no longer supported. Thus, if your tuner plugin was using that version you should re-implement it to support at least version 2.
In general, you are encouraged to write multiple versions of your plugins: supporting different versions of the API guarantees that your plugin can work with different (older) NCCL versions. Hope this helps.
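To illustrate the idea behind a vX-as-vY compatibility layer, here is a minimal sketch of the general adapter pattern. Every name in it (demo_tuner_v2_t, demo_tuner_v3_t, demo_wrap_v2_as_v3, the pick callback and its parameters) is hypothetical and deliberately simplified. It does not reproduce the real NCCL tuner structs or the actual functions in tuner.cc; it only shows how a host could present an older plugin through a newer interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified interfaces. These are NOT the real NCCL structs. */
typedef struct {
  const char *name;
  /* "v2": choose an algorithm id for a given message size. */
  int (*pick)(size_t nbytes, int *algo);
} demo_tuner_v2_t;

typedef struct {
  const char *name;
  /* "v3": same as v2, plus a channel-count output. */
  int (*pick)(size_t nbytes, int *algo, int *nchannels);
} demo_tuner_v3_t;

/* The v2 plugin currently wrapped by the adapter. */
static const demo_tuner_v2_t *demo_loaded_v2 = NULL;

/* Adapter: exposes the v3 signature and forwards to the v2 plugin. */
static int demo_v2_as_v3_pick(size_t nbytes, int *algo, int *nchannels) {
  assert(demo_loaded_v2 != NULL);
  *nchannels = 0; /* 0 means "no preference"; v2 cannot express this value */
  return demo_loaded_v2->pick(nbytes, algo);
}

static demo_tuner_v3_t demo_adapter;

/* Host side: wrap an older plugin so callers only ever see "v3". */
const demo_tuner_v3_t *demo_wrap_v2_as_v3(const demo_tuner_v2_t *v2) {
  demo_loaded_v2 = v2;
  demo_adapter.name = v2->name;
  demo_adapter.pick = demo_v2_as_v3_pick;
  return &demo_adapter;
}

/* An "old" plugin written only against the v2 interface. */
static int old_plugin_pick(size_t nbytes, int *algo) {
  *algo = (nbytes < 4096) ? 0 : 1;
  return 0;
}

const demo_tuner_v2_t old_plugin = { "old-plugin", old_plugin_pick };
```

In this sketch the host calls demo_wrap_v2_as_v3(&old_plugin) once after loading and then uses only the returned v3 table, so the older plugin keeps working unchanged when the host-side interface grows.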
> NCCL plugin APIs are all versioned and backward compatible. Even in the eventuality NCCL internal versions get bumped up, your older tuner plugin remains compatible and functional.
Awesome! I am therefore closing this pull-request.
To build an out-of-tree Tuner plugin, we need to have access to definitions from nccl_common.h and nccl_tuner.h. These header files will be installed in /usr/local/include/ by default, which is one of GCC's default search paths for header files.
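As a rough sketch of how a plugin source file could cope with either layout, the snippet below uses the preprocessor's __has_include (available in GCC and Clang, standardized in C23) to prefer a system-installed nccl_tuner.h and otherwise fall back to a vendored nccl/tuner.h copy. It assumes the installed header would be reachable as <nccl_tuner.h>, which is only the case if headers are installed as this pull request proposed; DEMO_TUNER_HEADER and demo_tuner_header_source are hypothetical names used purely for illustration.

```c
/* Assumption: an installed header, if present, is found as <nccl_tuner.h>
 * on the default include path; otherwise a vendored copy may exist at
 * "nccl/tuner.h" inside the plugin source tree. */
#if defined(__has_include)
#  if __has_include(<nccl_tuner.h>)
#    include <nccl_tuner.h>
#    define DEMO_TUNER_HEADER "installed"
#  elif __has_include("nccl/tuner.h")
#    include "nccl/tuner.h"
#    define DEMO_TUNER_HEADER "vendored"
#  endif
#endif

/* Neither header was found, or the compiler lacks __has_include. */
#ifndef DEMO_TUNER_HEADER
#  define DEMO_TUNER_HEADER "none"
#endif

/* Hypothetical helper: reports which header variant was compiled in. */
const char *demo_tuner_header_source(void) {
  return DEMO_TUNER_HEADER;
}
```

The nested #if keeps the file compilable on toolchains that do not define __has_include, since the inner test is never evaluated there.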