microsoft / msccl

Microsoft Collective Communication Library

A Doubt about the Source Code of MSCCL #61

Open ltrcc opened 5 months ago

ltrcc commented 5 months ago

I am currently studying the MSCCL source code and have a question about the count field in the structure below. What does it mean?

// TODO: compress this by a lot!
struct mscclTransfer {
  int16_t srcoffset;
  int16_t dstoffset;
  uint8_t srcbuffer; // follow MSCCL_THIS_INPUT/MSCCL_THIS_OUTPUT macros
  uint8_t dstbuffer; // follow MSCCL_THIS_INPUT/MSCCL_THIS_OUTPUT macros
  int16_t depencePointer; // index to the first dependence
  int16_t numDependences; // depencePointer+numDependences indicate the last dependence
  int8_t has_dependence;
  int16_t numReductions; // number of reductions with the same dst
  int16_t reductionPointer; // where the reduction starts
  uint8_t type;
  //--------------//
  uint8_t count; //
  //--------------//
};
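
To make the question concrete, here is a minimal sketch that copies the struct above and exposes its size, the byte offset of count, and the largest value count can hold. It does not say what count means. It only shows that, being a uint8_t, the field is capped at 255 whatever it represents. The helper names transfer_size, count_offset and max_count are hypothetical and not part of MSCCL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy of the struct quoted above, reproduced so the sketch compiles alone. */
struct mscclTransfer {
  int16_t srcoffset;
  int16_t dstoffset;
  uint8_t srcbuffer;
  uint8_t dstbuffer;
  int16_t depencePointer;
  int16_t numDependences;
  int8_t has_dependence;
  int16_t numReductions;
  int16_t reductionPointer;
  uint8_t type;
  uint8_t count;
};

/* Hypothetical helper: total size of one transfer, padding included. */
static inline size_t transfer_size(void) {
  return sizeof(struct mscclTransfer);
}

/* Hypothetical helper: byte offset of the count field inside the struct. */
static inline size_t count_offset(void) {
  return offsetof(struct mscclTransfer, count);
}

/* Hypothetical helper: largest value a uint8_t count can represent. */
static inline unsigned max_count(void) {
  return UINT8_MAX;
}
```

The exact size depends on the compiler's padding, so the sketch only relies on it being at least the sum of the field sizes (17 bytes).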

ltrcc commented 5 months ago

#define MSCCL_MAX_NUM_ALGOS 4

if (mscclInfo->numberOfMSCCLAlgorithms == MSCCL_MAX_NUM_ALGOS){
  WARN("MSCCL: too many algorithms (%d) specified in environment variable MSCCL_XML_FILES. The rest will be ignored.", mscclInfo->numberOfMSCCLAlgorithms);
  break;
}

Furthermore, why is the number of algorithms limited to MSCCL_MAX_NUM_ALGOS (4)?
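
One possible reading, stated here as an assumption and not confirmed by the MSCCL source, is that the loaded algorithms are stored in a fixed-size array sized by MSCCL_MAX_NUM_ALGOS, so the loader must stop once the array is full. A minimal sketch of that pattern follows. The algo_table type and load_algos function are hypothetical stand-ins, and fprintf replaces the WARN macro.

```c
#include <assert.h>
#include <stdio.h>

#define MSCCL_MAX_NUM_ALGOS 4

/* Hypothetical stand-in for the structure that holds loaded algorithms. */
struct algo_table {
  const char *files[MSCCL_MAX_NUM_ALGOS]; /* fixed capacity (assumption) */
  int numberOfMSCCLAlgorithms;            /* slots currently in use */
};

/* Hypothetical loader: records up to MSCCL_MAX_NUM_ALGOS paths and
 * ignores the rest, mirroring the check quoted above.
 * Returns the number of entries actually stored. */
static inline int load_algos(struct algo_table *t,
                             const char *const *paths, int npaths) {
  t->numberOfMSCCLAlgorithms = 0;
  for (int i = 0; i < npaths; i++) {
    if (t->numberOfMSCCLAlgorithms == MSCCL_MAX_NUM_ALGOS) {
      fprintf(stderr,
              "too many algorithms (%d) specified; the rest will be ignored\n",
              t->numberOfMSCCLAlgorithms);
      break;
    }
    t->files[t->numberOfMSCCLAlgorithms++] = paths[i];
  }
  return t->numberOfMSCCLAlgorithms;
}
```

With six paths passed in, this sketch stores the first four and drops the remaining two, which matches the warning text in the snippet above.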