cms-gem-daq-project / ctp7_modules

0 stars 13 forks source link

[cleanup] Common blocks related to supOH #162

Open jsturdy opened 4 years ago

jsturdy commented 4 years ago

Brief summary of issue

As part of the porting (#160), I've come across several regularly repeated blocks of code I'm trying to clean up. The first is some check against the FW supported number of OptoHybrid links, e.g.,

void function(uint32_t ohMask)
  uint32_t supOH = readReg(&la, "GEM_AMC.GEM_SYSTEM.CONFIG.NUM_OF_OH");
  if (request->get_key_exists("NOH")) {
    uint32_t NOH_requested = request->get_word("NOH");
    if (NOH_requested <= supOH)
      supOH = NOH_requested;
    else
      LOG4CPLUS_WARN(logger, stdsprintf("NOH requested (%i) > NUM_OF_OH AMC register value (%i), NOH request will be disregarded",NOH_requested,supOH));
  }
  for (size_t ohN = 0; ohN < supOH; ++ohN) {
    if (!((ohMask >> ohN) & 0x1)) {
      continue;
    }

    // here be the meat
  }
}

I tried initially to just simplify this with the following

  uint32_t supOH = readReg(&la, "GEM_AMC.GEM_SYSTEM.CONFIG.NUM_OF_OH");
  uint32_t reqOH = getNumNonzeroBits(ohMask);
  if (reqOH <= supOH)
    supOH = reqOH;
  else
    LOG4CPLUS_WARN(logger, "Requested OptoHybrids ("<< reqOH << ") > NUM_OF_OH AMC register value ("<< supOH << "), request will be reset to register max");
  }
...

I believe this captures the intention of the original code, however it can fail in the actual implementation, e.g., if the ohMask is sparse, but still toggles on a link that is not supported, or, worse, if the ohMask is sparse, not toggling any unsupported OHs, but as a byproduct, limits the range of the loop. When I realized this, I started trying to think of a better solution. This is safe:

  uint32_t supOH = readReg(&la, "GEM_AMC.GEM_SYSTEM.CONFIG.NUM_OF_OH");
  if (ohMask > (0x1<<supOH))
    LOG4CPLUS_WARN(logger, "Requested OptoHybrids (0x"<< std::hex << std::setfill('0') << std::setw(8) << ohMask << std::dec << ") > NUM_OF_OH AMC register value ("<< supOH << "), request will be reset to register max");
  }

Though what we really want to check is whether any bit in the mask is asking for an OptoHybrid that is outside the valid range, and loop only up to those OptoHybrids which are requested/supported.

  uint32_t supOH = readReg(&la, "GEM_AMC.GEM_SYSTEM.CONFIG.NUM_OF_OH");
  uint32_t maxOH = getMaxNonzeroBits(ohMask);
  if (maxOH <= supOH))
    supOH = maxOH;
  else
    LOG4CPLUS_WARN(logger, "Requested OptoHybrids ("<< maxOH << ") > NUM_OF_OH AMC register value ("<< supOH << "), request will be reset to register max");
  }

In the end, I'm not even really sure how this NOH key was intended to be used, so I'd like some feedback before changing things completely.

Additionally, we've had several discussions in the past about taking values from the FW itself (@lpetre-ulb) so this might be the right issue to address our ideal path forward on that.

Types of issue

mexanick commented 4 years ago

Since we are going on safe automatic determination of the hardware masks (e.g. vfat mask, oh mask), I suggest to run binary xor between the provided ohMask and automatically determined one. In this case we are on the safe side w.r.t. number of OH, but also w.r.t. some link with "good" number being not ready.

mexanick commented 4 years ago

correction: should be no xor, but simply and, if a==(a and b) where a is provided mask and b is determined one

jsturdy commented 4 years ago

Do we have an automatic determination of the connected OHs currently (or in the pipeline)? All I'm aware of are the HW constants, and the FW register which states how many are supported. This ties in with #163, since we need to clearly understand when we are fine with taking this proposed action, and when we want to fail because they don't match

AndrewLevin commented 4 years ago

The motivation is described in this pull request and issue:

https://github.com/cms-gem-daq-project/ctp7_modules/pull/51 https://github.com/cms-gem-daq-project/ctp7_modules/issues/49

To me it seems that the optohybrid mask contains or can contain the information in the NOH key, so I don't see any reason to keep the NOH key.

jsturdy commented 4 years ago

OK, so then at least for this case where the NOH key was previously used, I think the solution:

  uint32_t supOH = readReg(&la, "GEM_AMC.GEM_SYSTEM.CONFIG.NUM_OF_OH");
  if (ohMask > (0x1<<supOH))
    LOG4CPLUS_WARN(logger, "Requested OptoHybrids (0x"<< std::hex << std::setfill('0') << std::setw(8) << ohMask << std::dec << ") > NUM_OF_OH AMC register value ("<< supOH << "), request will be reset to register max");
  }

is the appropriate one and will implement it. This will

lpetre-ulb commented 4 years ago

Indeed, this solution is appropriate. The only caveat is that the calling code must not assume it will receive the number of OHs it has asked for and/or that the RPC method will never be called with a ohMask requesting links not supported by the firmware.

This issue also raises the more general question of HW constants determination and the question of parameter range checks.

The HW constants can be split into two main groups:

  1. The GEM station specific constants, such as the number of VFATs. Those constants are stored in the hw_constants.h header file. I see nothing wrong keeping them there, except if parameters are going to be different between, for example, OH releases. I'm specifically thinking about the VFAT to GBT/e-link mapping. I don't know where this data could be stored though.
  2. The backend specific constants such as the number of OptoHybrids. The number of OHs is currently both specified in hw_constants.h and read from the hardware. And the values may or may not match. :/ The only right solution is to always read the values from the HW. But should it be done once at module initialization or on-the-fly when the value is needed? I tend to prefer the former option, but it might not be trivial to implement within the RPC modules framework.

Also, what about the parameter range checks? As discussed here, the RPC modules could not check any parameter and assume that they are all in range. If all the call go though the Hw<Device> classes, which performs the checks, that's okay. If not, I'm convinced they should be check in the RPC methods.

Slightly outside the scope of the ctp7_modules, there are other configuration values that cannot be retrieved from the HW itself are the GEM type (i.e., long/short and m[1-8]) and the chamber/module name. [dream mode on] Actually, the former could be deduced from hardware for GE2/1 with a small modification. Indeed, the master/slave OH location can be set with a micro-switch on the OH itself. The same solution could be implemented for the GEM type (3 bits). Better, an array of resistors could be installed on the GEB itself and read through the SCA GPIO. [/dream mode on]