codes-org / codes

The Co-Design of Exascale Storage Architectures (CODES) simulation framework builds upon the ROSS parallel discrete event simulation engine to provide high-performance simulation utilities and models for building scalable distributed systems simulations.

Credit Delay Configuration #190

Closed nmcglo closed 4 years ago

nmcglo commented 4 years ago

Credit Delay is a single configuration parameter that helps determine the time at which a credit arrives at the receiving router/terminal. In basically all models, the same value is used for every credit, regardless of the type of link it is sent on.

@yao-kang brought this to attention as it isn't really the most accurate way to handle this delay.

The challenge is that I want to avoid unnecessarily fragmenting the configuration language used to configure the various CODES models. Different models will of course have different configuration needs, but we want to minimize the differences between them (especially subtle, similar differences).

Right now, all models in CODES that have a credit-based flow control system use credit_delay to configure the time a credit takes to be delivered once it is sent.

If we were to change this, all dragonfly-like models with different CN, Local, and Global link bandwidths would have a different credit delay for each link type: cn_credit_delay, local_credit_delay, global_credit_delay.

Changing this will, unless done cleverly, break old experiments that expect to configure a singular credit_delay parameter. As such, this change, if implemented, will have to be done carefully - and done in a way that isn't too unique to any one model.

Due to how much it might change how users configure CODES, it will justify a new release.

nmcglo commented 4 years ago

The credit size for dragonfly models will now be configurable from the configuration file and will no longer be a compile-time #define value. If no credit_size value is specified in the configuration file, a default value of 8 is used.

You will now also be able to configure the credit delay as follows.

One of the following (exclusive or):

A) Set a general credit_delay that will be the delay used for all credit messages (plus the lookahead value and noise).

B) Set specific *_credit_delay values for cn/local/global. You can set 1, 2, or all 3. For any that are not set, the delay will be calculated from the credit size and the corresponding link bandwidth.

C) Set an auto_credit_delay="1" flag in the configuration file, which will calculate each credit delay from the credit size and the corresponding bandwidth.

D) Don't set any credit_delay, *_credit_delay, or auto_credit_delay values in the configuration, and all credit delays will be based on the local_bandwidth.

Option D has been the behavior for years, and I wanted to try to make sure that the behavior expected from old configuration files isn't too semantically different. I'm open to making the behavior of auto_credit_delay="1" the default, though, as it seems more accurate anyway.
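To make the precedence between the four options concrete, here is a rough Python sketch of the resolution logic. It is not the actual CODES implementation (CODES is written in C). The function names, the cn_bandwidth and global_bandwidth parameter names, the GiB/s-to-nanoseconds conversion, and the choice to reject a general credit_delay combined with the other settings are all assumptions made for illustration. The lookahead and noise terms mentioned in option A are left out.

```python
GIB = 1024.0 ** 3
LINKS = ("cn", "local", "global")


def bytes_to_ns(num_bytes, gib_per_s):
    """Time in ns to push num_bytes over a link (assumes bandwidth in GiB/s)."""
    return num_bytes / (GIB * gib_per_s) * 1e9


def resolve_credit_delays(params):
    """Return a {link: delay_ns} dict from a flat dict of config strings.

    Hypothetical helper: mirrors options A-D described above.
    """
    # credit_size falls back to 8 when it is not configured.
    credit_size = int(params.get("credit_size", 8))
    # Assumed parameter names: cn_bandwidth, local_bandwidth, global_bandwidth.
    bandwidth = {link: float(params[f"{link}_bandwidth"]) for link in LINKS}

    general = params.get("credit_delay")
    specific = {
        link: float(params[f"{link}_credit_delay"])
        for link in LINKS
        if f"{link}_credit_delay" in params
    }
    auto = params.get("auto_credit_delay") == "1"

    # Option A: one general delay for every link type.
    if general is not None:
        if specific or auto:
            # Assumption: mixing A with B or C is treated as a config error.
            raise ValueError("credit_delay is exclusive with the other options")
        return {link: float(general) for link in LINKS}

    # Options B and C: per-link values, computed from the link's own
    # bandwidth wherever no explicit *_credit_delay was given.
    if specific or auto:
        return {
            link: specific.get(link, bytes_to_ns(credit_size, bandwidth[link]))
            for link in LINKS
        }

    # Option D: legacy behavior, everything derived from local_bandwidth.
    legacy = bytes_to_ns(credit_size, bandwidth["local"])
    return {link: legacy for link in LINKS}
```

For example, calling resolve_credit_delays({"cn_bandwidth": "1", "local_bandwidth": "2", "global_bandwidth": "4", "cn_credit_delay": "5"}) keeps the explicit CN delay and derives the local and global delays from their own bandwidths, while the same call without cn_credit_delay falls through to the legacy local_bandwidth behavior for all three.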

nmcglo commented 4 years ago

Addressed in #187