quattor / ncm-cdispd

Node Configuration Manager Configuration Dispatch Daemon
www.quattor.org

restore the old ICLIST on ncm-ncd failure #23

Closed stdweird closed 8 years ago

stdweird commented 8 years ago

Fixes #22

stdweird commented 8 years ago

@jouvin I'm not an ncm-cdispd expert, and I'm clearly missing some background, but this fixes our problem.

hpcugentbot commented 8 years ago

Refer to this link for build results (access rights to CI server needed): https://jenkins1.ugent.be/job/ncm-cdispd-pr-builder/15/

jrha commented 8 years ago

LGTM, what is the ICLIST?

stdweird commented 8 years ago

it's the list of components to be invoked

jouvin commented 8 years ago

I'm not sure I agree with the change; please give me some time to review it. ICLIST is critical to ensure that failed components are retried the next time a profile is received. That was always the intent, but it was not actually the case, and this left systems half configured after a configuration module failure. I'll try to have a look this evening.
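
For readers without the background, the retry behaviour described above can be illustrated with a minimal sketch. It is written in Python for brevity, while ncm-cdispd itself is a Perl script, and it is not the daemon's actual code: the function `dispatch`, its parameters, and the `run_ncd` callback are all hypothetical names chosen for illustration.

```python
def dispatch(pending, changed, run_ncd):
    """Return the component list to carry over to the next profile.

    Hypothetical model of the ICLIST idea, not ncm-cdispd's real logic.
    pending: components left over from a previous failed run
    changed: components whose configuration changed in the new profile
    run_ncd: callable taking the component list, returning True on success
    """
    # Merge leftovers with newly changed components, preserving order
    # and dropping duplicates.
    iclist = list(dict.fromkeys(list(pending) + list(changed)))
    if not iclist:
        return []  # nothing to run, nothing to retry
    if run_ncd(iclist):
        return []  # success: the list is consumed
    return iclist  # failure: keep the list so the next profile retries it


# First profile: the run fails, so both components stay pending.
pending = dispatch([], ["network", "ssh"], lambda comps: False)
# Second profile: the leftovers are retried together with the new change.
pending = dispatch(pending, ["ntp"], lambda comps: True)
```

In this model, a failed run leaves the list intact rather than clearing it, which is the property that prevents a half-configured system from being forgotten until an unrelated profile change happens to touch the same components.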

jouvin commented 8 years ago

@stdweird I looked at your suggested change in more detail and it seems OK to me. Have you carefully checked what happens if the first run of components after the initial startup fails, to ensure that we don't have an undefined variable left somewhere?

stdweird commented 8 years ago

@jouvin The initial ICLIST is initialised at https://github.com/quattor/ncm-cdispd/blob/master/src/main/scripts/ncm-cdispd#L717. That looks clean enough to me, and it happens well before any of the changes in this PR come into play.