cms-PdmV / cmsPdmV

CERN CMS McM repository

McM validation tests checking multiple parameters at once #1132

Open sihyunjeon opened 7 months ago

sihyunjeon commented 7 months ago

Suggestions for McM validation tests

Is your feature related to a problem?

Not a "problem", but it would be nice to have multiple parameters checked at the same time. (I haven't run McM validation in the last one or two months, so I am not sure if there have been any updates in this regard.)

Describe the solution you'd like

McM validation should read all test results at the same time and fix the values in one go, rather than one by one.

Current behavior

When we run McM validation, multiple parameters are checked to be within their windows: time/event, size/event, filter efficiency, etc. If a check fails, the validation is rerun. For example, say time/event and size/event are given as (3 s, 800 kB) as initial parameters in McM. McM runs the validation and gets the test result (8 s, 200 kB), which is significantly different from the initial guesses. What McM does is check one parameter first, fix its value to the average, and then rerun the validation. So if the first parameter checked is time/event, the McM validation workflow changes the parameters to (8 s, 800 kB) instead of (8 s, 200 kB). The next run will still fail, because there is a huge gap between 200 kB and 800 kB, and only after that run, which tells us nothing new, is the value tuned to (8 s, 200 kB). This is a waste of resources: we already had the size/event result from the first test, but we did not use it.
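
For concreteness, here is a minimal runnable sketch of that flow as I understand it. None of these names come from the actual McM code; run_validation and relative_gap are hypothetical stand-ins, and the test job is hard-coded to return the measured values from the example above.

# Hypothetical sketch of the current one-parameter-per-run behavior.
# run_validation() stands in for the real (expensive) McM test job.

def run_validation(request):
    # Pretend the test job always measures the "true" values from the example
    return {'time_per_event': 8.0, 'size_per_event': 200.0}

def relative_gap(given, measured):
    return abs(given - measured) / measured

def validate_sequentially(request):
    jobs = 0
    while True:
        measured = run_validation(request)  # expensive test job
        jobs += 1
        if relative_gap(request['time_per_event'], measured['time_per_event']) > 0.5:
            # Only time/event is fixed; the measured size/event is discarded
            request['time_per_event'] = measured['time_per_event']
            continue  # rerun the whole validation
        if relative_gap(request['size_per_event'], measured['size_per_event']) > 0.5:
            request['size_per_event'] = measured['size_per_event']
            continue  # rerun the whole validation again
        return jobs  # all checks passed

print(validate_sequentially({'time_per_event': 3.0, 'size_per_event': 800.0}))  # 3 jobs

Job 1 fixes time/event to (8 s, 800 kB), job 2 fixes size/event to (8 s, 200 kB), and job 3 finally passes, even though job 1 had already measured both values.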

Expected behavior

  1. The McM validation test runs
  2. All necessary parameters are validated in one go (instead of bailing out when the first one fails and triggering another validation job); see the sketch after this list
  3. Fewer validation jobs are needed
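
Continuing the sketch above (same hypothetical run_validation and relative_gap stand-ins, not the actual McM code), the proposed flow would apply every correction from one run before re-validating:

# Hypothetical sketch: collect every out-of-window parameter from a single
# run, apply all corrections together, then re-validate at most once.

def validate_in_one_go(request):
    jobs = 0
    while True:
        measured = run_validation(request)  # expensive test job
        jobs += 1
        errors = {}
        for parameter in ('time_per_event', 'size_per_event'):
            if relative_gap(request[parameter], measured[parameter]) > 0.5:
                errors[parameter] = measured[parameter]
        if not errors:
            return jobs  # all checks passed
        request.update(errors)  # fix every bad value at once

print(validate_in_one_go({'time_per_event': 3.0, 'size_per_event': 800.0}))  # 2 jobs

With the same example numbers, the second job already passes: one validation job saved per request of this kind.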
sihyunjeon commented 7 months ago

Not sure if I described this in an understandable way.

Long story short, McM's time/event and size/event could both be checked with one job run and then tuned from that first job's output. Instead, the current behavior tunes time/event from the 1st job, then tunes size/event from the 2nd job, and finally the 3rd job has no issues.

Let me know if this is not clear enough (or has already been addressed in the last couple of months).

I think the current behavior is something like:

if abs(given_time_per_event - test_time_per_event) / test_time_per_event > 0.5:
    return error, test_time_per_event  # given_time_per_event too different
elif abs(given_size_per_event - test_size_per_event) / test_size_per_event > 0.5:
    return error, test_size_per_event  # only reached if the time check passed

It would be nice to collect all the bad values and tune them in one go, like below:

errors = {}
if abs(given_time_per_event - test_time_per_event) / test_time_per_event > 0.5:
    errors['time_per_event'] = test_time_per_event
if abs(given_size_per_event - test_size_per_event) / test_size_per_event > 0.5:
    errors['size_per_event'] = test_size_per_event
return errors  # every out-of-window parameter, collected from a single run
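
Returning a dict of all failing parameters, instead of erroring out on the first mismatch, keeps every measurement from the job that already ran. The caller can then apply all corrections together and needs at most one confirmation run, rather than one run per out-of-window parameter.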
lmoureaux commented 7 months ago

What you describe is correct; the relevant code is around automatic_scripts/validation/validation_control.py, line 600.