dmwm / CRABServer


Add ChildWorker module #8527

Closed novicecpp closed 2 days ago

novicecpp commented 4 days ago

Fix https://github.com/dmwm/CRABServer/issues/8428

Adds a new ChildWorker module that spawns a child process from the slave to run work() (for example, handleNewTask in Handler.py#L153).

The module handles timeouts (via SIGALRM), core dumps, and ordinary errors. It then propagates them as exceptions back to the slave, and from there to the caller.
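
For readers unfamiliar with the pattern, here is a minimal sketch of the idea described above. It is not the code from this PR. The names run_in_child, _child_main and ChildCrashedError are hypothetical, and it assumes a POSIX system where the multiprocessing "fork" start method and SIGALRM are available.

```python
import multiprocessing as mp
import signal


class ChildCrashedError(RuntimeError):
    """Hypothetical name: the child died without reporting a result."""


def _on_alarm(signum, frame):
    # Runs in the child when SIGALRM fires; aborts work() with an exception.
    raise TimeoutError("work() exceeded its time limit")


def _child_main(conn, work, args, timeout):
    # Child side: arm the alarm, run work(), report the outcome over the pipe.
    signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(timeout)
    try:
        payload = ("ok", work(*args))
    except Exception as exc:  # ordinary errors and the timeout land here
        payload = ("error", exc)
    finally:
        signal.alarm(0)  # cancel any pending alarm
    conn.send(payload)
    conn.close()


def run_in_child(work, args=(), timeout=3600):
    """Run work(*args) in a forked child and re-raise its errors here."""
    ctx = mp.get_context("fork")  # assumption: POSIX only
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_child_main,
                       args=(child_conn, work, args, timeout))
    proc.start()
    child_conn.close()  # the parent keeps only the read end
    try:
        status, value = parent_conn.recv()
    except EOFError:
        # The pipe closed with no message: the child crashed (e.g. a core dump).
        proc.join()
        raise ChildCrashedError(
            f"child exited with code {proc.exitcode}") from None
    finally:
        parent_conn.close()
    proc.join()
    if status == "error":
        raise value  # propagate the child's exception to the caller
    return value
```

In this sketch a timeout and an ordinary error both travel back as pickled exception objects, while an abrupt death of the child is detected by the pipe closing before any message arrives.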

This mode is behind the feature flag config.FeatureFlags.childWorker, which can be enabled from the TW config file. The TW config will then have these additional lines:

config.section_("FeatureFlags")
config.FeatureFlags.childWorker = True
config.FeatureFlags.childWorkerTimeout = 3600  # 1 hour
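
As an illustration of how code might read these flags defensively, so that a config file without the FeatureFlags section keeps the old behaviour, here is a small sketch. The helper child_worker_settings is hypothetical, the default timeout of 3600 seconds is an assumption, and SimpleNamespace stands in for the real TW configuration object.

```python
from types import SimpleNamespace


def child_worker_settings(config, default_timeout=3600):
    """Return (enabled, timeout_seconds); hypothetical helper, not from the PR."""
    flags = getattr(config, "FeatureFlags", None)
    # A missing section or missing attribute means the feature stays off.
    enabled = bool(getattr(flags, "childWorker", False))
    timeout = int(getattr(flags, "childWorkerTimeout", default_timeout))
    return enabled, timeout


# Stand-in for a TW config that contains the lines shown above.
config = SimpleNamespace(
    FeatureFlags=SimpleNamespace(childWorker=True, childWorkerTimeout=3600)
)
print(child_worker_settings(config))
```

Reading the flags with getattr and a default keeps the feature opt-in, which fits the stated goal of being able to turn it off easily.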

cmsdmwmbot commented 4 days ago

Jenkins results:

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-CRABServer-PR-test/2031/artifact/artifacts/PullRequestReport.html

cmsdmwmbot commented 4 days ago

Jenkins results:

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-CRABServer-PR-test/2032/artifact/artifacts/PullRequestReport.html

belforte commented 4 days ago

Shouldn't we then also move the other parameters that control how it works internally into the FeatureFlags config section? nslaves, polling, max_retry, retry_interval...

novicecpp commented 4 days ago

I put it behind feature flags so we can easily turn it off and purge it later if it does not work. I plan to move it to the TaskWorker section later if we enable it permanently.

Also, the nslaves, polling, max_retry, retry_interval... variables do not affect how ChildWorker works, except for SequentialWorker, where it needs to fall back to a single process for debugging with pdb.

I can move it back to the usual TaskWorker section if you do not like it.

belforte commented 4 days ago

Beautiful!! Thanks. Yes, we can unify the config parameters later.

Only one comment: just like you pass logger.name around, could there be a simple way to also pass the logger formatter string? Add it to the future /src/utils/TW/twUtils.py?

belforte commented 4 days ago

Good point (about the sequential test).

cmsdmwmbot commented 4 days ago

Jenkins results:

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-CRABServer-PR-test/2033/artifact/artifacts/PullRequestReport.html

novicecpp commented 2 days ago

Sorry Stefano, forgot to answer your question.

> Only one comment: just like you pass logger.name around, could there be a simple way to also pass the logger formatter string? Add it to the future /src/utils/TW/twUtils.py?

I do not know a good way, but I no longer need it (see the comments in the module doc).

belforte commented 2 days ago

Thanks for explaining the logger situation!