CMSCompOps / wtc-console

MIT License

Updating Workflow Errors in Manual Assistance #9

Closed dabercro closed 5 years ago

dabercro commented 5 years ago

At the meeting today, we saw that one of the workflows showed different errors in the new console than in the old console or Unified. This may be related to #2, but it is not quite the same problem, so here's a new issue. The problem seems to be that the error information for the workflow in question is almost a week old.

Here are the links to the relevant pages (with screenshots, since the workflow will be acted on soon):

wtc-console: http://wc-dev.cern.ch/tasks?page=1&filter=task_EXO-RunIIFall18wmLHEGS-00331&orderBy=updated-desc [screenshot, 2019-01-10 5:55:40 pm]

Unified page: https://cms-unified.web.cern.ch/cms-unified/report/pdmvserv_task_EXO-RunIIFall18wmLHEGS-00331__v1_T_181218_063716_1335 [screenshot, 2019-01-10 5:58:26 pm]

Old console: https://vocms0113.cern.ch:80/seeworkflow2/?workflow=pdmvserv_task_EXO-RunIIFall18wmLHEGS-00331__v1_T_181218_063716_1335 (harder to get a full overview in one shot, but here it is) [screenshot, 2019-01-10 5:59:23 pm]

vlimant commented 5 years ago

That calls for a unification of the information source/collection. Maybe https://github.com/CMSCompOps/WmAgentScripts/issues/374?

dabercro commented 5 years ago

That would be a good idea. @phylsix has been working on uploading the data to CMSMONIT. This was going to be used by wtc-console and OSDroid.

vlimant commented 5 years ago

How do people get the information back from monit? Is that a trivial thing?

dabercro commented 5 years ago

Not as trivial as a local MongoDB, but I'm sure it can be made trivial by someone who knows what's going on.

vargasa commented 5 years ago

New workflows are fetched every minute, while updates run every 10 minutes:

https://github.com/CMSCompOps/wtc-console/blob/2e68a0c919e7609e67124c21160ab55e389c398b/src/djangoreactredux/celery.py#L15
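
For readers who don't want to open the link, a schedule like the one described could look roughly like the sketch below. This is not copied from the repository's celery.py: the task names and the `max_expected_age` helper are hypothetical, and only the two intervals (1 minute and 10 minutes) come from this thread.

```python
from datetime import timedelta

# Hypothetical task names; the real ones live in the linked celery.py.
FETCH_NEW_TASK = "workflows.tasks.fetch_new_workflows"
UPDATE_TASK = "workflows.tasks.update_workflows"

# Celery beat accepts a timedelta (or a number of seconds) as "schedule".
beat_schedule = {
    "fetch-new-workflows": {
        "task": FETCH_NEW_TASK,
        "schedule": timedelta(minutes=1),   # new workflows: every minute
    },
    "update-workflows": {
        "task": UPDATE_TASK,
        "schedule": timedelta(minutes=10),  # updates: every 10 minutes
    },
}


def max_expected_age(entry_name, slack=timedelta(minutes=5)):
    """Longest time data should go without a refresh while workers are up.

    `slack` is an arbitrary allowance for task runtime (an assumption).
    """
    return beat_schedule[entry_name]["schedule"] + slack
```

With a Celery app object, a dict like this would be assigned via `app.conf.beat_schedule = beat_schedule`. The schedule only fires while the beat process and workers are actually running, so error information older than `max_expected_age("update-workflows")` (15 minutes here) would suggest the workers are down rather than a slow schedule.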

For some reason wc-dev was not running the celery workers; maybe I turned them off at some point, since this is my dev instance. However, logs on vocms0116 show that workers have been up and running consistently since Feb/07. Could you check whether you still see the differences you mentioned when using vocms0116?

dabercro commented 5 years ago

It looks like the workflows have recent information now. It's not perfectly synced, but this particular issue can be closed.