CMSCompOps / WorkflowWebTools

https://workflowwebtools.readthedocs.io

Update Recommended Procedures #23

Closed dabercro closed 6 years ago

dabercro commented 7 years ago

The procedures given here: http://cms-comp-ops-tools.readthedocs.io/en/latest/workflowwebtools.html#module-WorkflowWebTools.procedures are generated from the PROCEDURES dictionary in the procedures.py module (https://github.com/CMSCompOps/WorkflowWebTools/blob/master/procedures.py).

I just added the cause column to the table. The 'cause' does not show up in the operator interface itself at the moment since the operator already sees the other exit codes in the example logs: https://github.com/CMSCompOps/WorkflowWebTools/blob/a30eb7f9627d80f1c29f992c803ec2b05af3b4ab/classifyerrors.py#L41 We can add it if desired, but that will make the page more cluttered.

Anyway, editing the text in the procedures or adding procedures for new error codes will update both the table and the operator interface. @mcremone @prozober (and anyone else interested in contributing), let me know if you have any questions. I will leave this open for now so the workflow team can review the procedures and make or propose updates.

paorozo commented 7 years ago

Hi Dan, it is possible to include multiple causes for an exit code, right?

dabercro commented 7 years ago

Yeah, but the causes on this table are added by hand. The workflow view itself displays the multiple "sub" exit codes to explain causes.

dabercro commented 7 years ago

It might be cleaner to make the cause values into lists and later join them with the ' |br| |br| ' string (a hacky way to make Sphinx separate the paragraphs). Would you prefer that?
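
A minimal sketch of the list-and-join idea, assuming each entry keeps its causes under a 'cause' key. The `render_cause` helper and the example entry are hypothetical and are not taken from procedures.py:

```python
# Separator from the comment above: two |br| substitutions make Sphinx
# render a paragraph break inside a table cell.
PARAGRAPH_BREAK = ' |br| |br| '


def render_cause(causes):
    """Return the text for one 'cause' table cell.

    Accepts either a single string (the current hand-written form)
    or a list of strings (the proposed form).
    """
    if isinstance(causes, str):
        return causes
    return PARAGRAPH_BREAK.join(causes)


# Hypothetical entry; the real PROCEDURES layout may differ.
example_entry = {
    'cause': ['First possible cause.', 'Second possible cause.'],
}

print(render_cause(example_entry['cause']))
```

Accepting both strings and lists would let existing entries stay as they are while new ones move to the list form.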

paorozo commented 7 years ago

I guess we have a problem. We could have different causes, and a different procedure attached to each one of them. Should we reorganize the dictionary?

dabercro commented 7 years ago

So that's what the "Additional procedure" is for. If the regular expression in the last column is matched, it should return the contents of the match to the operator. This is basically parsing the logs for the operator. If the additional procedure merits a different action based on the matches, the operator will know.

Of course, for this to be fully useful, I need to get issue #24 working so that a given workflow is correctly matched.
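
The "Additional procedure" matching described above can be sketched roughly as follows. The pattern, the log text, and the `additional_matches` helper are all made up for illustration and do not come from the tool:

```python
import re

# Hypothetical regular expression for the last column of one procedure.
ADDITIONAL_PATTERN = re.compile(r'could not open file (\S+)')

# Invented log excerpt, standing in for a real error log.
EXAMPLE_LOG = """\
step 1 finished
ERROR: could not open file /example/path/a.root
ERROR: could not open file /example/path/b.root
step 2 aborted
"""


def additional_matches(pattern, log_text):
    """Return every piece of the log that the pattern matches.

    The operator would see these matches alongside the procedure,
    so the logs do not have to be searched by hand.
    """
    return [match.group(0) for match in pattern.finditer(log_text)]


for found in additional_matches(ADDITIONAL_PATTERN, EXAMPLE_LOG):
    print(found)
```

If nothing matches, the list is empty and the operator would only see the regular procedure.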

paorozo commented 7 years ago

Sorry if this is a stupid question, but if we want to add, modify, or remove procedures, what do I need to do? Send a pull request to WorkflowWebTools, right? Of course I can do it that way, but it would be great to have a more interactive way to manage those procedures (that's why we were asking for a web interface to manage this info). Do you think that would be possible?

dabercro commented 7 years ago

If we want reasons to end up in the documentation, it has to get to readthedocs through GitHub. So for the purpose of this issue, we should make PRs.

Ultimately, we want the tool to learn the reasons and procedures on its own from training data. The reason boxes at the bottom (shown after clicking the "Add Reason" button) are the place to enter modified causes interactively. The submitted actions themselves should result in modified procedures. Again, this is within the tool itself and not the central documentation. We have very little data to train on and no model implementation at the moment, so I'm falling back to the documented procedures.

I think once we have data to train on, the correct change to documentation would be to describe how the training works, and get rid of the procedures table entirely. At that point, the operators should rely on the training results more than a static table of causes and procedures.

paorozo commented 7 years ago

Dan, you are completely right! Thanks, I needed this enlightenment. Now, we should discuss how the actions, exit codes, results, and even logs are going to be stored, right? Where do you think is the best place to do that?

dabercro commented 7 years ago

Right now, everything except for the error logs is stored in a database on the server. I might revisit the format since we're grouping workflows (not just tasks) more.