inspirehep / inspire-next

The INSPIRE repo.
https://inspirehep.net
GNU General Public License v3.0

holdingpen: inconsistent display / status #2946

Open ksachs opened 6 years ago

ksachs commented 6 years ago

Current Behavior

https://labs.inspirehep.net/holdingpen/779871 is 'on hold' in the detailed view but 'CORE' in the brief display; it might actually be in an error state:

    "approved": true,
    "callback_result": {
      "error_message": "   Stage 2 failed: ERROR: while elaborating FFT tags: Error when downloading http://labs.inspirehep.net/api/files/900fe8d7-5f99-4ab8-abc0-5e2552158f11/can365_0.png into /opt/cds-invenio/var/tmp/bibdocfile_ed7Yqu.png: http://labs.inspirehep.net/api/files/900fe8d7-5f99-4ab8-abc0-5e2552158f11/can365_0.png seems to be empty",
      "recid": 1635026,
      "success": false
    },
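
The fragment above is what a status display could check before showing 'on hold' or 'CORE'. A minimal sketch, assuming the workflow's extra data is available as a plain dict; the function name `robotupload_failed` is hypothetical and not part of inspire-next, and the error message is shortened here:

```python
def robotupload_failed(extra_data):
    """Return True when the legacy callback explicitly reported a failure."""
    # A missing callback_result means no callback has arrived yet,
    # which is not the same thing as a failure.
    result = extra_data.get("callback_result") or {}
    return result.get("success") is False


# Data shaped like the excerpt above (error message shortened).
extra_data = {
    "approved": True,
    "callback_result": {
        "error_message": "Stage 2 failed: ERROR: while elaborating FFT tags",
        "recid": 1635026,
        "success": False,
    },
}

print(robotupload_failed(extra_data))
```

With a check like this, an approved record whose upload failed could be shown as an error rather than as halted or accepted.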

[Screenshot from 2017-11-13 10:33:43]


david-caro commented 6 years ago

That is a record that was harvested and halted:

      {
        "doc": "Halt the workflow object, action=hep_approval, message=Submission halted for curator approval.", 
        "name": "halt_record", 
        "nicename": "\"Submission halted for curator approval.\"", 
        "parameters": [], 
        "time": "2017-11-08 03:12:15.611979"
      }, 

It was manually accepted the next morning (see the time):

      {
        "doc": "BREAK: args(); kwargs().", 
        "name": "BREAK", 
        "nicename": "BREAK: args(); kwargs().", 
        "parameters": [], 
        "time": "2017-11-09 09:22:19.865856"
      }, 

but failed at the last step of the workflow when sending to legacy:

      {
        "doc": "Get the MARCXML from the model and ship it.\n\n    If callback_url is set the workflow will halt and the callback is\n    responsible for resuming it.\n    ", 
        "name": "send_robotupload", 
        "nicename": "Get the MARCXML from the model and ship it.", 
        "parameters": [], 
        "time": "2017-11-09 09:22:20.228502"
      }, 

It was then restarted in the evening (probably by me, thinking it was an EOS issue):

      {
        "doc": "Make sure schema is set properly and resolve it.", 
        "name": "set_schema", 
        "nicename": "Make sure schema is set properly and resolve it.", 
        "parameters": [], 
        "time": "2017-11-09 18:46:59.178746"
      }, 

It then went through the whole process again and got halted again:

      {
        "doc": "Halt the workflow object, action=hep_approval, message=Submission halted for curator approval.", 
        "name": "halt_record", 
        "nicename": "\"Submission halted for curator approval.\"", 
        "parameters": [], 
        "time": "2017-11-09 18:51:24.145507"
      }
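
The timeline can be read off the `time` fields of the task entries quoted above. A minimal sketch that computes the gaps between consecutive entries; the `gaps` helper is hypothetical, and only the task names and timestamps are taken from the excerpts:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# (task name, time) pairs copied from the excerpts above.
tasks = [
    ("halt_record", "2017-11-08 03:12:15.611979"),
    ("BREAK", "2017-11-09 09:22:19.865856"),
    ("send_robotupload", "2017-11-09 09:22:20.228502"),
    ("set_schema", "2017-11-09 18:46:59.178746"),
    ("halt_record", "2017-11-09 18:51:24.145507"),
]


def gaps(entries):
    """Return (previous task, next task, elapsed) for consecutive entries."""
    parsed = [(name, datetime.strptime(ts, FMT)) for name, ts in entries]
    return [(a[0], b[0], b[1] - a[1]) for a, b in zip(parsed, parsed[1:])]


for prev, nxt, delta in gaps(tasks):
    print(f"{prev} -> {nxt}: {delta}")
```

The long gap before `BREAK` is the wait for the curator, and the gap before `set_schema` is the wait before the manual restart.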

With that clarified: we are working on a better way to restart workflows, which should improve this situation. Currently the tasks themselves are not 100% idempotent, so in some cases a restart forces the workflow down unexpected paths.
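
To illustrate what idempotent means here, a minimal sketch in which a task records its own completion, so a restart skips the side effect instead of repeating it. The names `send_once`, `ship` and the `shipped` key are hypothetical and do not reflect the actual inspire-next task API:

```python
def send_once(extra_data, ship):
    """Run the side effect only if a previous run has not already done so."""
    if extra_data.get("shipped"):
        return False  # already done, so the restart is a no-op
    ship()
    extra_data["shipped"] = True
    return True


calls = []
state = {}

send_once(state, lambda: calls.append("upload"))
send_once(state, lambda: calls.append("upload"))  # simulated restart

print(len(calls))  # 1
```

A task written this way can be re-run from the start of the workflow without changing the outcome, which is the property a safe restart would rely on.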

StellaCh commented 6 years ago

Is there any work pending for this issue, or shall we close it?

michamos commented 6 years ago

The display of the status is not very good; this should be improved.