AutomatedProcessImprovement / waiting-time-analysis

Waiting time analysis of activity transitions in event logs.

Blank screen dashboard #37

Closed by katelashkevich 1 year ago

katelashkevich commented 2 years ago
  1. Uploaded the log "Device Repair reduced.csv" at 12:25 and 12:29 Sept 13 (twice in a row). The log includes only the minimum data required (no additional columns that are not used by the algorithm). Got a blank screen dashboard. No error message or query ID was displayed.

Device Repair reduced.csv

  2. Uploaded the log "Device Repair.csv" at 12:34 and 12:36 Sept 13 (twice in a row). The log includes the minimum data required AND additional columns that are not used by the algorithm. All additional columns were mapped as "Other." Got a blank screen dashboard (at 12:35 and 12:36). No error message or query ID was displayed.

Device Repair.csv

  3. Uploaded the log "Loan Origination (2).csv" at 12:54 Sept 13. The log includes the minimum data required AND additional columns that are not used by the algorithm. All additional columns were mapped as "Other." Got a blank screen dashboard almost immediately (at 12:54). No error message or query ID was displayed.

Loan Origination (2).csv

iharsuvorau commented 2 years ago

Device Repair Log

WTA Failure

For the first two logs, https://github.com/AutomatedProcessImprovement/waiting-time-analysis/files/9555650/Device.Repair.reduced.csv and https://github.com/AutomatedProcessImprovement/waiting-time-analysis/files/9555651/Device.Repair.csv, the WTA CLI tool fails during the batch analysis. For that step we use the package from this repository: https://github.com/AutomatedProcessImprovement/batch-processing-analysis. I remember we had this issue with @david-chapela before. It was related to the progress bar in the R script, which cannot be created when the script cannot find more than 2 batches. This is a short excerpt from the traceback:

Error in txtProgressBar(min = 2, max = length(candidates), style = 3) : 
  must have 'max' > 'min'

I'll discuss with David what to do in that case when he's back in the office next week. We can check for this exception in either of the packages (wta or batch analysis), put some values into the "batch_instance_enabled" and "enabled_time" columns when batch detection fails, and proceed without throwing the exception.
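The fallback described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `detect_batches` and `BatchDetectionError` are hypothetical stand-ins for the batch analysis call and its failure, the rows are plain dicts rather than the DataFrame the tool works with, and using `None` as the placeholder value is a guess, not what either package does.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[str]]


class BatchDetectionError(Exception):
    """Hypothetical error raised when the batch analysis step fails."""


def add_batch_columns(
    rows: List[Row],
    detect_batches: Callable[[List[Row]], List[Row]],
) -> List[Row]:
    """Return rows with batch columns, using placeholders if detection fails."""
    try:
        return detect_batches(rows)
    except BatchDetectionError:
        # Assumption: None marks "no batch information available".
        return [
            {**row, "batch_instance_enabled": None, "enabled_time": None}
            for row in rows
        ]


def failing_detector(rows: List[Row]) -> List[Row]:
    # Stand-in for the R script failing with "must have 'max' > 'min'".
    raise BatchDetectionError("must have 'max' > 'min'")


log = [{"case": "1", "activity": "Repair", "start_time": "2022-09-13T12:25:00"}]
result = add_batch_columns(log, failing_detector)
```

With this shape, the rest of the analysis still receives both columns and can proceed, and a log with no batches no longer aborts the whole run.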

Frontend improvement

Regarding "Got a blank screen dashboard. No error message or query ID was displayed.", we could probably improve this by displaying at least the status of the running job: "pending", "running" or "failed". @JonasBerx, how does it work at the moment? Do we have a publicly available URL to try the front-end? The response from the WTA Service for the logs above looks like this:

{
    "id": "a7545e07-55e6-11ed-98f8-0242ac140002",
    "status": "failed",
    "error": "error executing analysis: exit status 1; stderr: Traceback (most recent call last):\n  File \"/usr/src/app/venv/bin/wta\", line 8, in <module>\n    sys.exit(main())\n  File \"/usr/src/app/venv/lib/python3.10/site-packages/click/core.py\", line 1130, in __call__\n    return self.main(*args, **kwargs)\n  File \"/usr/src/app/venv/lib/python3.10/site-packages/click/core.py\", line 1055, in main\n    rv = self.invoke(ctx)\n  File \"/usr/src/app/venv/lib/python3.10/site-packages/click/core.py\", line 1404, in invoke\n    return ctx.invoke(self.callback, **ctx.params)\n  File \"/usr/src/app/venv/lib/python3.10/site-packages/click/core.py\", line 760, in invoke\n    return __callback(*args, **kwargs)\n  File \"/usr/src/app/src/wta/cli.py\", line 27, in main\n    result: TransitionsReport = run(log_path=log_path, parallel_run=parallel, log_ids=log_ids)\n  File \"/usr/src/app/src/wta/main.py\", line 37, in run\n    log = log[[log_ids.case, log_ids.activity, log_ids.resource, log_ids.start_time, log_ids.end_time]]\n  File \"/usr/src/app/venv/lib/python3.10/site-packages/pandas/core/frame.py\", line 3511, in __getitem__\n    indexer = self.columns._get_indexer_strict(key, \"columns\")[1]\n  File \"/usr/src/app/venv/lib/python3.10/site-packages/pandas/core/indexes/base.py\", line 5782, in _get_indexer_strict\n    self._raise_if_missing(keyarr, indexer, axis_name)\n  File \"/usr/src/app/venv/lib/python3.10/site-packages/pandas/core/indexes/base.py\", line 5845, in _raise_if_missing\n    raise KeyError(f\"{not_found} not in index\")\nKeyError: \"['_id'] not in index\"\n",
    "event_log": "http://193.40.11.233/assets/results/a7545e07-55e6-11ed-98f8-0242ac140002/event_log.csv",
    "event_log_md5": "f994b5b939327dcf928420344e86690e",
    "created_at": "2022-10-27T14:01:05.65342141+03:00",
    "finished_at": "2022-10-27T14:01:06.819315414+03:00",
    "column_mapping": {
        "activity": "Activity",
        "case": "_id",
        "end_timestamp": "end_time",
        "resource": "Resource",
        "start_timestamp": "start_time"
    }
}

Could we display the status and maybe error message when it's provided for debugging purposes?
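As a rough illustration of what that could mean, here is a small sketch that turns a job response into a one-line message. It relies only on the "id", "status" and "error" fields visible in the response above; the function name `summarize_job` is hypothetical, the example error string is abridged from the real one, and the actual front-end is not written in Python.

```python
import json


def summarize_job(response_text: str) -> str:
    """Build a one-line, user-facing status message from a job response."""
    job = json.loads(response_text)
    status = job.get("status", "unknown")
    message = f"Job {job.get('id', '?')}: {status}"
    error = job.get("error")
    if status == "failed" and error:
        # The last non-empty stderr line usually names the actual exception.
        lines = [line for line in error.splitlines() if line.strip()]
        if lines:
            message += f" ({lines[-1]})"
    return message


# Abridged version of the response shown above.
example = json.dumps({
    "id": "a7545e07-55e6-11ed-98f8-0242ac140002",
    "status": "failed",
    "error": "error executing analysis: exit status 1; stderr: Traceback ...\n"
             "KeyError: \"['_id'] not in index\"\n",
})
print(summarize_job(example))
```

For the response above, this would surface the final `KeyError` line instead of a blank screen, which is likely enough for a first round of debugging.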

Loan Origination Log

I'm still running this log. It's bigger than the logs we usually use in testing, so it takes time. I'll write about it in a separate comment later.

JonasBerx commented 2 years ago

No public URLs; the client is currently local only.

When the button is pressed you get a snackbar saying the analysis has started, and the button that was pressed stays in its loading animation until the analysis is done. The blank screen comes from something breaking in either the frontend or the backend. I will investigate that when I have a spare moment.

iharsuvorau commented 2 years ago

Loan Origination Log

This one has multiple issues:

JonasBerx commented 1 year ago

Hey,

So I've added catches for the errors and a check if a duplicate is an error as well. The blank screens are not appearing anymore for me, so I think that's fixed.

A small QoL change I made: whenever it's polling for the status and the job is still "running", it will pop up a message again so the user knows that the tool is still working :)) @katelashkevich
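The polling behaviour described here can be sketched like this. It is not the client's actual code (which is not shown in this thread): `poll_job`, `fetch_status` and `notify` are hypothetical names, and the loop simply treats any status other than "pending" or "running" as final, since the thread only confirms the "pending", "running" and "failed" values.

```python
import time
from typing import Callable, Dict


def poll_job(
    fetch_status: Callable[[], Dict[str, str]],
    notify: Callable[[str], None],
    interval_seconds: float = 5.0,
    max_polls: int = 1000,
) -> Dict[str, str]:
    """Poll until the job leaves "pending"/"running", notifying on every poll."""
    for _ in range(max_polls):
        job = fetch_status()
        status = job.get("status", "unknown")
        if status in ("pending", "running"):
            # Repeat the message so the user sees the tool is still working.
            notify(f"Analysis is still {status}")
            time.sleep(interval_seconds)
            continue
        notify(f"Analysis {status}")
        return job
    raise TimeoutError("job did not finish within the polling budget")


# Fake responses standing in for the WTA Service.
responses = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "failed", "error": "..."},
])
messages = []
final = poll_job(lambda: next(responses), messages.append, interval_seconds=0)
```

Passing the fetch and notify functions in as arguments keeps the loop easy to try out without a running service.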

On a side note, Loan Origination is still taking a long time, so I can't tell for sure if that one is working as well. Device Repair seems to have some issues still.

Let me know if there are any issues after this update.

iharsuvorau commented 1 year ago

@katelashkevich @JonasBerx The logs in this issue caused multiple problems for the analysis CLI tool:

The batching analysis issue is the longest to fix because it requires a team effort and changes across multiple packages that we develop internally.

The Loan Origination log takes about 20-40 min on my machine and may take longer on the production server because it's even less performant.

The Device Repair log has 0 batches, so for now it also fails until we fix the batching analysis part.

iharsuvorau commented 1 year ago

@katelashkevich The issues should be solved now. The Device Repair (reduced) log output is here: http://193.40.11.233/jobs/6d7763f1-5baf-11ed-bfad-0242ac120002. I've started Loan Origination now. Because it takes 30+ min to run, we can check the result tomorrow at http://193.40.11.233/jobs/db98ead9-5baf-11ed-bfad-0242ac120002.