Blank screen dashboard

katelashkevich commented 2 years ago

Uploaded the log "Device Repair reduced.csv" at 12:25 and 12:29 Sept 13 (twice in a row). The log includes only the minimum data required (no additional columns that are not used by the algorithm). Got a blank screen dashboard. No error message or query ID was displayed.

Device Repair reduced.csv

Uploaded the log "Device Repair.csv" at 12:34 and 12:36 Sept 13 (twice in a row). The log includes the minimum data required AND additional columns that are not used by the algorithm. All additional columns were mapped as "Other." Got a blank screen dashboard (at 12:35 and 12:36). No error message or query ID was displayed.

Device Repair.csv

Uploaded the log "Loan Origination (2).csv" at 12:54 Sept 13. The log includes the minimum data required AND additional columns that are not used by the algorithm. All additional columns were mapped as "Other." Got a blank screen dashboard almost immediately (at 12:54). No error message or query ID was displayed.

Loan Origination (2).csv

iharsuvorau commented 2 years ago

Device Repair Log

WTA Failure

For the first two logs, https://github.com/AutomatedProcessImprovement/waiting-time-analysis/files/9555650/Device.Repair.reduced.csv and https://github.com/AutomatedProcessImprovement/waiting-time-analysis/files/9555651/Device.Repair.csv, the WTA CLI tool fails during the batch analysis. For that step we use the package from this repository: https://github.com/AutomatedProcessImprovement/batch-processing-analysis. I remember we had this issue with @david-chapela before: the progress bar in the R script does not display correctly when it cannot find more than 2 batches. This is a short excerpt from the traceback:

Error in txtProgressBar(min = 2, max = length(candidates), style = 3) : 
  must have 'max' > 'min'

I'll discuss with David what to do in that case when he's back in the office next week. We can check for this exception in either of the packages (wta or batch analysis), put some values into the "batch_instance_enabled" and "enabled_time" columns when batch detection fails, and proceed without throwing the exception.
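The fallback described above could look roughly like the following sketch. `detect_batches` is a hypothetical stand-in for the call into the batch analysis package, and the placeholder values (`None` for "batch_instance_enabled", the event's own start time for "enabled_time") are assumptions; which values are actually appropriate is still open.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]


def add_batch_columns(
    events: List[Event],
    detect_batches: Callable[[List[Event]], List[Event]],
) -> List[Event]:
    """Run batch detection, falling back to placeholder columns on failure."""
    try:
        return detect_batches(events)
    except Exception as exc:  # real code should catch a narrower exception type
        print(f"batch detection failed, continuing without batches: {exc}")
        # Assumed placeholders: no batch instance, enabled when the event started.
        return [
            {
                **event,
                "batch_instance_enabled": None,
                "enabled_time": event.get("start_time"),
            }
            for event in events
        ]
```

With this shape the rest of the analysis can run on logs that contain no batches, at the cost of hiding the underlying failure unless the message is logged somewhere visible.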

Frontend improvement

Regarding "Got a blank screen dashboard. No error message or query ID was displayed.": we can probably improve this by displaying at least the status of the running job ("pending", "running" or "failed"). @JonasBerx, how does it work at the moment? Do we have a publicly available URL to try the frontend? The response from WTA Service for the logs above looks like this:

{
    "id": "a7545e07-55e6-11ed-98f8-0242ac140002",
    "status": "failed",
    "error": "error executing analysis: exit status 1; stderr: Traceback (most recent call last):\n  File \"/usr/src/app/venv/bin/wta\", line 8, in <module>\n    sys.exit(main())\n  File \"/usr/src/app/venv/lib/python3.10/site-packages/click/core.py\", line 1130, in __call__\n    return self.main(*args, **kwargs)\n  File \"/usr/src/app/venv/lib/python3.10/site-packages/click/core.py\", line 1055, in main\n    rv = self.invoke(ctx)\n  File \"/usr/src/app/venv/lib/python3.10/site-packages/click/core.py\", line 1404, in invoke\n    return ctx.invoke(self.callback, **ctx.params)\n  File \"/usr/src/app/venv/lib/python3.10/site-packages/click/core.py\", line 760, in invoke\n    return __callback(*args, **kwargs)\n  File \"/usr/src/app/src/wta/cli.py\", line 27, in main\n    result: TransitionsReport = run(log_path=log_path, parallel_run=parallel, log_ids=log_ids)\n  File \"/usr/src/app/src/wta/main.py\", line 37, in run\n    log = log[[log_ids.case, log_ids.activity, log_ids.resource, log_ids.start_time, log_ids.end_time]]\n  File \"/usr/src/app/venv/lib/python3.10/site-packages/pandas/core/frame.py\", line 3511, in __getitem__\n    indexer = self.columns._get_indexer_strict(key, \"columns\")[1]\n  File \"/usr/src/app/venv/lib/python3.10/site-packages/pandas/core/indexes/base.py\", line 5782, in _get_indexer_strict\n    self._raise_if_missing(keyarr, indexer, axis_name)\n  File \"/usr/src/app/venv/lib/python3.10/site-packages/pandas/core/indexes/base.py\", line 5845, in _raise_if_missing\n    raise KeyError(f\"{not_found} not in index\")\nKeyError: \"['_id'] not in index\"\n",
    "event_log": "http://193.40.11.233/assets/results/a7545e07-55e6-11ed-98f8-0242ac140002/event_log.csv",
    "event_log_md5": "f994b5b939327dcf928420344e86690e",
    "created_at": "2022-10-27T14:01:05.65342141+03:00",
    "finished_at": "2022-10-27T14:01:06.819315414+03:00",
    "column_mapping": {
        "activity": "Activity",
        "case": "_id",
        "end_timestamp": "end_time",
        "resource": "Resource",
        "start_timestamp": "start_time"
    }
}

Could we display the status and, when it's provided, the error message for debugging purposes?
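As an illustration of what such a message could contain, here is a minimal sketch that turns a job response like the one above into a single line. It relies only on the "status" and "error" fields shown in the response; the function name `summarize_job` is hypothetical, and the real frontend would do the equivalent in its own language.

```python
def summarize_job(job: dict) -> str:
    """Build a one-line, user-facing message from a job response."""
    status = job.get("status", "unknown")
    error = job.get("error") or ""
    # The last non-empty line of a Python traceback names the exception.
    lines = [line.strip() for line in error.splitlines() if line.strip()]
    if status == "failed" and lines:
        return f"Analysis failed: {lines[-1]}"
    return f"Analysis {status}"
```

For the response above, the last traceback line is the `KeyError` about the `_id` column, which is already more useful to a user than a blank screen.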

Loan Origination Log

I'm still running this log. It's bigger than the ones we usually use in testing, so it takes time. I'll write about it in a separate comment later.

JonasBerx commented 2 years ago

No public URLs; the client currently runs only locally.

When the button is pressed, you get a snackbar saying the analysis has started, and the button that was pressed stays in its loading animation until the analysis is done. The blank screen comes from something breaking in either the frontend or the backend. I will investigate that when I have a spare moment.

iharsuvorau commented 2 years ago

Loan Origination Log

This one has multiple issues:

The file was too big for nginx, so WTA Service returned a 413 Request Entity Too Large error. The default limit for nginx was around 20M. This was the first error the frontend ran into.
- [x] In this case, I suggest that @JonasBerx check not only the "status" and "error" fields in the response from WTA Service but also the HTTP status code: if it's 4xx or 5xx, we can display that message to the user as well.
- [x] On the WTA Service side, I've increased the file size limit to 100M, so we no longer get the 413 error.

The other issue is purely in the WTA CLI tool, and I'm looking into it now: we get a "maximum recursion depth exceeded" error. I'll report back later.
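The combined check on the HTTP status code and the response fields could be sketched like this. The function name `response_error` and the message wording are hypothetical; only the "status" and "error" fields and the 413 case come from this thread, and the client would implement the same logic in its own language.

```python
from typing import Optional


def response_error(http_status: int, body: Optional[dict]) -> Optional[str]:
    """Return an error message for the user, or None if the response is fine."""
    if 400 <= http_status < 600:
        # e.g. a 413 from nginx may arrive without a WTA Service JSON body
        return f"Request failed with HTTP status {http_status}"
    if body and (body.get("status") == "failed" or body.get("error")):
        return body.get("error") or "Analysis failed"
    return None
```

Checking the status code first matters because a proxy-level error such as 413 never reaches WTA Service, so the body cannot be relied on to carry "status" or "error".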

JonasBerx commented 1 year ago

Hey,

So I've added catches for the errors, and a check for whether a duplicate is an error as well. The blank screens no longer appear for me, so I think that's fixed.

A small QoL change I made: whenever it polls for the status and the job is still "running", it pops up a message again so the user knows that the tool is still working :)) @katelashkevich
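The polling behaviour can be sketched as follows. This is not the client code, just an illustration in Python: `fetch_status` and `notify` are hypothetical callbacks, and the interval and poll limit are arbitrary assumptions.

```python
import time
from typing import Callable


def poll_job(
    fetch_status: Callable[[], str],
    notify: Callable[[str], None],
    interval_seconds: float = 5.0,
    max_polls: int = 720,
) -> str:
    """Poll until the job leaves the pending/running states, notifying on each poll."""
    for _ in range(max_polls):
        status = fetch_status()
        if status not in ("pending", "running"):
            return status
        notify(f"Analysis is still {status}...")
        time.sleep(interval_seconds)
    return "timeout"
```

The poll limit keeps the loop from running forever if the job never reports a final status.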

On a side note, Loan Origination is still taking a long time, so I can't tell for sure whether that one works as well. Device Repair still seems to have some issues.

Let me know if there are any issues after this update.

iharsuvorau commented 1 year ago

@katelashkevich @JonasBerx The logs in this issue caused multiple problems for the analysis CLI tool:

[x] recursion limit reached in Python because of a relatively big log size and thousands of intervals to process per transition
[x] batching analysis package fails when it doesn't discover any batches
[x] Timedelta type overflow because the number was too big. For example, for the Loan Origination log, PT total was about 45 years, WT total about 600 years, and cycle time about 700 years, which is huge when converted to seconds or microseconds. I double-checked with the boss and these numbers seem okay: even though the log spans only 6 months, there are 8k cases, and one case takes 2-3 weeks on average.
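To make the overflow concrete, here is a small arithmetic sketch. It assumes the overflowing type is pandas' `Timedelta`, which stores durations as nanoseconds in a signed 64-bit integer; under that assumption the largest representable duration is roughly 292 years, so the 600- and 700-year totals cannot fit while the 45-year total can.

```python
INT64_MAX = 2**63 - 1  # largest value of a signed 64-bit integer
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

# Longest duration an int64 nanosecond counter can hold, in years.
max_years = INT64_MAX / 1e9 / SECONDS_PER_YEAR


def fits_in_int64_nanoseconds(years: float) -> bool:
    """True if a duration of `years` fits in an int64 nanosecond counter."""
    return years * SECONDS_PER_YEAR * 1e9 <= INT64_MAX


for label, years in [("PT total", 45), ("WT total", 600), ("cycle time", 700)]:
    print(label, years, "years fits:", fits_in_int64_nanoseconds(years))
```

One way around the limit is to keep such totals as plain numbers of seconds, or as the standard library's `datetime.timedelta`, which has a much larger range.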

The batching analysis issue is the longest to fix because it requires a team effort and changes across multiple packages that we develop internally.

The Loan Origination log takes about 20-40 min on my machine and may take longer on the production server, which is even less powerful.

The Device Repair log has 0 batches, so for now it also fails until we fix the batching analysis part.

iharsuvorau commented 1 year ago

@katelashkevich The issues should be solved now. The Device Repair (reduced) log output is here: http://193.40.11.233/jobs/6d7763f1-5baf-11ed-bfad-0242ac120002. I've started Loan Origination now; since it takes 30+ min to run, we can check the result tomorrow at http://193.40.11.233/jobs/db98ead9-5baf-11ed-bfad-0242ac120002.

AutomatedProcessImprovement / waiting-time-analysis

Blank screen dashboard #37
