Closed katelashkevich closed 1 year ago
For the first two logs, https://github.com/AutomatedProcessImprovement/waiting-time-analysis/files/9555650/Device.Repair.reduced.csv and https://github.com/AutomatedProcessImprovement/waiting-time-analysis/files/9555651/Device.Repair.csv, the WTA CLI tool fails during the batch analysis. We use the package from this repository for that: https://github.com/AutomatedProcessImprovement/batch-processing-analysis. I remember we had this issue with @david-chapela before; it was related to the progress bar in the R script not displaying correctly when it cannot find more than 2 batches. This is a short excerpt from the traceback:
Error in txtProgressBar(min = 2, max = length(candidates), style = 3) :
must have 'max' > 'min'
I'll discuss with David what to do in that case when he's back in the office next week. We can check for this exception in either of the packages (wta or batch analysis), put fallback values into the "batch_instance_enabled" and "enabled_time" columns when batch detection fails, and proceed without throwing the exception.
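A minimal sketch of that fallback on the Python side. The `discover_batches` wrapper below is hypothetical (it stands in for the real entry point in the batch-processing-analysis package, and here only simulates the failure from the R traceback); the point is how the wta side could fill the two columns and continue:

```python
import pandas as pd

def discover_batches(log: pd.DataFrame) -> pd.DataFrame:
    # Placeholder for the real call into batch-processing-analysis;
    # here it only simulates the failure mode from the R traceback.
    raise RuntimeError("must have 'max' > 'min'")

def add_batch_columns(log: pd.DataFrame) -> pd.DataFrame:
    """Annotate the log with batching columns, falling back to safe
    defaults when batch detection fails (e.g. fewer than 2 batches)."""
    try:
        return discover_batches(log)
    except Exception:
        # Treat every event as non-batched instead of propagating the error.
        log = log.copy()
        log["batch_instance_enabled"] = pd.NaT
        log["enabled_time"] = pd.NaT
        return log
```

With this, a log that yields no batches still flows through the rest of the waiting-time analysis instead of aborting the whole job.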
Regarding "Got a blank screen dashboard. No error message or query ID was displayed.": we can probably improve this by displaying at least the status of the running job, i.e. "pending", "running", or "failed". @JonasBerx, how does it work at the moment? Do we have a publicly available URL to try the front end? The response from the WTA Service for the logs above looks like this:
{
  "id": "a7545e07-55e6-11ed-98f8-0242ac140002",
  "status": "failed",
  "error": "error executing analysis: exit status 1; stderr: Traceback (most recent call last):\n File \"/usr/src/app/venv/bin/wta\", line 8, in <module>\n sys.exit(main())\n File \"/usr/src/app/venv/lib/python3.10/site-packages/click/core.py\", line 1130, in __call__\n return self.main(*args, **kwargs)\n File \"/usr/src/app/venv/lib/python3.10/site-packages/click/core.py\", line 1055, in main\n rv = self.invoke(ctx)\n File \"/usr/src/app/venv/lib/python3.10/site-packages/click/core.py\", line 1404, in invoke\n return ctx.invoke(self.callback, **ctx.params)\n File \"/usr/src/app/venv/lib/python3.10/site-packages/click/core.py\", line 760, in invoke\n return __callback(*args, **kwargs)\n File \"/usr/src/app/src/wta/cli.py\", line 27, in main\n result: TransitionsReport = run(log_path=log_path, parallel_run=parallel, log_ids=log_ids)\n File \"/usr/src/app/src/wta/main.py\", line 37, in run\n log = log[[log_ids.case, log_ids.activity, log_ids.resource, log_ids.start_time, log_ids.end_time]]\n File \"/usr/src/app/venv/lib/python3.10/site-packages/pandas/core/frame.py\", line 3511, in __getitem__\n indexer = self.columns._get_indexer_strict(key, \"columns\")[1]\n File \"/usr/src/app/venv/lib/python3.10/site-packages/pandas/core/indexes/base.py\", line 5782, in _get_indexer_strict\n self._raise_if_missing(keyarr, indexer, axis_name)\n File \"/usr/src/app/venv/lib/python3.10/site-packages/pandas/core/indexes/base.py\", line 5845, in _raise_if_missing\n raise KeyError(f\"{not_found} not in index\")\nKeyError: \"['_id'] not in index\"\n",
  "event_log": "http://193.40.11.233/assets/results/a7545e07-55e6-11ed-98f8-0242ac140002/event_log.csv",
  "event_log_md5": "f994b5b939327dcf928420344e86690e",
  "created_at": "2022-10-27T14:01:05.65342141+03:00",
  "finished_at": "2022-10-27T14:01:06.819315414+03:00",
  "column_mapping": {
    "activity": "Activity",
    "case": "_id",
    "end_timestamp": "end_time",
    "resource": "Resource",
    "start_timestamp": "start_time"
  }
}
Could we display the status, and the error message when it's provided, for debugging purposes?
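Incidentally, the KeyError in that response comes from the column mapping pointing `case` at `_id`, which the uploaded CSV doesn't contain. A cheap guard on the wta side, a sketch rather than existing code, would be to validate the mapping against the log's actual columns before subsetting, so users get a readable message instead of a bare pandas KeyError:

```python
import pandas as pd

def validate_column_mapping(log: pd.DataFrame, mapping: dict) -> None:
    """Raise a readable error if the mapping references columns that are
    not present in the log, instead of letting pandas raise a KeyError."""
    missing = sorted(set(mapping.values()) - set(log.columns))
    if missing:
        raise ValueError(
            f"Column mapping refers to columns missing from the log: {missing}; "
            f"available columns: {list(log.columns)}"
        )
```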
I'm still running this log. It's bigger than the ones we usually use in testing, so it takes time. I'll write about it in a separate comment later.
No public URLs; the client currently runs only locally.
When the button is pressed, you get a snackbar saying the analysis has started, and the pressed button remains in a loading animation until the analysis is done. The blank screen comes from something breaking in either the frontend or the backend. I will investigate that when I have a spare moment.
This one has multiple issues:
413 Request Entity Too Large error. The default limit for nginx was something about 20M; it was the first error the frontend was facing.
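The 413 is typically lifted by raising nginx's client_max_body_size. An illustrative override (the actual server block and a sensible limit for our deployment may differ):

```nginx
# Illustrative snippet; the real site config and limit may differ.
server {
    listen 80;
    # Raise the request-body cap so large event-log CSVs can be uploaded
    # without nginx answering 413 Request Entity Too Large.
    client_max_body_size 200M;
}
```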
maximum recursion depth exceeded error. I'll report back later.

Hey,
So I've added catches for the errors, and a check that treats a duplicate as an error as well. The blank screens are not appearing anymore for me, so I think that's fixed.
A small QoL change I made: while polling for the status, if it's still "running", a message pops up again so the user knows the tool is still working :)) @katelashkevich
On a side note, Loan Origination is still taking a long time, so I can't tell for sure whether that one is working as well. Device Repair still seems to have some issues.
Let me know if there are any issues after this update.
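The polling-with-status-messages behaviour described above can be sketched roughly like this. The `/jobs/<id>` response shape is taken from the JSON earlier in the thread; `fetch_job`, `poll_job`, and the endpoint path are illustrative names, not the actual frontend code:

```python
import json
import time
import urllib.request
from typing import Callable

def fetch_job(base_url: str, job_id: str) -> dict:
    """GET the job JSON from the WTA service (endpoint path is an assumption)."""
    with urllib.request.urlopen(f"{base_url}/jobs/{job_id}") as resp:
        return json.load(resp)

def poll_job(job_id: str, fetch: Callable[[str], dict], interval: float = 5.0) -> dict:
    """Poll until the job leaves 'pending'/'running', reporting status each tick."""
    while True:
        job = fetch(job_id)
        status = job.get("status", "unknown")
        print(f"job {job_id}: {status}")  # what the snackbar would show
        if status not in ("pending", "running"):
            if status == "failed" and "error" in job:
                print(job["error"])  # surface the error for debugging
            return job
        time.sleep(interval)
```

Keeping the terminal-state check in one place also makes it easy to surface the "failed" status and error message the service already returns, instead of a blank screen.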
@katelashkevich @JonasBerx The logs in this issue caused multiple problems for the analysis CLI tool:
The batching analysis issue is the longest to fix because it requires a team effort and changes across multiple packages that we develop internally.
The Loan Origination log takes about 20-40 min on my machine and may take longer on the production server, which is even less performant.
The Device log has 0 batches, so it also fails for now, until we fix the batching analysis part.
@katelashkevich The issues should be solved now. The Device Repair (reduced) log output is here: http://193.40.11.233/jobs/6d7763f1-5baf-11ed-bfad-0242ac120002. I've started Loan Origination now; it takes about 30+ min to run, so tomorrow we can check the result at http://193.40.11.233/jobs/db98ead9-5baf-11ed-bfad-0242ac120002.
Device Repair reduced.csv
Device Repair.csv
Loan Origination (2).csv