Closed cmaclell closed 4 years ago
We have a new instance of this problem that suggests it's related to leftover actions after a correct Done button press. Can we confirm this is a reliable piece of the issue?
This is not likely an issue with AL_autorun.js, since all of the logging parameters are passed to CTAT through query strings in the URL. It's more likely that there was an error parsing a particular message, or that the message wasn't sent at all by CTAT.
Here's an example log with the issue; it looks like it might occur after a Done press, although I'm not sure that's the cause: outer_loop_test_bktLog-2019-07-17-16_01_01.txt
So I've been able to replicate this issue on Windows, but not on Linux. It seems the issue has to do with how quickly each operating system handles logging requests. Windows is taking its sweet time on a few things:

1. Resolving the localhost domain name (can be fixed by changing to 127.0.0.1)
2. Perhaps parsing the XML and formatting the data
3. Perhaps writing the data to disk
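To see how much of the per-request delay is name resolution, here is a small sketch that times `socket.getaddrinfo` for `localhost` versus the literal `127.0.0.1`. The host names and port are illustrative, not taken from the actual logger config; the literal address skips the resolver entirely, which is why switching to it helps on Windows.

```python
import socket
import time

def resolve_time(host, port=8080):
    """Time how long getaddrinfo takes for one host/port lookup.

    On some Windows setups "localhost" goes through the full resolver
    stack and adds latency to every request, while the literal address
    "127.0.0.1" needs no name resolution at all.
    """
    start = time.perf_counter()
    socket.getaddrinfo(host, port)
    return time.perf_counter() - start

# Compare the two forms; the literal address avoids the resolver overhead.
print(f"localhost: {resolve_time('localhost'):.6f}s")
print(f"127.0.0.1: {resolve_time('127.0.0.1'):.6f}s")
```

Running this on both operating systems would confirm whether item 1 accounts for most of the gap.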
I've written a unit test which exercises this issue in the /unittests folder. The only real solution is to write a more robust logger that can handle multithreaded asynchronous requests. Someone motivated is welcome to move it over to Flask or something like that; I won't be able to take care of this until after the CHI deadline. If people are desperate, I imagine running a Linux server on AWS with a Selenium browser would work around the issue. However, as the code gets faster, I imagine Linux will also run into issues with the logging requests coming in too fast.
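A minimal sketch of the kind of queue-backed logger described above (the class name and file format here are hypothetical, not the project's actual logger): requests enqueue a message and return immediately, while a single worker thread does the slow writing off the request path, so bursts of requests are never dropped.

```python
import os
import queue
import tempfile
import threading

class AsyncLogger:
    """Queue-backed logger: log() is cheap and thread-safe; one worker
    thread drains the queue and writes rows in arrival order."""

    def __init__(self, path):
        self._queue = queue.Queue()
        self._path = path
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def log(self, message):
        # Called from the request handler; returns immediately.
        self._queue.put(message)

    def _drain(self):
        # One file handle for the logger's whole lifetime.
        with open(self._path, "a") as fh:
            while True:
                message = self._queue.get()
                if message is None:  # shutdown sentinel
                    return
                fh.write(message + "\n")
                fh.flush()

    def close(self):
        # Flush everything already enqueued, then stop the worker.
        self._queue.put(None)
        self._worker.join()

# Demo: enqueue a burst of messages as fast as possible.
log_path = os.path.join(tempfile.mkdtemp(), "demo.log")
logger = AsyncLogger(log_path)
for i in range(100):
    logger.log(f"txn {i}")
logger.close()
```

Because the queue decouples request handling from disk I/O, this approach should hold up even as the clients get faster, on either OS.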
Additionally, I should mention that this issue is not just a matter of missing data in some of the rows. If you carefully count the number of recorded transactions, you will see that some events simply were not recorded, so the issue cannot be solved by just removing the offending rows.
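One way to check for dropped events rather than just blank columns is to look for gaps in a sequential transaction id. This sketch assumes a hypothetical comma-separated row format with the id in the first column, purely for illustration; the real log format may differ.

```python
def missing_transactions(rows, id_field=0):
    """Return sequential transaction ids that were never recorded.

    Blank columns alone leave the row count intact; genuinely dropped
    events leave holes in the id sequence.
    """
    seen = {int(r.split(",")[id_field]) for r in rows if r.strip()}
    return sorted(set(range(min(seen), max(seen) + 1)) - seen)

rows = ["0,start", "1,step", "3,done"]   # transaction 2 was never logged
print(missing_transactions(rows))        # → [2]
```

An empty result means every id was recorded, even if some rows still have blank fields.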
Is it possible that some of the platform-dependent delay comes from re-opening and disposing of the file writer for every log request? Some cursory searching (admittedly a really old thread: https://stackoverflow.com/questions/1842798/python-performance-on-windows) seems to suggest there are platform-dependent speed issues with file I/O operations. If we just maintained a single writer object the whole time, would the problem just go away?
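The two strategies are easy to compare directly. This sketch writes the same rows once with a handle opened per row and once with a single persistent handle, timing each; the file names and row format are made up for the benchmark.

```python
import os
import tempfile
import time

def write_reopening(path, rows):
    """Reopen and close the file for every row (one handle per request)."""
    for row in rows:
        with open(path, "a") as fh:
            fh.write(row + "\n")

def write_persistent(path, rows):
    """Keep a single handle open for the whole run."""
    with open(path, "a") as fh:
        for row in rows:
            fh.write(row + "\n")

rows = [f"txn,{i}" for i in range(1000)]
tmpdir = tempfile.mkdtemp()
for fn in (write_reopening, write_persistent):
    path = os.path.join(tmpdir, fn.__name__ + ".log")
    start = time.perf_counter()
    fn(path, rows)
    print(f"{fn.__name__}: {time.perf_counter() - start:.4f}s")
```

If the gap between the two timings is large on Windows but small on Linux, that would support the single-writer theory; as noted below, though, trying a persistent handle didn't fix the missing transactions.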
Could be; I recreate the file handle every time, so it's worth a shot.
^ tried this today. Didn’t fix it.
@DannyWeitekamp is pretty sure this is fixed.
Occasionally columns in the output log are blank