labscript-suite-oldfinal1 / labscript_utils

Shared modules used by the labscript suite. Includes a graphical exception handler, debug tools, configuration management, cross platform filepath conversions, unit conversions and custom GUI widgets.
Other
0 stars 0 forks source link

Concurrent log handler causes unbearable slowdown #21

Closed philipstarkey closed 5 years ago

philipstarkey commented 6 years ago

Original report (archived issue) by Chris Billington (Bitbucket: cbillington, GitHub: chrisjbillington).


BLACS is intolerably slow to do anything when using the concurrent log handler introduced to fix the logging bug. I suppose all that file opening and closing as the processes exchange locks is is just too much. Toggling a digital out takes ~1 second with all the logging we do, whereas with regular logging it takes ~30ms. This latency adds up everywhere in BLACS, reducing shot throughput and making everything you do a little sluggish.

So back to the drawing board with logging - I'll probably subclass a logging handler to send data over zeromq to a server a-la zlock, which was the original plan.

This sluggishness was observed in Windows 10 and is clearly gone if the logging is switched back to a regular FileHandler. I don't observe the problem on linux.

philipstarkey commented 6 years ago

Original comment by Chris Billington (Bitbucket: cbillington, GitHub: chrisjbillington).


philipstarkey commented 5 years ago

Original comment by Philip Starkey (Bitbucket: pstarkey, GitHub: philipstarkey).


If it's windows 10 only it could be something to do with the disk write caching settings and/or might be minimised by a SSD (if you don't have one already...). Just a thought! Not really a long term solution though (we don't have SSDs, not sure if we've updated to the new logging version though so maybe we haven't seen it).

philipstarkey commented 5 years ago

Original comment by Chris Billington (Bitbucket: cbillington, GitHub: chrisjbillington).


This is addressed by pull request #56. A logging server! What I was itching to do all along, and what will be likely necessary anyway once we have remote devices.

philipstarkey commented 5 years ago

Original comment by David Meyer (Bitbucket: dihm, GitHub: dihm).


FYI, we see this slowdown in our systems too using Windows 7 with SSDs. Programming with the concurrent logger is 0.75s, without is 0.12s.

philipstarkey commented 5 years ago

Original comment by Chris Billington (Bitbucket: cbillington, GitHub: chrisjbillington).


Thanks! if it's not disk write speed from the OS syncing disk caches, then it must just be an accumulation of a lot of system calls, and there might just be some inefficiency there in the way file locking is implemented on Windows. System calls are pretty expensive usually anyway, and since pull request #56 now does them in another thread instead of the main thread (zmq writing to a TCP socket happens in zmq's IO thread), BLACS might even run a little faster than before the concurrent logging handler.

philipstarkey commented 5 years ago

Original comment by Chris Billington (Bitbucket: cbillington, GitHub: chrisjbillington).


philipstarkey commented 5 years ago

Original comment by Chris Billington (Bitbucket: cbillington, GitHub: chrisjbillington).


This was fixed in pull request #56