lukingroup / pylabnet

Client-server, python-based laboratory software
MIT License

Fixes Iss390: Logger crashing #414

Closed cknaut closed 11 months ago

cknaut commented 12 months ago

Summary

Replaced the widget type holding the log messages in the logger from QTextBrowser to QPlainTextEdit (which is optimized for plain text). This increased the maximum amount of text the logging window can hold before crashing from ~20Mio chars to >1000Mio chars (a log file size of 1GB!). This should hopefully increase the uptime of the logger during normal operation from a few days to a few months.

Also fixed a bug in the log file chopping function.

Debugging Steps

The notebook demo/logger_stabilty_test/logger_DoS_attack.ipynb (deleted in the final version) reproduces the logger crash. The crash does not seem to be caused by a memory issue, as memory usage stays low.

Increasing the wait time between two consecutive logging statements does not appear to mitigate the crash. Interestingly, once the logger has crashed, it can be force-quit and restarted. After restarting, the log messages appear again (if the DoS code is still running). This indicates that the filling up of the text box is the issue.

Let's log the number of logged chars before crashing occurs for different parameters:

| n_sleep | num_sentence | n_lgs | max chars before crash |
|---|---|---|---|
| 0.01 | 10000 | 100 | 18Mio |
| 0.1 | 10000 | 100 | 15Mio |
| 0.5 | 10000 | 100 | 18Mio |
| 0.01 | 1000 | 10000 | 80Mio |
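The original notebook is no longer in the branch, so the following is only a minimal sketch of what such a stress loop might look like. It assumes that `n_sleep` is the pause between log calls, `num_sentence` is the number of sentences per message, and `n_lgs` is the number of log calls; the thread does not spell these meanings out. The `spam_logger` function, the `SENTENCE` filler text, and the `log` callable (a stand-in for the real log client) are hypothetical names.

```python
import time
from typing import Callable

# Hypothetical filler text; the sentence used in the original notebook is unknown.
SENTENCE = "This is a test sentence for the logger. "


def spam_logger(log: Callable[[str], None], n_sleep: float,
                num_sentence: int, n_lgs: int) -> int:
    """Send n_lgs messages of num_sentence sentences each.

    Returns the total number of characters handed to `log`, so the value
    reached before a crash can be compared between runs.
    """
    message = SENTENCE * num_sentence   # assumed meaning of num_sentence
    chars_sent = 0
    for _ in range(n_lgs):              # assumed meaning of n_lgs
        log(message)
        chars_sent += len(message)
        time.sleep(n_sleep)             # assumed meaning of n_sleep
    return chars_sent


if __name__ == "__main__":
    received = []                       # stand-in sink instead of a real logger
    total = spam_logger(received.append, n_sleep=0.0, num_sentence=10, n_lgs=100)
    print(f"{total / 1e6:.2f} Mio chars sent in {len(received)} messages")
```

In a real test, `log` would be replaced by the logging call of the client under test, and the running character count would be printed or stored so it survives a crash of the logger window.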

Some progress: After changing the window type of the "terminal" and "buffer_terminal" from QTextBrowser to QPlainTextEdit (which is optimized for plain text), we can spam the logger much more without crashing:

| n_sleep | num_sentence | n_lgs | crash (yes/no) | number of chars |
|---|---|---|---|---|
| 0.01 | 100000 | 100 | no | 500Mio |
| 0.01 | 10000000 | 100 | yes | |
| 0.01 | 1000000 | 100 | yes | |
| 0.01 | 1000000 | 1000 | yes | after logfile was ~1GB --> 1000Mio chars |

For log messages > 100000, we get this error: terminate called after throwing an instance of 'std::bad_alloc'. So here we are actually running into a memory allocation problem. This should not be an issue in practice, though, since single messages will rarely exceed a few thousand chars.

Maximum text size in the logger expanded from ~20Mio to ~1000Mio chars.

Testing some other cases

| n_sleep | num_sentence | n_lgs | crash (yes/no) | number of chars |
|---|---|---|---|---|
| 0.01 | 10 | 100000 | no but got impations | 40Mio |

Now trying to run two clients in parallel, using the following two settings:

| n_sleep | num_sentence | n_lgs |
|---|---|---|
| 0.01 | 10 | 100000 |
| 0.01 | 1000000 | 100 |

The slower loop successfully terminates, so multi-client operation seems stable. Also checked whether staticproxy is working; that seems to be the case.
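As a rough illustration of the two-client test, the sketch below runs two spam loops concurrently against one shared sink. It is not the setup used in this PR: threads stand in for what were presumably separate client processes, `CountingSink` stands in for the logger, and all names (`CountingSink`, `client`, `run_parallel`) are hypothetical. The parameter meanings are assumed as before (pause between calls, sentences per message, number of calls).

```python
import threading
import time


class CountingSink:
    """Thread-safe stand-in for the logger: only counts what it receives."""

    def __init__(self):
        self._lock = threading.Lock()
        self.chars = 0
        self.messages = 0

    def log(self, message: str) -> None:
        with self._lock:
            self.chars += len(message)
            self.messages += 1


def client(sink, n_sleep, num_sentence, n_lgs, sentence="test "):
    """One spam loop: n_lgs messages of num_sentence sentences each."""
    message = sentence * num_sentence
    for _ in range(n_lgs):
        sink.log(message)
        time.sleep(n_sleep)


def run_parallel(sink, settings):
    """Start one client thread per (n_sleep, num_sentence, n_lgs) tuple and wait for all."""
    threads = [threading.Thread(target=client, args=(sink, *s)) for s in settings]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    sink = CountingSink()
    # Scaled-down versions of the two settings, so the sketch finishes quickly.
    run_parallel(sink, [(0.0, 10, 1000), (0.0, 1000, 10)])
    print(sink.messages, "messages,", sink.chars, "chars")
```

If both loops return and the sink has received every message, the shared receiver coped with interleaved traffic from both clients, which is the property the test above was checking.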

pieterjanstas commented 12 months ago

The static proxy connected to the Solomon master is slow now; not sure whether that is due to the upgrades in this branch. It might also just be due to all the DNS attacks throughout the day, in which case the master needs a fresh restart.

pieterjanstas commented 11 months ago

Restarted the master on Solomon and everything seems okay. I'll merge this PR, but let's keep an eye out for any degradation or lagging.

pieterjanstas commented 11 months ago

Actually, it looks like you removed all Confluence-related stuff, as well as the option to filter by lab. Was this on purpose?

cknaut commented 11 months ago

No, that was a merging issue (must not have pulled from master before). This should be corrected during the pull request.
