Closed imnotjames closed 4 years ago
I don't think this is a very good change, considering things like DB locks and other items that may not be thread safe and therefore need to stay on a single thread.
I'm not sure I understand.
The previous implementation spawned a new thread for every single request that came in unless @hook.singlethreaded was used. This inadvertently caused database locks and simultaneous writes to the same file (while, at the same time, not significantly improving performance) unless you applied that hook.
What this does is simplify the threading by effectively applying @hook.singlethreaded to every plugin - setting up a thread pool management system where a single thread is assigned to each plugin.
This also protects us from a poorly written plugin taking up a huge amount of stack space - plugins that take a long time to respond can be abused by requesting the command repeatedly, adding a new thread to the stack each time.
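The idea can be sketched roughly like this - one long-lived thread per plugin draining a queue, so callers enqueue work instead of spawning threads (names like `PluginWorker` and `dispatch` are illustrative, not skybot's actual API):

```python
import threading
from queue import Queue  # Queue.Queue on Python 2


class PluginWorker:
    """Minimal sketch: one long-lived thread per plugin; inputs are
    queued rather than each getting their own thread."""

    def __init__(self, func):
        self.func = func
        self.inputs = Queue()
        self.thread = threading.Thread(target=self._loop)
        self.thread.daemon = True
        self.thread.start()

    def dispatch(self, inp):
        # Callers never spawn threads; they just enqueue work.
        self.inputs.put(inp)

    def _loop(self):
        while True:
            inp = self.inputs.get()
            # Runs serialized on this one thread, much like
            # @hook.singlethreaded applied everywhere.
            self.func(inp)
```

Because each plugin has exactly one thread and a FIFO queue, its commands run in order and the thread count is bounded by the number of plugins, not by request volume.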
Does this address your concern?
Have you tested this across multiple IRC servers and triggering the plugin from all IRC servers at the same time?
That's a good point! I've checked the following cases:
- Multiple IRC servers with the same nick.
- The same IRC server with a different nick but different channels.
- The same IRC server with a different nick but the same channel.
I think there could be performance issues if you were to have a VERY VERY high-throughput bot - eg some commands for a slow plugin would be queued behind that same plugin's commands. However, I think this is preferable over running the bot out of stack space / memory.
I was pointing out a timing attack on singlethreaded plugins across servers; all of those tests except the last one have no bearing on response times.
I'd also advise that you test failures within threads: do threads start again if a plugin has an error in it (e.g., the remote host is down for an API query and the plugin times out)? Does calling the plugin again actually start the plugin again, or is it left that way (unusable)?
Also, I feel the amount of stack memory is fine here. I think you'd trigger the IRC server's anti-spam protection before you'd blow the bot up. We can also just raise the stack size.
Yes, I have tested against exceptions in plugins. This catches them just like it always has.
So any failure will cause that plugin to be unusable from that point onwards?
No, the original behavior for @hook.singlethreaded is to catch the exception, emit a trace, and continue execution, skipping the command.
This is, essentially, the same action taken as the other code path, where a new thread is spawned for every response.
The only difference is that instead of utilizing threads as our queueing mechanism, we have an actual queue backing our requests.
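The difference can be sketched as a worker loop that drains the queue and treats an exception the same way the old per-request path did - trace it and move on. This is an illustrative sketch, not skybot's actual code; `worker_loop` and the `None` sentinel are assumptions:

```python
import traceback
from queue import Queue


def worker_loop(func, inputs):
    """Queue-backed worker sketch: an exception is traced and the
    command skipped, but the same thread keeps serving later commands."""
    while True:
        inp = inputs.get()
        if inp is None:  # sentinel so this sketch can terminate
            return
        try:
            func(inp)
        except Exception:
            traceback.print_exc()  # emit a trace, then keep going
```

The try/except inside the loop is what keeps a raising plugin from killing its thread - later items in the queue are still processed.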
I'd double check that since I remember that it leaves the plugin no longer able to run.
I had tested it with the following dummy plugin:
import random

from util import hook

@hook.command
def error(inp):
    if random.choice([True, False]):
        raise ValueError('error')
    return 'no error'
This causes the following output to be emitted (stdout and stderr interleave, so the traceback ordering isn't exact):
>>> u'PRIVMSG #bot-testing :edward: no error'
03:15:16 #bot-testing <edward> .error
Traceback (most recent call last):
  File "core/main.py", line 117, in start
    self.run(self.func, input)
  File "core/main.py", line 96, in run
    out = func(input.inp)
  File "plugins/error.py", line 7, in error
    raise ValueError('error')
ValueError: error
03:15:17 #bot-testing <edward> .error
>>> u'PRIVMSG #bot-testing :edward: no error'
03:15:18 #bot-testing <edward> .error
Traceback (most recent call last):
  File "core/main.py", line 117, in start
    self.run(self.func, input)
  File "core/main.py", line 96, in run
    out = func(input.inp)
  File "plugins/error.py", line 7, in error
    raise ValueError('error')
ValueError: error
03:15:18 #bot-testing <edward> .error
>>> u'PRIVMSG #bot-testing :edward: no error'
Traceback (most recent call last):
  File "core/main.py", line 117, in start
    self.run(self.func, input)
  File "core/main.py", line 96, in run
    out = func(input.inp)
  File "plugins/error.py", line 7, in error
    raise ValueError('error')
ValueError: error
03:15:20 #bot-testing <edward> .error
>>> u'PRIVMSG #bot-testing :edward: no error'
As you can see, it is calling Handler.run per the trace, and it is able to continue even after an error has been emitted.
A hanging plugin could cause that plugin to stop responding, however. I'd consider that case a benefit of this approach - we won't have threads hanging around with new threads starting up just to hang as well.
If you can think of an edge case which I should test, please let me know! Maybe this isn't the kind of issue you were implying.
I started using the higher-level threading API, and I found that there may be an edge case where a reload could theoretically open a window in which we dispatch against a function but then remove the thread for that function before the input can be applied.
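That window is a classic check-then-act race. A sketch of the idea, with hypothetical `workers` / `dispatch` / `reload_plugin` names (not skybot's actual code), and a lock that would close it:

```python
import threading

workers = {}  # plugin name -> that plugin's input queue (illustrative)
registry_lock = threading.Lock()


def dispatch(name, inp):
    # Without the lock, a reload could remove workers[name] between the
    # lookup and the enqueue, dropping the input on the floor.
    with registry_lock:
        worker = workers.get(name)
        if worker is not None:
            worker.append(inp)  # stand-in for queue.put()
            return True
    return False


def reload_plugin(name):
    # Swap the old worker for the reloaded one atomically, so dispatch
    # never observes a half-removed plugin.
    with registry_lock:
        workers.pop(name, None)
        workers[name] = []
```

Holding one lock around both the lookup-and-enqueue and the remove-and-replace makes the two operations atomic with respect to each other.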
I don't think this simplifies the code more than it hinders it with new potential race conditions. Only a few plugins actually use singlethread, and especially for slow APIs (like WolframAlpha), having concurrent requests is worth it.
I think we should circle back to this once we're on Python 3 and potentially use concurrent.futures to reimplement the singlethread behavior.
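For reference, a rough Python 3 sketch of that concurrent.futures idea - a shared pool for normal hooks, plus a max_workers=1 executor per singlethreaded hook so its commands stay serialized. The `Hooks` class and method names here are hypothetical, not skybot's actual structure:

```python
from concurrent.futures import ThreadPoolExecutor


class Hooks:
    """Sketch: singlethreaded hooks each get a one-thread executor
    (serialized, like @hook.singlethreaded); everything else fans out
    to a shared pool (concurrent, good for slow APIs)."""

    def __init__(self):
        self.pool = ThreadPoolExecutor(max_workers=8)
        self.serial = {}  # hook name -> its single-thread executor

    def submit(self, name, func, inp, singlethreaded=False):
        if singlethreaded:
            ex = self.serial.setdefault(
                name, ThreadPoolExecutor(max_workers=1))
            return ex.submit(func, inp)
        return self.pool.submit(func, inp)
```

A max_workers=1 executor gives the same FIFO, one-at-a-time behavior as a dedicated per-plugin thread, while the shared pool keeps slow plugins from blocking each other.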
Hmm.. okay, I'm fine with that. However, I'm curious - what new race conditions does it introduce?
Right now, every command has a new thread spun up for it that accesses all the globals, and the command thread dies as soon as it completes. With that, you end up with an unlimited number of threads (bounded only by stack space).
By simplification, I meant that I removed a separate path through the code. It would also remove many race conditions, wouldn't it?
This makes it so all commands run on their plugin's own dedicated thread, instead of spawning endless threads that spin up and die. That simplifies the logic for creating plugins and for how they interact with the rest of the bot. This should benefit us because the bot is no longer spinning threads up and tearing them down without any management.