sopel-irc / sopel

An easy-to-use and highly extensible IRC Bot framework. Formerly Willie.
https://sopel.chat

Allow processing of and triggering on Willie's output #117

Closed embolalia closed 11 years ago

embolalia commented 12 years ago

Add a mechanism to allow Willie's output to be run back through and trigger the appropriate functions. To preserve the loop prevention, there will need to be some explicitness to this.

Elad's suggestion was to have a Boolean callable.selftrigger, which indicates if the callable is able to be triggered by a Willie-sent message.

I wonder if, rather than or in addition to that, re-calling should be explicit by the outputting function, and the line should begin processing after the self-exclusion is already handled (i.e., Willie ignores itself unless a module explicitly requests that an output line be reprocessed).
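The opt-in flag discussed above could look roughly like the following. This is only a sketch under assumptions: `dispatch`, `url_title` and `log_line` are hypothetical stand-ins, not Willie's actual internals, and the flag name follows the `callable.selftrigger` suggestion.

```python
def dispatch(callables, bot_nick, sender, text):
    """Return the results of every callable allowed to see this line."""
    from_self = sender == bot_nick
    results = []
    for func in callables:
        # Hypothetical opt-in flag: callables without it never see the
        # bot's own lines, which keeps the default loop prevention intact.
        if from_self and not getattr(func, "selftrigger", False):
            continue
        results.append(func(text))
    return results


def url_title(text):
    # Illustrative callable that has not opted in.
    return "title for " + text


def log_line(text):
    # Illustrative callable that explicitly opts in below.
    return "logged " + text


log_line.selftrigger = True
```

With this shape, a line sent by the bot only reaches `log_line`, while a line from any other nick reaches both callables.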

elad661 commented 12 years ago

I got it to work, kinda, but it's problematic with url.py for example because it'll loop on its own output until loop prevention is called...

Will need to think better on how we make this so a callable can't trigger itself like that... difficult.

lramati commented 12 years ago

what if the module can define a dict of extra values to be passed on to the called module? that way for url.py it could pass {self-triggered: True} and just check for that before calling itself. Or, you could have the calling be a command from WrappedWillie that can be told to use the line output by willie as input, or you could define your own line to be processed (in url.py's case, just override the default line with the same thing, just minus the URL or something)

lramati commented 12 years ago

some implementation ideas for the first thing, have dispatch() take a dict which defaults to {} that gets fed to trigger by doing

for name, value in extra.items():  # extra: the dict passed to dispatch()
    setattr(trigger, name, value)

elad661 commented 12 years ago

No. Dispatch starts each module in a new thread. Having such a thing in dispatch will cause a lot of race conditions (or blocking locks, which will slow everything down).

I'll take a poke at this again later.

embolalia commented 12 years ago

I think it makes the most sense for the re-trigger to be explicit. We only really care about a few functions' output being reprocessed. If re-processing is on explicit request only, it will greatly reduce the number of loop issues with this.

I think it would probably be relatively easy to call a function which creates a new Trigger (a whole new instance, while still within the calling function's thread), and then inject that back into the normal flow (probably just after it checks the nick of the trigger against its own). As far as I know, only a very small number of functions ever need to re-process their output (is it anything other than remind and tell?). Given that, and given how easily a loop condition can be created, I think this approach makes the most sense.

elad661 commented 12 years ago

Having a self_triggered var in Trigger would help. Plus, I think every module that uses callable.self_trigger = True should simply have its own checks to make sure it doesn't repeat itself.

E.g., in url.py, we would want to check whether the message we are about to send is the same message that triggered us, and if so, not send it.
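A per-module check of that kind might be as small as the sketch below. The helper name `should_send` and its parameters are assumptions for illustration; nothing here is taken from url.py itself.

```python
def should_send(reply, trigger_text, self_triggered):
    """Return False when a self-triggered callable would echo its own trigger."""
    # Only suppress when the bot's own output triggered us AND the reply
    # would be identical to that output (ignoring surrounding whitespace).
    if self_triggered and reply.strip() == trigger_text.strip():
        return False
    return True
```

This only catches exact repeats. A module whose output changes on every pass would still need the general loop prevention.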

It's easier to implement @embolalia's approach.

lramati commented 12 years ago

To clarify, is this so a command can reprocess its own output, or so that it can ask anything to reprocess its output? Because if it's the first, why can't it just edit its trigger and call itself all over again?

lramati commented 12 years ago

is there a list of commands that will be reworked to reprocess themselves somewhere? because i suggest sed

elad661 commented 12 years ago

No, that would make absolutely no sense.

This issue is about commands processing outputs of other commands, for example url parsing the output of tell.

lramati commented 12 years ago

then your idea for blocking output is only a partial fix.

elad661 commented 12 years ago

@firerogue no, you clearly don't understand the issue at hand. I'm going to have to ask you to stop interfering, because your comments are essentially just noise.

embolalia commented 11 years ago

I recently noticed another use case for reprocessing:

If someone posts a bit.ly link that points to a YouTube video, or to any other link handled by a module other than url, it gets url.py's response rather than the other module's. url.py responds to any URL and follows redirects, while the other modules respond only to specific URLs and have no way of knowing whether a short URL redirects to one of them. So one use case would be to expand the URL before any module processes anything, and then send the expanded URL through. I don't know if this could be done with the same mechanism we were discussing for e.g. tell.

embolalia commented 11 years ago

Thinking of it, there are two separate but related cases here:

  1. Output reprocessing. In this, a callable wants its output (or perhaps an arbitrary string?) to be run through other callables, and have the output of said callables be sent appropriately.
  2. Input preprocessing. In this, a callable wants to intercept incoming messages and modify them, before other callables get their hands on them.

I propose the following for each case (all names for things are provisional and open for comment):

  1. Reprocessing: A function is added to the Willie class called something like reprocess, which is called explicitly by the callable (e.g. the result of tell). The easiest way to do this would be for reprocess to take a string. It would also need to take a channel, and possibly also an event, in some way. Perhaps the easiest way would be to make it take a string and a trigger (that is, the trigger that started the calling function, perhaps inferred like in say). Other callables could define an attribute, like say accept_reprocess, which can be set to True or False. I'm not sure which would be default. reprocess would create a new Trigger and dispatch it as usual, but only to those with accept_reprocess set to True.
  2. Preprocessing: Modules can define special functions, denoted either by name or a special attribute, which take a Trigger (or possibly Origin, depending on implementation). At some point prior to matching input against callables' rules, all of these functions are run, one at a time and without threading. They mutate the object they take, giving it a new attribute with the modified value. The "actual" message should be preserved, since I can see some modules wanting it. I'm not sure yet whether the mutated or actual message should become the apparent message (i.e. the result of unicode(trigger)), but the mutated message should be the one matched against the callables' rules.
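To make the two proposals concrete, here is a rough sketch of how they might fit together. Everything in it is an assumption for illustration: the `Trigger` and `Bot` classes are minimal stand-ins rather than Willie's real classes, `handle`, `raw`, `text`, `PENDING` and the `sho.rt` short link are invented names, and the choice to skip preprocessors on reprocessed lines is arbitrary.

```python
class Trigger:
    def __init__(self, nick, channel, text):
        self.nick = nick
        self.channel = channel
        self.raw = text   # the "actual" message, always preserved
        self.text = text  # the (possibly mutated) text that rules match against


class Bot:
    def __init__(self):
        self.callables = []
        self.preprocessors = []
        self.sent = []  # records output instead of writing to a socket

    def say(self, channel, message):
        self.sent.append((channel, message))

    def handle(self, trigger, reprocessed=False):
        if not reprocessed:
            # Preprocessing: hooks run one at a time, without threading,
            # before any callable gets to look at the trigger.
            for hook in self.preprocessors:
                hook(trigger)
        for func in self.callables:
            # Reprocessing: only callables that opted in see the new trigger.
            if reprocessed and not getattr(func, "accept_reprocess", False):
                continue
            func(self, trigger)

    def reprocess(self, text, trigger):
        # Build a whole new Trigger from the caller's trigger and dispatch it.
        new_trigger = Trigger(trigger.nick, trigger.channel, text)
        self.handle(new_trigger, reprocessed=True)


PENDING = {"bob": "see example.com/watch"}  # messages waiting for a nick


def expand_short_url(trigger):
    # Stand-in for following a redirect; a real hook would do a lookup.
    trigger.text = trigger.text.replace("sho.rt/abc", "example.com/watch")


def tell(bot, trigger):
    message = PENDING.pop(trigger.nick, None)
    if message is not None:
        line = trigger.nick + ": " + message
        bot.say(trigger.channel, line)
        bot.reprocess(line, trigger)  # explicit request, never implicit


def url_title(bot, trigger):
    if "example.com" in trigger.text:
        bot.say(trigger.channel, "[ title of " + trigger.text + " ]")


url_title.accept_reprocess = True

bot = Bot()
bot.preprocessors.append(expand_short_url)
bot.callables.extend([tell, url_title])
bot.handle(Trigger("bob", "#chan", "hello"))
```

Because `tell` does not set `accept_reprocess`, it cannot be re-entered by its own output, so the loop risk is limited to callables that opt in.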

tldr The questions that need to be answered about functionality before starting implementation are:

embolalia commented 11 years ago

Regarding reprocessing, I have a (lazy) possible solution: callback functions. For example, a module which does url parsing (youtube, reddit, et al) already registers a regex in willie.memory, which the titling in url.py uses to ignore things that are already covered. If a callback function is associated with that regex, url could then call that in situations where it wants to re-process a url. Of course, each module would directly process urls in normal situations, as it does now. But .title, modified as was requested in issue #219 to show the title information of the last seen url, could match the url against the list of regexes and call the callback function on a match. This way, the system remains modular, there's no real risk of a loop, and we don't have to put real thought into reprocessing the entire trigger.

tldr Reprocessing will be explicit by user's command. Each general thing for which reprocessing is done (like urls in the example above) will define such a command, and, if such needs to span multiple modules, a standard for storing callbacks in willie.memory.
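A minimal sketch of the callback idea follows, assuming a plain dict stands in for willie.memory. The `url_callbacks` key, `register_url_callback`, `title_command` and both handler functions are hypothetical names chosen for illustration, not an existing API.

```python
import re

# Stand-in for willie.memory; the key name and layout are assumptions.
memory = {"url_callbacks": {}}


def register_url_callback(pattern, callback):
    """Associate a callback with the regex a module already registers."""
    memory["url_callbacks"][re.compile(pattern)] = callback


def youtube_info(url):
    # Illustrative site-specific handler.
    return "YouTube video: " + url


def generic_title(url):
    # Illustrative fallback, standing in for url.py's generic titling.
    return "Title of " + url


def title_command(last_url):
    """What a `.title` command could do with the last seen URL."""
    for regex, callback in memory["url_callbacks"].items():
        if regex.search(last_url):
            return callback(last_url)
    return generic_title(last_url)


register_url_callback(r"youtube\.com/watch", youtube_info)
```

Nothing is fed back into the dispatcher here: `.title` calls at most one callback directly, which is why this approach carries no real risk of a loop.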

embolalia commented 11 years ago

The solution mentioned in my last comment is probably the best we're going to come up with, so I'm closing this.