Architectural changes for next Jasper version

Holzhaus commented 9 years ago

While thinking about the design of the next Jasper release, I hit some stumbling blocks that require architectural changes in Jasper.

Here are some ideas, open for discussion. I have already created a working sample implementation that covers these points:

Adding a config manager that holds options inside sections. During initialization, every config option will be registered with a default value and a description. If a config key does not exist, the default value will be returned. If no config file exists, we can create a commented config skeleton that includes all available config options. If we want to keep populate.py's functionality, we can simply implement a function that iterates through all registered options and prints description and default_value.
Adding a plugin manager. Every plugin can be loaded and activated. Different types of plugins (e.g. STT engine or TTS engine) are categorized by inheriting from a category base class (e.g. AbstractSTTEngine). Plugins need a metadata file (ini-file format) that states name, slug, version, description, license, etc. Plugins can also be placed into ~/.jasper/plugins so that every user can install his/her own plugins.
A new kind of conversation superclass. Jasper can have multiple unrelated conversation objects at once that handle input/output (e.g. via microphone/speakers, via sockets or something else like that). If they get input, they delegate this input to the respective plugins and output the replies to the user (e.g. if the user inputs something via microphone, the reply is played back on the loudspeakers; if the user inputs text via a terminal, the reply is printed to that terminal; if the user inputs text via a socket, the reply is also written to that socket connection, etc.). These different types of conversation subclasses could also be implemented as plugins.
Using command phrases instead of single words. We should provide means to parse phrases instead of single words. The status quo is that modules provide a WORDS constant. This works for very simple modules like "News" or "Weather", but let's say I want to switch my lights in different rooms on or off: Right now, this would require a lot of custom parsing code. Also, the current approach makes it impossible for STT engines to profit from grammar-based vocabulary and also prevents us from making Jasper multilingual. I've refined the ideas from PR #134, so that plugins can define whole phrases with placeholders (e.g. SWITCH {location} LIGHTS {state} or something like that).
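
To make the config manager idea from the first point concrete, here is a minimal sketch in Python 3 (3.7+ for ordered dicts). It is not taken from the sample implementation: the class name ConfigManager, its methods, and the option names are hypothetical, and reading the values from an actual config file is left out.

```python
class ConfigManager(object):
    """Hypothetical registry of config options, grouped into sections."""

    def __init__(self, values=None):
        # values: nested dict {section: {key: value}}, e.g. parsed from a file
        self._values = values or {}
        # section -> key -> (default, description)
        self._options = {}

    def register(self, section, key, default, description):
        """Register an option together with its default and description."""
        self._options.setdefault(section, {})[key] = (default, description)

    def get(self, section, key):
        """Return the configured value, or the registered default if unset."""
        default, _description = self._options[section][key]
        return self._values.get(section, {}).get(key, default)

    def skeleton(self):
        """Build a commented config skeleton listing every registered option."""
        lines = []
        for section, options in self._options.items():
            lines.append("[%s]" % section)
            for key, (default, description) in options.items():
                lines.append("# %s" % description)
                lines.append("#%s = %s" % (key, default))
            lines.append("")
        return "\n".join(lines)


# Only "language" is set by the user; "plugin_dir" falls back to its default.
config = ConfigManager({"general": {"language": "de_DE"}})
config.register("general", "language", "en_US",
                "Language used for speech input and output")
config.register("general", "plugin_dir", "~/.jasper/plugins",
                "Directory for user-installed plugins")
```

Because every option carries its own description, printing config.skeleton() would cover what populate.py does today without a separate list of questions.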

Also have a look at this amateurish diagram:

Jasper architecture mockup

What is still missing in my sample implementation:

Adding an easy way to make your plugin multilingual using gettext. IMHO, plugins should include a locale folder with mo files for different languages and can be filtered based on their language. If the user sets the language to French, but a plugin supports only English and Spanish, it will be filtered out. If we're using command phrases instead of single words, it'll be easy to translate.

@crm416 @shbhrsaha Any thoughts/criticism/feedback?

shbhrsaha commented 9 years ago

Haha, love the drawing.

Good overall first impression. Charlie and I'll think about this for a few days and provide some more specific feedback! Thanks for putting a lot of thought into this.

shbhrsaha commented 9 years ago

Thanks for waiting-- we just finished up with exams on this end!

I'm excited about all of these ideas. Here are my quick reactions for discussion:

1. I love the idea of a config manager and getting rid of populate.py altogether. ~/.jasper/plugins would also be fantastic to have.
2. Initially the idea of supporting socket-based input struck me as feature creep, but I'm warming up to the idea. I think voice control still defines Jasper-- are there compelling applications for socket-based input?
3. Open to the idea of command phrases over single words. Even something as simple as mapping words to variables would probably be helpful for module developers.

So generally: I think these are good features! Eager to see the implementation you mentioned

Holzhaus commented 9 years ago

Concerning 2: Instead of using Sockets directly, I used the XMLRPC server built into Python. The usage example would be either to use it for testing or to create an Android App that uses the Google API to transcribe speech directly on the phone, then sends the command to Jasper and then outputs the answer on the phone again.

Concerning 3: The good thing: You can still use a phrase consisting only of a single word (like "WEATHER"). So the simple modules will need to change the API calls, but the usage remains the same. More complex modules like MusicMode could stop running in their own "mode" entirely, but register a phrase like "Load playlist {playlist}" globally instead, which would simplify and speed up usage a lot.
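
A rough sketch of how globally registered phrases with placeholders could be matched against transcribed text (Python 3). This is an illustration under assumptions, not the code from the draft: compile_phrase, PhraseRegistry and the handlers are hypothetical names, and a real STT-backed grammar would likely look different from this plain regex matching.

```python
import re


def compile_phrase(phrase):
    """Turn e.g. 'SWITCH {location} LIGHTS {state}' into a compiled regex."""
    # re.split with a capturing group alternates literal text and
    # placeholder names: ['SWITCH ', 'location', ' LIGHTS ', 'state', '']
    parts = re.split(r"\{(\w+)\}", phrase)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2:  # odd indices hold placeholder names
            pattern += "(?P<%s>.+?)" % part
        else:
            pattern += re.escape(part)
    return re.compile("^" + pattern + "$", re.IGNORECASE)


class PhraseRegistry(object):
    """Hypothetical global registry mapping phrases to handler callables."""

    def __init__(self):
        self._handlers = []

    def register(self, phrase, handler):
        self._handlers.append((compile_phrase(phrase), handler))

    def dispatch(self, text):
        """Call the first handler whose phrase matches; None if none match."""
        for regex, handler in self._handlers:
            match = regex.match(text.strip())
            if match:
                return handler(**match.groupdict())
        return None


registry = PhraseRegistry()
# A single-word phrase still works, so simple modules keep their usage.
registry.register("WEATHER", lambda: "weather report")
registry.register("LOAD PLAYLIST {playlist}",
                  lambda playlist: "loading %s" % playlist)
registry.register("SWITCH {location} LIGHTS {state}",
                  lambda location, state: (location, state))
```

With this, registry.dispatch("Load playlist road trip") reaches the playlist handler directly, without entering a separate mode first.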

I'll upload my example implementation soon.

Holzhaus commented 9 years ago

There it is: https://github.com/Holzhaus/jasper-plugin-draft The localization part is still missing.

Please have a look and tell me what you think.

shbhrsaha commented 9 years ago

Ah, OK. Yes, the applications for XMLRPC seem to be compelling use cases, so let's go forward with that too then.

Good point about global grammar registration-- I too would prefer that MusicMode not steal its own "mode".

The plugin draft looks fantastic. I like the inheritance from SpeechHandlerPlugin and .jasperplugin config files.

(Side note... you might've dealt with this already: I had to cast the slugify arguments in unicode() to get app.py to run) https://github.com/Holzhaus/jasper-plugin-draft/blob/master/core/configmanager.py#L48 https://github.com/Holzhaus/jasper-plugin-draft/blob/master/core/pluginmanager.py#L292

Holzhaus commented 9 years ago

Whoops, I accidentally tested with Python 3 instead of Python 2. In Python 3, all strings are unicode. I'll add that as soon as I've finished my exams ;-)

What's still missing is a good way to implement the multi-language stuff. I'd like to use a directory structure like this, so that every plugin has the translations stored in its own folder.

plugins/
   plugin1/
     plugin1.jasperplugin
     plugin1.py
     locale/
       en_US/
         plugin1.mo
       de_DE/
         plugin1.mo

Holzhaus commented 9 years ago

@shbhrsaha @crm416 After I finally finished both my exam period and the massive chillout period that necessarily followed it, I had some time to work on the plugin system draft. I now have multi-language support (based on gettext) working for speech handler plugins. Thus, most of the basic plugin stuff is finished. Please have a look at the respective git repo.

For testing purposes, you can change the default language in app.py to de_DE: All plugins that do not support that language should be rejected automatically.

The plugin structure has been simplified a bit:

plugins/
   plugin1/
     plugin1.jasperplugin
     plugin1.py
     languages/
       en_US.mo
       de_DE.mo
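
With that layout, rejecting plugins that lack the configured language could look roughly like the sketch below (Python 3). The function names supported_languages and filter_plugins are made up for illustration and are not the ones used in the draft repo; only the languages/ folder with one .mo file per language is taken from the structure above.

```python
import os


def supported_languages(plugin_dir):
    """Return the language codes a plugin ships .mo files for."""
    lang_dir = os.path.join(plugin_dir, "languages")
    if not os.path.isdir(lang_dir):
        return set()
    return {
        os.path.splitext(name)[0]  # 'de_DE.mo' -> 'de_DE'
        for name in os.listdir(lang_dir)
        if name.endswith(".mo")
    }


def filter_plugins(plugins_root, language):
    """Return names of plugin folders that support the given language."""
    accepted = []
    for name in sorted(os.listdir(plugins_root)):
        plugin_dir = os.path.join(plugins_root, name)
        if not os.path.isdir(plugin_dir):
            continue
        if language in supported_languages(plugin_dir):
            accepted.append(name)
    return accepted
```

This only inspects file names, so it works before any plugin code is imported.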

There are still some remaining issues:

We still need to find a nice way to detect supported languages for TTS/STT engines BEFORE loading the plugin. Not sure this is possible at all, except defining it in the *.jasperplugin file. Any ideas?
What if a British guy uses Jasper? I predict that most plugins won't have an en_GB.mo file. Shall we reduce the available set of languages to just the base language (without the region, i.e. en.mo instead of en_US.mo)? Or shall we allow the language config key to be a comma-separated list of languages (ordered by preference)? A third option: we define that plugins should always have the base language mo file (en.mo) and can optionally add the regional language file, so that the user can safely set his language to en_GB - if en_GB.mo does not exist, en.mo is used (But what is en? British English or American?).
Do we really want EventHandler (a.k.a. Notifier) Plugins? I'm not fond of them, because I do not want Jasper to suddenly start talking without me requesting it. That could be annoying and/or cause heart attacks. IMHO stuff should be just checked on demand, not as a background process. What do you think?
Every plugin should have a boolean enabled config option. That's not too hard, but it's still missing. EDIT: Done
Other stuff that I probably forgot.
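
For the en_GB question, the second and third options can be combined. A small sketch (Python 3, hypothetical helper names, assuming language codes of the form xx or xx_YY): each entry of a comma-separated preference list is tried first with its region and then as the bare base language.

```python
def fallback_chain(preferences):
    """Expand 'en_GB, de_DE' into ['en_GB', 'en', 'de_DE', 'de']."""
    chain = []
    for language in preferences.split(","):
        language = language.strip()
        if not language:
            continue
        # Try the regional variant first, then the base language.
        for candidate in (language, language.split("_")[0]):
            if candidate not in chain:
                chain.append(candidate)
    return chain


def pick_language(preferences, available):
    """Return the first preferred language a plugin offers, else None."""
    for candidate in fallback_chain(preferences):
        if candidate in available:
            return candidate
    return None
```

A plugin that ships only en.mo would then still be accepted for a user who configured en_GB, and a plugin with no match at all would be filtered out as before.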

Holzhaus commented 9 years ago

Another thing just came to my mind. I'd like to put individual plugin initialization into the plugin's activate() method, e.g. SpeechHandlerPlugins should register their phrases there, and also STTPlugins should compile their vocabularies inside this method, etc. This raises 2 issues:

Plugins would have to be activated in a particular order, i.e. SpeechHandlerPlugins will have to be activated before STTPlugins (because they need the CommandPhrases to be registered to be able to compile their vocabularies).
I really do not know how to pass the phrases over to the STTPlugins in order to compile the vocabularies. We could either add an additional method compile_vocabulary to the STTPlugin specs that will be called after activating the SpeechHandlerPlugins, but before activating the STTPlugins - or we add an additional argument to the STTPlugins' activate method. Neither option is that clean, but I dislike the latter more than the former. Any thoughts?
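
To illustrate the first option, here is a sketch of the activation order with a separate compile_vocabulary step (Python 3). The class bodies are placeholders I made up for the example: the real plugin base classes do more, a real vocabulary is not just a sorted word list, and activate_all is a hypothetical helper.

```python
class SpeechHandlerPlugin(object):
    """Placeholder base class: registers its phrases on activation."""
    phrases = ()

    def activate(self, registry):
        registry.extend(self.phrases)


class STTPlugin(object):
    """Placeholder base class: needs a vocabulary before activation."""

    def __init__(self):
        self.vocabulary = None
        self.active = False

    def compile_vocabulary(self, phrases):
        # Stand-in for real vocabulary compilation: collect unique words.
        self.vocabulary = sorted(
            {word for phrase in phrases for word in phrase.split()})

    def activate(self):
        if self.vocabulary is None:
            raise RuntimeError("compile_vocabulary() must run before activate()")
        self.active = True


def activate_all(speech_handlers, stt_plugins):
    """Activate plugins in the order the phrase registration requires."""
    phrases = []
    for plugin in speech_handlers:   # 1. register all command phrases
        plugin.activate(phrases)
    for plugin in stt_plugins:       # 2. hand the phrases to the STT engines
        plugin.compile_vocabulary(phrases)
    for plugin in stt_plugins:       # 3. only now activate the STT engines
        plugin.activate()
    return phrases
```

The ordering lives in one place (activate_all), so the signature of activate() stays the same for the STT plugins.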

shbhrsaha commented 9 years ago

Sorry for the late reply, Jan. Congrats on finishing exams! We're about to start that period on our end.

Your plugin work looks slick. My responses to your plugin issues:

Yes, I think the *.jasperplugin file would be a reasonable place to define a plugin's supported languages
Interesting challenge with en_US vs. en_GB. I think I like your third proposal best-- to require that plugins have a base language .mo file. For plugin developers, that solution helps new devs get "up and running" quickly, while still offering flexibility for those who want it. Do we necessarily need to decide whether en is en_US or en_GB by default? In other words, would associating a default change how Jasper loads the module for anyone?
Yes. In my view too, notifier plugins haven't been popular. I think we can safely drop them.
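
As a rough idea of what declaring languages in the *.jasperplugin file might look like, here is a sketch using Python 3's configparser. The section name [Plugin] and the languages key are assumptions for illustration only; the draft's actual metadata format may differ.

```python
import configparser

# Hypothetical metadata file contents; section and key names are assumed.
SAMPLE_METADATA = """\
[Plugin]
name = Weather
slug = weather
version = 1.0.0
license = MIT
languages = en, de
"""


def read_metadata(text):
    """Parse plugin metadata without importing the plugin itself."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["Plugin"]
    # A missing 'languages' key yields an empty list.
    raw_languages = section.get("languages", fallback="")
    return {
        "name": section["name"],
        "slug": section["slug"],
        "version": section["version"],
        "languages": [
            lang.strip() for lang in raw_languages.split(",") if lang.strip()
        ],
    }


metadata = read_metadata(SAMPLE_METADATA)
```

Since only the ini file is read, the supported languages of a TTS/STT engine would be known before the plugin is loaded.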

In response to your ideas about SpeechHandlerPlugins/STTPlugins:

Yes, I agree SpeechHandlerPlugins should be processed before STTPlugins.
Of the two solutions you proposed, I too like the first option better. I'm wondering though: why not pass the vocabulary in as an argument to the STTPlugin's activate() method?

jasperproject / jasper-client

Architectural changes for next Jasper version #280