we hit our stretch goal for persona!

this issue will document the progress

Framework

(during ovos-core 0.0.9 dev cycle)

Why

Voice interface

[ ] base intents to enable/disable a persona (default config)
[x] "chat with persona" intent (capture all utterances in converse)
[x] "ask {query} to persona" intent (single shot explicit query)

How

code lives here: https://github.com/OpenVoiceOS/ovos-persona

[x] solvers service
[x] bus api to query a specific persona (persona as a service)
[x] persona definitions provided via OPM (usually default personas shipped with solver plugins)
[x] loading of user defined personas (.json files)
[ ] dynamic registering of new persona definitions (eg, from skills)
[ ] ovos-persona cli entrypoint, a standalone launcher of persona service as a fallback skill (compat with older ovos-core versions)

Session

(during ovos-core 0.0.8 dev cycle)

Why

consider the utterance `"chat with {persona_name}"`: in a multi user setup, this makes the assistant answer all questions from every client using the requested persona

what happens when different users are accessing core? whenever someone changes the persona it changes for everyone!

image: a typical hivemind setup, we want each client to interact with a specific persona

How

make the persona configuration come from the message.context so it is per query, not global. this is defined via a [Session](). in the future this also allows OVOS to become user aware per query

the following properties in OVOSSkill should reflect session data live, so skills need no explicit handling of Sessions at all:

self.lang
self.location
self.timezone
self.config
...
self.persona
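
A minimal sketch of the per-query lookup described above, using plain dicts instead of real bus messages. The names `GLOBAL_CONFIG` and `resolve_setting` are hypothetical and are not part of ovos-bus-client; the only assumption taken from the text is that session data travels in the message context.

```python
# Hypothetical sketch: GLOBAL_CONFIG and resolve_setting are illustrative
# names, not part of ovos-bus-client or OVOSSkill.
GLOBAL_CONFIG = {"persona": "default", "lang": "en-us"}


def resolve_setting(message_context, key, global_config=GLOBAL_CONFIG):
    """Return the session value for `key`, falling back to the global config."""
    session = message_context.get("session", {})
    return session.get(key, global_config.get(key))


# two clients talking to the same core, each with its own session
ctx_kitchen = {"session": {"session_id": "kitchen", "persona": "chef"}}
ctx_office = {"session": {"session_id": "office"}}

print(resolve_setting(ctx_kitchen, "persona"))  # chef
print(resolve_setting(ctx_office, "persona"))   # default
```

Because the value is resolved per message, one client changing its persona does not affect any other client.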

session implementation:

[x] initial implementation https://github.com/OpenVoiceOS/ovos-bus-client/blob/dev/ovos_bus_client/session.py
[x] transparently keep session around via ovos-bus-client - https://github.com/OpenVoiceOS/ovos-bus-client/pull/20
[x] get_response per client - https://github.com/OpenVoiceOS/ovos-core/pull/160 + https://github.com/OpenVoiceOS/OVOS-workshop/pull/68
[x] intent context per client (intent layers) - https://github.com/OpenVoiceOS/ovos-bus-client/pull/19 + https://github.com/OpenVoiceOS/ovos-core/pull/308
[x] active skills list per client - https://github.com/OpenVoiceOS/ovos-core/pull/350
[x] pipeline from Session https://github.com/OpenVoiceOS/ovos-core/pull/352 + https://github.com/OpenVoiceOS/ovos-bus-client/pull/49
[ ] persona from Session
[ ] user preferences (mycroft.conf) from session

footnote: this is the reason ovos-bus-client was created instead of OVOS still using mycroft-bus-client

Pipeline

(during ovos-core 0.0.8 dev cycle)

Why

consider LLMs and how they interact with skills/user commands: how do we know when to use a skill and when to ask an LLM? Factual info usually comes from the web, "chatbot speech" should come from a persona. do we just hardcode this in the utterance handling logic?

each persona handles questions via a selection of [solver plugins](), which can be directly implemented as a FallbackSkill for example

we want to be able to define where in the intent stage this happens, this also allows the pluginification of the intent systems and eventually even those can be replaced with a LLM if desired, giving a persona bias even to intent selection

image: intent service (in green) should be configurable, globally and per persona
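
To illustrate the idea of a configurable pipeline, here is a rough sketch in which the order of matching stages is just a list that could come from the global config or from a Session. The stage names, the `MATCHERS` table and `match_pipeline` are all hypothetical, this is not the ovos-core implementation.

```python
# Hypothetical sketch: stage names and matcher functions are made up for
# illustration, they are not the real ovos-core pipeline components.
MATCHERS = {
    "intents": lambda utt: "weather.intent" if "weather" in utt else None,
    "persona": lambda utt: f"persona answer to: {utt}",
}


def match_pipeline(utterance, pipeline, matchers=MATCHERS):
    """Try each stage in order, return (stage, result) for the first match."""
    for stage in pipeline:
        matcher = matchers.get(stage)
        if matcher is None:
            continue  # stage not installed, skip it
        result = matcher(utterance)
        if result is not None:
            return stage, result
    return None, None


default_pipeline = ["intents", "persona"]  # skills first, persona as fallback
persona_first = ["persona", "intents"]     # e.g. requested via a Session

print(match_pipeline("what is the weather", default_pipeline))
print(match_pipeline("what is the weather", persona_first))
```

The same utterance is handled by a skill intent or by the persona depending only on the order of the list, which is what makes the behaviour configurable per persona.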

How

[x] introduce the concept of intent pipeline https://github.com/OpenVoiceOS/ovos-core/pull/336
[ ] PipelinePlugin base class - https://github.com/OpenVoiceOS/ovos-plugin-manager/pull/177
[ ] standardize bus api for intent service https://github.com/OpenVoiceOS/ovos-utils/pull/193
[ ] migrate skills to new intent service api https://github.com/OpenVoiceOS/OVOS-workshop/pull/132
[ ] adopt pipeline plugins in ovos-core 0.0.8 https://github.com/OpenVoiceOS/ovos-core/pull/349
[ ] make persona itself a pipeline plugin https://github.com/OpenVoiceOS/ovos-core/pull/370
[x] allow persona to select pipeline matchers (via Session) https://github.com/OpenVoiceOS/ovos-core/pull/352
[ ] make a persona intent engine (another pipeline plugin), use an LLM to select the final intent to trigger, not only to generate answers https://gist.github.com/JarbasAl/e07e17a2d98a80eb6bf60139acc1b9c7
[ ] define the default persona.json for OpenVoiceOS
[ ] enable persona by default in ovos-core 0.0.9 (fallback stage)

Solvers

(during ovos-core 0.0.9 dev cycle)

Why

once added to the pipeline, personas need to be able to answer arbitrary questions, they also need to handle input in multiple languages

persona definitions include a list of solver plugins and respective configs, an utterance is sent to the solvers until one can answer the question.

"persona": {
    "solvers": [
        "ovos-solver-plugin-llamacpp",
        "ovos-solver-plugin-personagpt",
        "ovos-solver-failure-plugin"
    ],
    "ovos-solver-plugin-llamacpp": {
        "persona": "helpful, creative, clever, and very friendly"
    },
    "ovos-solver-plugin-personagpt": {
        "facts": [
            "i am a quiet engineer.",
            "i'm single and am looking for love.",
            "sadly, i don't have any relatable hobbies.",
            "luckily, however, i am tall and athletic.",
            "on friday nights, i watch re-runs of the simpsons alone."
        ]
    }
}

How

solver plugins have automatic bidirectional translation so they can understand and answer in any language, even if the implementation is language specific

the persona definition specifies solver configs and the order in which they are tried
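
The "try solvers in order until one answers" behaviour can be sketched roughly as below. This is only an illustration under assumptions: the solver names, the `INSTALLED_SOLVERS` table and `ask_persona` are hypothetical, and real solvers are plugin classes rather than plain functions.

```python
# Hypothetical sketch: solver names and functions are invented for
# illustration, real solvers are OPM plugin classes.
def dictionary_solver(query, config):
    """Answer only questions listed in its own config."""
    return config.get("facts", {}).get(query.lower().rstrip("?"))


def failure_solver(query, config):
    """Always answers, so the persona never stays silent."""
    return config.get("message", "i don't know the answer to that")


INSTALLED_SOLVERS = {
    "hypothetical-dictionary-solver": dictionary_solver,
    "hypothetical-failure-solver": failure_solver,
}


def ask_persona(persona, query, solvers=INSTALLED_SOLVERS):
    """Send the query to each solver in the persona's order, first answer wins."""
    for name in persona["solvers"]:
        solver = solvers.get(name)
        if solver is None:
            continue  # plugin not installed
        answer = solver(query, persona.get(name, {}))
        if answer:
            return answer
    return None


persona = {
    "solvers": [
        "hypothetical-dictionary-solver",
        "hypothetical-missing-solver",
        "hypothetical-failure-solver",
    ],
    "hypothetical-dictionary-solver": {
        "facts": {"what is ovos": "an open source voice assistant"}
    },
}

print(ask_persona(persona, "What is OVOS?"))
print(ask_persona(persona, "tell me a joke"))
```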

develop solver plugins
provide default personas via OPM

solvers of interest:

[x] ovos-solver-plugin-openai-persona it's OpenAI chatGPT!
- [x] LocalAI compatible
- [x] https://github.com/OpenVoiceOS/ovos-persona-server compatible
[x] ovos-solver-personal-llm - use LLMs to ask questions about user defined facts
[x] ovos-solver-plugin-llmcpp <- wraps llama.cpp like chat interface via subprocess, compatible with several ggml models (bloomz, gpt4all, alpaca.cpp) - runs on cpu
[x] wordnet from ovos-classifiers - basic dictionary, answers "what is" questions fully offline
[x] ovos-solver-failure-plugin <- fallback-unknown equivalent, ensures an answer if all solvers fail

knowledge base solvers:

[x] https://github.com/OpenVoiceOS/skill-ovos-ddg <- includes a solver plugin
[x] https://github.com/OpenVoiceOS/skill-ovos-wikipedia <- includes a solver plugin
[x] https://github.com/OpenVoiceOS/skill-ovos-wolfie <- includes a solver plugin

chatbot like solvers:

[x] https://github.com/OpenVoiceOS/ovos-solver-plugin-rivescript <- user scriptable responses (easy)
[x] https://github.com/OpenVoiceOS/ovos-solver-plugin-aiml <- user scriptable responses
[x] https://github.com/OpenVoiceOS/ovos-solver-pandorabots-plugin

LLM solvers:

[ ] gpt4all
[ ] rwkv.cpp
[x] ovos-solver-plugin-llamacpp <- python bindings for llama.cpp, LLM on the cpu! unstable since llama.cpp changes so fast
[ ] generic hugging face LLM solver, ship personas for specific models
- [x] ovos-solver-plugin-dialogpt <- finetune models, eg, via tweets or reddit datasets
  - [ ] subclass from generic hugging face LLM solver + provide personas
- [ ] ovos-solver-plugin-personagpt - https://github.com/illidanlab/personaGPT
- [ ] ovos-solver-plugin-alpacalora - https://github.com/tloen/alpaca-lora
- [ ] ....

Solver documentation

A plugin can define the language it works in, eg, wolfram alpha only accepts english input at the time of this writing

Bidirectional translation will be handled behind the scenes for other languages
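
A minimal sketch of what "bidirectional translation behind the scenes" could look like: the query is translated into the solver's language, and the answer is translated back into the user's language. The toy dictionary translator and the names `with_translation`, `toy_translate` and `english_only_solver` are hypothetical stand-ins, a real setup would use a translation plugin.

```python
# Hypothetical sketch: a toy lookup table stands in for a real translation
# plugin, and the solver is a plain function instead of a plugin class.
TOY_DICTIONARY = {
    ("pt", "en"): {"quem és tu": "who are you"},
    ("en", "pt"): {"i am a bot": "eu sou um bot"},
}


def toy_translate(text, source, target):
    """Look the text up in the toy dictionary, return it unchanged if unknown."""
    return TOY_DICTIONARY.get((source, target), {}).get(text.lower(), text)


def english_only_solver(query):
    """A solver that only understands and answers in english."""
    return {"who are you": "i am a bot"}.get(query.lower())


def with_translation(solver, solver_lang, translate):
    """Wrap a single-language solver so it can be queried in any language."""
    def spoken_answer(query, lang=None):
        lang = lang or solver_lang
        if lang != solver_lang:
            query = translate(query, source=lang, target=solver_lang)
        answer = solver(query)
        if answer is not None and lang != solver_lang:
            answer = translate(answer, source=solver_lang, target=lang)
        return answer
    return spoken_answer


ask = with_translation(english_only_solver, "en", toy_translate)
print(ask("who are you"))
print(ask("Quem és tu", lang="pt"))
```

The solver itself never sees anything but english, which is why plugin authors only need to implement the language specific methods.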

Developing a solver:

Plugins are expected to implement the get_xxx methods and leave the user facing equivalents alone

from ovos_plugin_manager.templates.solvers import QuestionSolver

class MySolver(QuestionSolver):
    enable_tx = False  # if True enables bidirectional translation
    priority = 100

    def __init__(self, config=None):
        config = config or {}
        # set the "internal" language, defined by dev, not user
        # this plugin internally only accepts and outputs english
        config["lang"] = "en"
        super().__init__(config)

    # expected solver methods to be implemented
    def get_data(self, query, context=None):
        """
        query assured to be in self.default_lang
        return a dict response
        """
        return {"error": "404 answer not found"}

    def get_image(self, query, context=None):
        """
        query assured to be in self.default_lang
        return path/url to a single image to accompany spoken_answer
        """
        return "http://stock.image.jpg"

    def get_spoken_answer(self, query, context=None):
        """
        query assured to be in self.default_lang
        return a single sentence text response
        """
        return "The full answer is XXX"

    def get_expanded_answer(self, query, context=None):
        """
        query assured to be in self.default_lang
        return a list of ordered steps to expand the answer, eg, "tell me more"

        {
            "title": "optional",
            "summary": "speak this",
            "img": "optional/path/or/url"
        }
        :return:
        """
        steps = [
            {"title": "the question", "summary": "we forgot the question", "img": "404.jpg"},
            {"title": "the answer", "summary": "but the answer is 42", "img": "42.jpg"}
        ]
        return steps

Using a solver:

solvers work with any language as long as you stick to the officially supported wrapper methods

    # user facing methods, user should only be calling these
    def search(self, query, context=None, lang=None):
        """
        cache and auto translate query if needed
        returns translated response from self.get_data
        """

    def visual_answer(self, query, context=None, lang=None):
        """
        cache and auto translate query if needed
        returns image that answers query
        """

    def spoken_answer(self, query, context=None, lang=None):
        """
        cache and auto translate query if needed
        returns chunked and translated response from self.get_spoken_answer
        """

    def long_answer(self, query, context=None, lang=None):
        """
        return a list of ordered steps to expand the answer, eg, "tell me more"
        translated response from self.get_expanded_answer
        {
            "title": "optional",
            "summary": "speak this",
            "img": "optional/path/or/url"
        }
        :return:
        """

Example Usage - DuckDuckGo plugin

single answer

from skill_ovos_ddg import DuckDuckGoSolver

d = DuckDuckGoSolver()

query = "who is Isaac Newton"

# full answer
ans = d.spoken_answer(query)
print(ans)
# Sir Isaac Newton was an English mathematician, physicist, astronomer, alchemist, theologian, and author widely recognised as one of the greatest mathematicians and physicists of all time and among the most influential scientists. He was a key figure in the philosophical revolution known as the Enlightenment. His book Philosophiæ Naturalis Principia Mathematica, first published in 1687, established classical mechanics. Newton also made seminal contributions to optics, and shares credit with German mathematician Gottfried Wilhelm Leibniz for developing infinitesimal calculus. In the Principia, Newton formulated the laws of motion and universal gravitation that formed the dominant scientific viewpoint until it was superseded by the theory of relativity.

chunked answer, for conversational dialogs, ie "tell me more"

from skill_ovos_ddg import DuckDuckGoSolver

d = DuckDuckGoSolver()

query = "who is Isaac Newton"

# chunked answer
for sentence in d.long_answer(query):
    print(sentence["title"])
    print(sentence["summary"])
    print(sentence.get("img"))

    # who is Isaac Newton
    # Sir Isaac Newton was an English mathematician, physicist, astronomer, alchemist, theologian, and author widely recognised as one of the greatest mathematicians and physicists of all time and among the most influential scientists.
    # https://duckduckgo.com/i/ea7be744.jpg

    # who is Isaac Newton
    # He was a key figure in the philosophical revolution known as the Enlightenment.
    # https://duckduckgo.com/i/ea7be744.jpg

    # who is Isaac Newton
    # His book Philosophiæ Naturalis Principia Mathematica, first published in 1687, established classical mechanics.
    # https://duckduckgo.com/i/ea7be744.jpg

    # who is Isaac Newton
    # Newton also made seminal contributions to optics, and shares credit with German mathematician Gottfried Wilhelm Leibniz for developing infinitesimal calculus.
    # https://duckduckgo.com/i/ea7be744.jpg

    # who is Isaac Newton
    # In the Principia, Newton formulated the laws of motion and universal gravitation that formed the dominant scientific viewpoint until it was superseded by the theory of relativity.
    # https://duckduckgo.com/i/ea7be744.jpg

Auto translation, pass user language in context

from skill_ovos_ddg import DuckDuckGoSolver

d = DuckDuckGoSolver()

# bidirectional auto translate by passing lang context
sentence = d.spoken_answer("Quem é Isaac Newton",
                           context={"lang": "pt"})
print(sentence)
# Sir Isaac Newton foi um matemático inglês, físico, astrônomo, alquimista, teólogo e autor amplamente reconhecido como um dos maiores matemáticos e físicos de todos os tempos e entre os cientistas mais influentes. Ele era uma figura chave na revolução filosófica conhecida como o Iluminismo. Seu livro Philosophiæ Naturalis Principia Mathematica, publicado pela primeira vez em 1687, estabeleceu a mecânica clássica. Newton também fez contribuições seminais para a óptica, e compartilha crédito com o matemático alemão Gottfried Wilhelm Leibniz para desenvolver cálculo infinitesimal. No Principia, Newton formulou as leis do movimento e da gravitação universal que formaram o ponto de vista científico dominante até ser superado pela teoria da relatividade

Server

(during ovos-core 0.0.9 dev cycle)

Why

some personas won't be able to run fully on device due to hardware constraints, LLMs in particular

How

[x] persona/solver server - https://github.com/OpenVoiceOS/ovos-persona-server
[ ] ovos-solver-server-plugin
[ ] start public server list of hosted personas
[x] start public server list of translate plugin for usage by solvers (NLLB by default)
[x] HiveMind integration (replaces ovos-core) https://github.com/JarbasHiveMind/hivemind-persona

Marketplace

(during ovos-core 0.0.9 dev cycle)

Why

users should be able to one click select a persona and have a nice UI

users should have a way to evaluate how useful each persona is before installing it

How

persona marketplace, with .json definitions of personas
persona downloader UI, maybe integrated with the upcoming skill installer UI
- ggwave persona installer
automated evaluation of personas - see https://chat.openai.com/share/f186591f-ff41-4d53-a1f6-819f0b00b03d
- use that chat as conversation history + this prompt
```
the contest has started, give a score to the following answer

domain: Natural Language Understanding (NLU)
question: Can you explain what the term 'cryptocurrency' means?
answer: it's like money, but uses mathematical magic called cryptography instead of coming from the banks
```
github automation on PRs submitting a persona

Skill Dialogs

(during ovos-core 0.0.9 dev cycle)

Why

Skills with personalities and flexible dialogs!

in some languages TTS utterances may depend on the gender of the person listening
- eg. in portuguese there is no way to say "you are beautiful" that is neutral to the listener's gender, you say "tu és lindo"/"tu és linda"
- can be detected in ovos-dinkum-listener via audio transformer plugins
in some languages TTS utterances may depend on the gender of the speaker
- eg. in portuguese there is no way to say "thank you" that is neutral to the speaker's gender, you say "obrigado"/"obrigada"
- this will be a setting of the persona
personality settings
- "increase sarcasm by 20%"

How

New file format, .jsonl

jsonl format info: https://jsonlines.org/

{"utterance": "stick the head out of the window and check it yourself", "attitude": "mean", "weight": 0.1}
{"utterance": "current weather is X", "attitude": "helpful", "weight": 0.9}

1 - load .jsonl file if it exists, else old .dialog file
2 - select an attitude based on weights defined in mycroft.conf / current active persona
3 - filter samples per attitudes
4 - select based on weights of .jsonl file
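
Steps 2 to 4 above could be sketched as follows, as a heuristic selector only. The function name `select_dialog` is hypothetical, file loading (step 1) is left out, and restricting the choice to attitudes that actually have samples in the file is an assumption not stated above.

```python
import json
import random

# the two sample lines from the .jsonl example above
SAMPLES = """
{"utterance": "stick the head out of the window and check it yourself", "attitude": "mean", "weight": 0.1}
{"utterance": "current weather is X", "attitude": "helpful", "weight": 0.9}
"""


def select_dialog(jsonl_text, attitudes, rng=random):
    """Pick an utterance: first an attitude by persona weight, then a sample."""
    samples = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]

    # step 2: select an attitude based on the persona weights
    # (assumption: only attitudes that have samples in this file are considered)
    available = sorted({s["attitude"] for s in samples})
    weights = [attitudes.get(a, 0) for a in available]
    if sum(weights) <= 0:
        return None  # persona has no usable attitude for this dialog
    attitude = rng.choices(available, weights=weights, k=1)[0]

    # step 3: filter samples for the chosen attitude
    candidates = [s for s in samples if s["attitude"] == attitude]

    # step 4: select based on the per-sample weights from the file
    sample_weights = [s.get("weight", 1.0) for s in candidates]
    if sum(sample_weights) <= 0:
        sample_weights = None  # fall back to a uniform choice
    return rng.choices(candidates, weights=sample_weights, k=1)[0]["utterance"]


print(select_dialog(SAMPLES, {"helpful": 100, "mean": 0}))
```

With the "mean" attitude weighted at 0 the helpful utterance is always chosen; raising that weight makes the sarcastic reply show up some of the time.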

"persona": {
    "gender": "male",
    "attitudes": {
        "normal": 100,
        "funny": 70,
        "sarcastic": 10,
        "irritable": 0
    }
}

[ ] define and document the new dialog file format
[ ] add basic support for all official skills
[ ] dialog_selector plugin class that takes these files as input
- [ ] make a LLM plugin with a dedicated prompt to parse these files and select final dialog
- [ ] default dialog selector should be heuristic

original issue https://github.com/OpenVoiceOS/OVOS-workshop/issues/56

Dialog and TTS Transformers

Why

skills won't cover every personality, the previous technique works to change the dialog content, but as a persona we also want to change the dialog style

a persona should be able to mutate the text before TTS, and also to modify the audio after TTS

utt = "Quantum mechanics is a branch of physics that describes the behavior of particles at the smallest scales. " \
    "It involves principles such as superposition, where particles can exist in multiple states simultaneously, " \
    "and entanglement, where particles become connected and can influence each other's properties."
print(lovecraftify(utt))
# Quantum mechanics unveils the eldritch secrets of the infinitesimal realm,
# where particles, ensnared in the web of superposition, dwell in manifold states.
# Through the dread phenomenon of entanglement, these entities intertwine,
# their very essence entwined, shaping the fabric of reality.
print(dudeify(utt))
# Quantum mechanics is like, the raddest branch of physics, dude.
# It's all about particles at the tiniest scales, doing crazy stuff like
# being in multiple states at once (superposition) and getting all connected and influencing each other (entanglement).
print(eli5(utt))
# Quantum mechanics is like a special kind of science that helps us understand really tiny things.
# It tells us that these tiny things can be in more than one place at the same time,
# and they can also be connected to each other and affect each other's behavior.

How

under persona json

{
    "dialog_transformers": {
        "ovos-dialog-transformer-openai": {
            "key": "xxxxx",
            "api_url": "https://api.openai.com/v1",
            "rewrite_prompt": "Add more 'dude'ness to"
        }
    },
    "tts_transformers": {
        "ovos-tts-transformer-sox": {
            "default_effects": {
                "pitch": {"n_semitones": int}
            }
        }
    }
}
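
A rough sketch of how a config like the one above might drive a chain of dialog transformers before TTS. The transformer names and the `TRANSFORMERS` registry are hypothetical, and the simple string functions stand in for real plugins such as an LLM based rewriter.

```python
# Hypothetical sketch: these transformers are trivial string functions used
# for illustration, real ones would be plugins loaded by name.
def add_suffix(text, config):
    """Append a persona flavoured suffix to the utterance."""
    return text.rstrip(".!?") + config.get("suffix", ", dude") + "."


def shout(text, config):
    """Upper-case the whole utterance."""
    return text.upper()


TRANSFORMERS = {
    "hypothetical-dude-transformer": add_suffix,
    "hypothetical-shout-transformer": shout,
}


def apply_dialog_transformers(text, persona, registry=TRANSFORMERS):
    """Run the text through every configured transformer, in config order."""
    for name, config in persona.get("dialog_transformers", {}).items():
        transformer = registry.get(name)
        if transformer is not None:
            text = transformer(text, config)
    return text


persona = {
    "dialog_transformers": {
        "hypothetical-dude-transformer": {"suffix": ", dude"},
        "hypothetical-shout-transformer": {},
    }
}

print(apply_dialog_transformers("The weather is nice.", persona))
```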

OpenVoiceOS / ovos-persona

The Persona Sprint #4
