OpenVoiceOS / ovos-persona

Persona service
Apache License 2.0

The Persona Sprint #4

Open JarbasAl opened 9 months ago

JarbasAl commented 9 months ago

we hit our stretch goal for persona!

this issue will document the progress

Framework

(during ovos-core 0.0.9 dev cycle)

Why

Voice interface

How

code lives here: https://github.com/OpenVoiceOS/ovos-persona

Session

(during ovos-core 0.0.8 dev cycle)

Why

consider the utterance `"chat with {persona_name}"`: it makes the assistant answer all questions using the requested persona. in a multi user setup, if the persona is a global setting, it applies to every individual client

what happens when different users are accessing core? whenever someone changes the persona it changes for everyone!

image: a typical hivemind setup, we want each client to interact with a specific persona

How

make the persona configuration come from the message.context so it is per query, not global. this is defined via a [Session](). in the future this also allows OVOS to become user aware per query
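
a minimal sketch of the idea, not the ovos-bus-client implementation: the `Message` class below is a stand-in, and the `"persona"` key inside the session is an assumption made for illustration. the point is only that the persona is looked up from the context of each message, so two clients never overwrite each other.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    """Stand-in for a bus message; only the fields this sketch needs."""
    msg_type: str
    data: dict = field(default_factory=dict)
    context: dict = field(default_factory=dict)


# hypothetical fallback used when a message carries no session
DEFAULT_PERSONA = "default"


def persona_for(message: Message) -> str:
    """Resolve the persona for this single query from message.context."""
    session = message.context.get("session", {})
    # "persona" as a session key is an assumption of this sketch
    return session.get("persona", DEFAULT_PERSONA)


# two clients talk to the same core, each carrying its own session
alice = Message("recognizer_loop:utterance",
                {"utterances": ["what is the weather"]},
                {"session": {"session_id": "alice", "persona": "pirate"}})
bob = Message("recognizer_loop:utterance",
              {"utterances": ["what is the weather"]},
              {"session": {"session_id": "bob", "persona": "professor"}})
anonymous = Message("recognizer_loop:utterance", {"utterances": ["hello"]})

print(persona_for(alice))      # pirate
print(persona_for(bob))        # professor
print(persona_for(anonymous))  # default
```

because nothing is stored globally, changing the persona for one session leaves every other client untouched.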

the following properties in OVOSSkill should reflect session data live, so skills need no explicit handling of Sessions at all:

session implementation:

footnote: this is the reason ovos-bus-client was created instead of OVOS still using mycroft-bus-client

Pipeline

(during ovos-core 0.0.8 dev cycle)

Why

consider LLMs and how they interact with skills/user commands, how do we know when to use a skill or when to ask an LLM? Factual info usually comes from the web, "chatbot speech" should come from a persona, do we just hardcode this in the utterance handling logic?

each persona handles questions via a selection of [solver plugins](), which can be directly implemented as a FallbackSkill for example

we want to be able to define where in the intent stage this happens. this also allows the pluginification of the intent systems, and eventually even those can be replaced with an LLM if desired, giving a persona bias even to intent selection

image: intent service (in green) should be configurable, globally and per persona
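
a rough sketch of what a configurable pipeline could look like, assuming each stage is a callable that either answers or passes. the stage names, the `STAGES` registry and the `handle` function are all hypothetical and only illustrate the ordering idea, they are not the ovos-core intent service.

```python
from typing import Callable, Dict, List, Optional

# each stage takes an utterance and returns a response, or None to pass
Stage = Callable[[str], Optional[str]]


def skill_intents(utterance: str) -> Optional[str]:
    # stand-in for regular intent matching
    if "weather" in utterance:
        return "skill: it is sunny"
    return None


def persona_stage(utterance: str) -> Optional[str]:
    # stand-in for the persona answering via its solver plugins
    if utterance.endswith("?"):
        return "persona: let me think about that"
    return None


def fallback(utterance: str) -> Optional[str]:
    # last resort, always answers
    return "fallback: I don't understand"


# hypothetical registry mapping stage names to handlers
STAGES: Dict[str, Stage] = {
    "intents": skill_intents,
    "persona": persona_stage,
    "fallback": fallback,
}


def handle(utterance: str, pipeline: List[str]) -> Optional[str]:
    """Try each configured stage in order, stop at the first answer."""
    for name in pipeline:
        response = STAGES[name](utterance)
        if response is not None:
            return response
    return None


global_pipeline = ["intents", "persona", "fallback"]
chatty_pipeline = ["persona", "intents", "fallback"]  # per persona override

print(handle("how is the weather?", global_pipeline))  # skill: it is sunny
print(handle("how is the weather?", chatty_pipeline))  # persona: let me think about that
```

the same utterance gets a different handler just by reordering a list, which is what makes the behaviour configurable instead of hardcoded.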

How

Solvers

(during ovos-core 0.0.9 dev cycle)

Why

once added to the pipeline, personas need to be able to answer arbitrary questions, they also need to handle input in multiple languages

persona definitions include a list of solver plugins and their respective configs. an utterance is sent to each solver in turn until one can answer the question.

"persona": {
    "solvers": [
        "ovos-solver-plugin-llamacpp",
        "ovos-solver-plugin-personagpt",
        "ovos-solver-failure-plugin"
    ],
    "ovos-solver-plugin-llamacpp": {
        "persona": "helpful, creative, clever, and very friendly"
    },
    "ovos-solver-plugin-personagpt": {
        "facts": [
            "i am a quiet engineer.",
            "i'm single and am looking for love.",
            "sadly, i don't have any relatable hobbies.",
            "luckily, however, i am tall and athletic.",
            "on friday nights, i watch re-runs of the simpsons alone."
        ]
    }
}

How

solver plugins have automatic bidirectional translation so they can understand and answer in any language, even if the implementation is language specific

the persona definition specifies solver configs and the order in which they are tried
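
a minimal sketch of "try solvers in order until one answers", assuming a solver returns None when it has no answer. the two solver classes, the plugin names and the `AVAILABLE` registry are hypothetical stand-ins, real plugins would be loaded through the plugin manager.

```python
class CannedFactsSolver:
    """Hypothetical solver that only knows a few canned facts."""

    def __init__(self, config=None):
        self.facts = (config or {}).get("facts", {})

    def spoken_answer(self, query, context=None, lang=None):
        # None signals "cannot answer, try the next solver"
        return self.facts.get(query.lower())


class FailureSolver:
    """Hypothetical last-resort solver that always answers."""

    def __init__(self, config=None):
        self.config = config or {}

    def spoken_answer(self, query, context=None, lang=None):
        return "I don't know"


# hypothetical registry standing in for plugin discovery
AVAILABLE = {
    "hypothetical-facts-solver": CannedFactsSolver,
    "hypothetical-failure-solver": FailureSolver,
}

persona = {
    "solvers": ["hypothetical-facts-solver", "hypothetical-failure-solver"],
    "hypothetical-facts-solver": {"facts": {"what is the answer": "42"}},
}


def load_solvers(persona):
    """Instantiate solvers in the listed order, each with its own config."""
    return [AVAILABLE[name](persona.get(name, {}))
            for name in persona["solvers"]]


def ask(persona, query):
    """Return the first non-empty answer, or None if every solver passes."""
    for solver in load_solvers(persona):
        answer = solver.spoken_answer(query)
        if answer:
            return answer
    return None


print(ask(persona, "What is the answer"))  # 42
print(ask(persona, "who are you"))         # I don't know
```

putting a catch-all solver last, like the failure plugin in the persona definition above, guarantees the persona always says something.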

solvers of interest:

knowledge base solvers:

chatbot like solvers:

LLM solvers:

Solver documentation

A plugin can define the language it works in, e.g. Wolfram Alpha only accepts English input at the time of writing

Bidirectional translation will be handled behind the scenes for other languages
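
A rough sketch of what "behind the scenes" could mean, assuming the user-facing method translates the query into the solver's internal language and translates the answer back. The lookup table is a toy stand-in for a real translation plugin, and `EnglishOnlySolver` and `toy_translate` are hypothetical names, not the ovos-plugin-manager implementation.

```python
# toy lookup table standing in for a real translation plugin
TOY_TRANSLATIONS = {
    ("pt", "en"): {"qual é a resposta": "what is the answer"},
    ("en", "pt"): {"the answer is 42": "a resposta é 42"},
}


def toy_translate(text, source, target):
    """Translate via the toy table, returning text unchanged if unknown."""
    if source == target:
        return text
    table = TOY_TRANSLATIONS.get((source, target), {})
    return table.get(text.lower(), text)


class EnglishOnlySolver:
    internal_lang = "en"  # language the implementation understands

    def get_spoken_answer(self, query, context=None):
        # plugin code: only ever sees queries in internal_lang
        if query == "what is the answer":
            return "the answer is 42"
        return None

    def spoken_answer(self, query, context=None, lang=None):
        # user facing wrapper: translate in, solve, translate out
        user_lang = lang or (context or {}).get("lang") or self.internal_lang
        internal_query = toy_translate(query, user_lang, self.internal_lang)
        answer = self.get_spoken_answer(internal_query, context)
        if answer is None:
            return None
        return toy_translate(answer, self.internal_lang, user_lang)


solver = EnglishOnlySolver()
print(solver.spoken_answer("what is the answer"))             # the answer is 42
print(solver.spoken_answer("Qual é a resposta", lang="pt"))   # a resposta é 42
```

The plugin author only writes `get_spoken_answer`; the wrapper is what makes the same solver usable from any language.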

Developing a solver:

Plugins are expected to implement the get_xxx methods and leave the user facing equivalents alone

from ovos_plugin_manager.templates.solvers import QuestionSolver

class MySolver(QuestionSolver):
    enable_tx = False  # if True enables bidirectional translation
    priority = 100

    def __init__(self, config=None):
        config = config or {}
        # set the "internal" language, defined by dev, not user
        # this plugin internally only accepts and outputs english
        config["lang"] = "en"
        super().__init__(config)

    # expected solver methods to be implemented
    def get_data(self, query, context=None):
        """
        query assured to be in self.default_lang
        return a dict response
        """
        return {"error": "404 answer not found"}

    def get_image(self, query, context=None):
        """
        query assured to be in self.default_lang
        return path/url to a single image to accompany spoken_answer
        """
        return "http://stock.image.jpg"

    def get_spoken_answer(self, query, context=None):
        """
        query assured to be in self.default_lang
        return a single sentence text response
        """
        return "The full answer is XXX"

    def get_expanded_answer(self, query, context=None):
        """
        query assured to be in self.default_lang
        return a list of ordered steps to expand the answer, eg, "tell me more"

        {
            "title": "optional",
            "summary": "speak this",
            "img": "optional/path/or/url"
        }
        :return:
        """
        steps = [
            {"title": "the question", "summary": "we forgot the question", "img": "404.jpg"},
            {"title": "the answer", "summary": "but the answer is 42", "img": "42.jpg"}
        ]
        return steps

Using a solver:

solvers work with any language as long as you stick to the officially supported wrapper methods

    # user facing methods, user should only be calling these
    def search(self, query, context=None, lang=None):
        """
        cache and auto translate query if needed
        returns translated response from self.get_data
        """

    def visual_answer(self, query, context=None, lang=None):
        """
        cache and auto translate query if needed
        returns image that answers query
        """

    def spoken_answer(self, query, context=None, lang=None):
        """
        cache and auto translate query if needed
        returns chunked and translated response from self.get_spoken_answer
        """

    def long_answer(self, query, context=None, lang=None):
        """
        return a list of ordered steps to expand the answer, eg, "tell me more"
        translated response from self.get_expanded_answer
        {
            "title": "optional",
            "summary": "speak this",
            "img": "optional/path/or/url"
        }
        :return:
        """

Example Usage - DuckDuckGo plugin

single answer

from skill_ovos_ddg import DuckDuckGoSolver

d = DuckDuckGoSolver()

query = "who is Isaac Newton"

# full answer
ans = d.spoken_answer(query)
print(ans)
# Sir Isaac Newton was an English mathematician, physicist, astronomer, alchemist, theologian, and author widely recognised as one of the greatest mathematicians and physicists of all time and among the most influential scientists. He was a key figure in the philosophical revolution known as the Enlightenment. His book Philosophiæ Naturalis Principia Mathematica, first published in 1687, established classical mechanics. Newton also made seminal contributions to optics, and shares credit with German mathematician Gottfried Wilhelm Leibniz for developing infinitesimal calculus. In the Principia, Newton formulated the laws of motion and universal gravitation that formed the dominant scientific viewpoint until it was superseded by the theory of relativity.

chunked answer, for conversational dialogs, ie "tell me more"

from skill_ovos_ddg import DuckDuckGoSolver

d = DuckDuckGoSolver()

query = "who is Isaac Newton"

# chunked answer
for sentence in d.long_answer(query):
    print(sentence["title"])
    print(sentence["summary"])
    print(sentence.get("img"))

    # who is Isaac Newton
    # Sir Isaac Newton was an English mathematician, physicist, astronomer, alchemist, theologian, and author widely recognised as one of the greatest mathematicians and physicists of all time and among the most influential scientists.
    # https://duckduckgo.com/i/ea7be744.jpg

    # who is Isaac Newton
    # He was a key figure in the philosophical revolution known as the Enlightenment.
    # https://duckduckgo.com/i/ea7be744.jpg

    # who is Isaac Newton
    # His book Philosophiæ Naturalis Principia Mathematica, first published in 1687, established classical mechanics.
    # https://duckduckgo.com/i/ea7be744.jpg

    # who is Isaac Newton
    # Newton also made seminal contributions to optics, and shares credit with German mathematician Gottfried Wilhelm Leibniz for developing infinitesimal calculus.
    # https://duckduckgo.com/i/ea7be744.jpg

    # who is Isaac Newton
    # In the Principia, Newton formulated the laws of motion and universal gravitation that formed the dominant scientific viewpoint until it was superseded by the theory of relativity.
    # https://duckduckgo.com/i/ea7be744.jpg

Auto translation, pass user language in context

from skill_ovos_ddg import DuckDuckGoSolver

d = DuckDuckGoSolver()

# bidirectional auto translate by passing lang context
sentence = d.spoken_answer("Quem é Isaac Newton",
                           context={"lang": "pt"})
print(sentence)
# Sir Isaac Newton foi um matemático inglês, físico, astrônomo, alquimista, teólogo e autor amplamente reconhecido como um dos maiores matemáticos e físicos de todos os tempos e entre os cientistas mais influentes. Ele era uma figura chave na revolução filosófica conhecida como o Iluminismo. Seu livro Philosophiæ Naturalis Principia Mathematica, publicado pela primeira vez em 1687, estabeleceu a mecânica clássica. Newton também fez contribuições seminais para a óptica, e compartilha crédito com o matemático alemão Gottfried Wilhelm Leibniz para desenvolver cálculo infinitesimal. No Principia, Newton formulou as leis do movimento e da gravitação universal que formaram o ponto de vista científico dominante até ser superado pela teoria da relatividade

Server

(during ovos-core 0.0.9 dev cycle)

Why

some personas won't be able to run fully on device due to hardware constraints, LLMs in particular

How

Marketplace

(during ovos-core 0.0.9 dev cycle)

Why

users should be able to one click select a persona and have a nice UI

users should have a way to evaluate how useful each persona is before installing it

How

Skill Dialogs

(during ovos-core 0.0.9 dev cycle)

Why

Skills with personalities and flexible dialogs!

How

New file format, .jsonl

jsonl format info: https://jsonlines.org/

{"utterance": "stick the head out of the window and check it yourself", "attitude": "mean", "weight": 0.1}
{"utterance": "current weather is X", "attitude": "helpful", "weight": 0.9}

"persona": {
    "gender": "male",
    "attitudes": {
        "normal": 100,
        "funny": 70,
        "sarcastic": 10,
        "irritable": 0
    }
}
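
a hedged sketch of how a skill could pick a line from such a .jsonl file. the scoring rule (line weight multiplied by the persona's score for that attitude) is an assumption of this sketch, the issue does not specify how the two are combined, and `load_dialog` / `pick_utterance` are hypothetical helpers.

```python
import json
import random

# the two dialog lines from the example above, as they would sit in a file
DIALOG_JSONL = """\
{"utterance": "stick the head out of the window and check it yourself", "attitude": "mean", "weight": 0.1}
{"utterance": "current weather is X", "attitude": "helpful", "weight": 0.9}
"""


def load_dialog(text):
    """Parse JSON Lines: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def pick_utterance(lines, attitudes, rng=random):
    """Weighted random choice; attitudes missing from the persona score 0."""
    # assumed rule: line weight * persona score for the line's attitude
    scores = [line.get("weight", 1.0) * attitudes.get(line.get("attitude"), 0)
              for line in lines]
    if not any(scores):
        # nothing matches this persona, fall back to a uniform choice
        return rng.choice(lines)["utterance"]
    return rng.choices(lines, weights=scores, k=1)[0]["utterance"]


lines = load_dialog(DIALOG_JSONL)

# a persona that is never mean can only produce the helpful line
polite = {"helpful": 100, "mean": 0}
print(pick_utterance(lines, polite))  # current weather is X
```

with non-zero scores for several attitudes the result is random, so the same skill sounds different from one persona to the next without any change to the skill code.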

original issue https://github.com/OpenVoiceOS/OVOS-workshop/issues/56

Dialog and TTS Transformers

Why

skills won't cover every personality, the previous technique works to change the dialog content, but as a persona we also want to change the dialog style

a persona should be able to mutate the text before TTS, and also to modify the audio after TTS

utt = "Quantum mechanics is a branch of physics that describes the behavior of particles at the smallest scales. " \
    "It involves principles such as superposition, where particles can exist in multiple states simultaneously, " \
    "and entanglement, where particles become connected and can influence each other's properties."
print(lovecraftify(utt))
# Quantum mechanics unveils the eldritch secrets of the infinitesimal realm,
# where particles, ensnared in the web of superposition, dwell in manifold states.
# Through the dread phenomenon of entanglement, these entities intertwine,
# their very essence entwined, shaping the fabric of reality.
print(dudeify(utt))
# Quantum mechanics is like, the raddest branch of physics, dude.
# It's all about particles at the tiniest scales, doing crazy stuff like
# being in multiple states at once (superposition) and getting all connected and influencing each other (entanglement).
print(eli5(utt))
# Quantum mechanics is like a special kind of science that helps us understand really tiny things.
# It tells us that these tiny things can be in more than one place at the same time,
# and they can also be connected to each other and affect each other's behavior.

How

under persona json

{
    "dialog_transformers": {
        "ovos-dialog-transformer-openai": {
            "key": "xxxxx",
            "api_url": "https://api.openai.com/v1",
            "rewrite_prompt": "Add more 'dude'ness to"
        }
    },
    "tts_transformers": {
        "ovos-tts-transformer-sox": {
            "default_effects": {
                "pitch": {"n_semitones": int}
            }
        }
    }
}
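
a minimal sketch of the two hooks, assuming dialog transformers map text to text before TTS and TTS transformers map audio to audio afterwards. every class here, `fake_tts` and `speak` are hypothetical stand-ins (audio is just a list of numbers), none of this is the real plugin API.

```python
class DudeTransformer:
    """Hypothetical dialog transformer: rewrites the text before TTS."""

    def __init__(self, config=None):
        self.suffix = (config or {}).get("suffix", ", dude")

    def transform(self, text):
        return text.rstrip(".") + self.suffix + "."


class GainTransformer:
    """Hypothetical TTS transformer: modifies the audio after TTS."""

    def __init__(self, config=None):
        self.gain = (config or {}).get("gain", 2)

    def transform(self, samples):
        return [s * self.gain for s in samples]


def fake_tts(text):
    # stand-in for a TTS engine: one "sample" per character
    return [ord(c) for c in text]


def speak(text, dialog_transformers, tts_transformers):
    """Run text through the dialog chain, synthesize, then the audio chain."""
    for transformer in dialog_transformers:
        text = transformer.transform(text)
    audio = fake_tts(text)
    for transformer in tts_transformers:
        audio = transformer.transform(audio)
    return text, audio


text, audio = speak("hello.",
                    [DudeTransformer()],
                    [GainTransformer({"gain": 2})])
print(text)        # hello, dude.
print(len(audio))  # 12
```

both chains are plain ordered lists, so a persona can stack several transformers and each one only needs to know about its own input and output.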

mikejgray commented 9 months ago

It would be very cool to have support for TTS per persona, as well as wakeword per persona.