paulbricman / dual-obsidian-client

A skilled virtual assistant for Obsidian.
https://paulbricman.com/thoughtware/dual
Mozilla Public License 2.0

Proposal for functionality changes and recipe framework design #45

Closed paulbricman closed 3 years ago

paulbricman commented 3 years ago

Progress towards solving existing issues and setting up a proper roadmap has been slowed in the past few days by the fear of prematurely settling on an architecture and API design, given that the space of conversational interfaces over personal knowledge bases is largely unexplored.

The following is a proposal for heavily restructuring both the functionality and the codebase, tentatively something in between a spec and a user story.

Architecture

Dual is based on two components: the backend and the frontend. The backend is a server which exposes two main endpoints, /extract and /generate, both of which are used in the recipes below.

However, the user doesn't usually interact with the endpoints directly. Rather, they use recipes. Recipes tell Dual how to answer certain commands. They can be predefined, user defined, or contributed by some other user. Recipes are simple Markdown files with the following structure:

---
tags: "#dualrecipe"
pattern: "What is the answer to the ultimate question of life, the universe, and everything?"
---

42, naturally.

If the user has this recipe in their vault as a note, then whenever they ask their Dual that question, they'll get the contents of the note as an answer.
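To make the file structure concrete, here is a minimal sketch of how a recipe note might be split into its front matter fields and its body. The function name `parse_recipe` and the simple `key: "value"` parsing are assumptions for illustration; a real implementation would more likely rely on a YAML parser.

```python
def parse_recipe(text):
    """Split a recipe note into its front matter fields and its body."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("recipe must start with a front matter block")
    # Find the line that closes the front matter.
    end = next(i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---")
    fields = {}
    for line in lines[1:end]:
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        # Drop surrounding whitespace and the optional double quotes.
        fields[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:]).strip()
    return fields, body


RECIPE = '''---
tags: "#dualrecipe"
pattern: "What is the answer to the ultimate question of life, the universe, and everything?"
---

42, naturally.
'''

fields, body = parse_recipe(RECIPE)
print(fields["tags"])  # the tag that marks the note as a recipe
print(body)            # the answer Dual would give
```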

The pattern field of a recipe is a regex pattern. It can also house groups, which can then be referenced in the content.

---
tags: "#dualrecipe"
pattern: "My name is (.*)"
---

Hi there, \1!

With this recipe, if the user tells their Dual "My name is John", it'll reply with "Hi there, John!".
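A rough sketch of that matching step, using Python's standard `re` module. The helper name `answer` and the choice of `re.fullmatch` (the whole command must match) are assumptions; the proposal doesn't say whether partial matches should count.

```python
import re


def answer(pattern, body, command):
    """Return the recipe body with group references filled in, or None."""
    match = re.fullmatch(pattern, command)
    if match is None:
        return None
    # Match.expand substitutes \1, \2, ... with the captured groups.
    return match.expand(body)


print(answer(r"My name is (.*)", r"Hi there, \1!", "My name is John"))
```

Note that `Match.expand` also interprets other backslash escapes, so a body containing stray backslashes would need a more careful substitution.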

All this is cute, but not all that useful or interesting. Among the recipes there's also this predefined recipe:

---
tags: "#dualrecipe"
pattern: "Find a note which (.*)"
---

'''dual
GET "/extract/This text \1"
'''

Now, this is good old descriptive search, expressed as a recipe which makes use of the /extract endpoint. When asked "Find a note which describes a metaphor between machine learning and sociology", Dual answers with a list of results based on the GET HTTP call made behind the scenes to the endpoint.
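As a sketch of what resolving such a block might involve, the snippet below turns the GET line into a full URL without actually sending it. The base address assumes the local backend at localhost:5000, and `build_request_url` is a hypothetical helper, not part of the codebase.

```python
import re
from urllib.parse import quote

BASE_URL = "http://localhost:5000"  # assumption: address of the local backend


def build_request_url(block_line, groups):
    """Turn the GET line of a dual block into a percent-encoded URL."""
    line = re.fullmatch(r'GET\s+"(.*)"', block_line.strip())
    if line is None:
        raise ValueError("expected a line of the form GET \"/endpoint/text\"")
    # Split the endpoint (such as /extract/) from the free-text part.
    endpoint, text = re.fullmatch(r"(/\w+/)(.*)", line.group(1)).groups()
    # Fill in \1, \2, ... from the groups captured by the recipe pattern.
    text = re.sub(r"\\(\d+)", lambda m: groups[int(m.group(1)) - 1], text)
    # Percent-encode the text so spaces and punctuation survive the URL.
    return BASE_URL + endpoint + quote(text, safe="")


url = build_request_url(r'GET "/extract/This text \1"', ["describes a metaphor"])
print(url)
```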

But if you wanted to customize the command triggers even for this predefined command, you could just wrap a new recipe around it, or change the original one. Here's a wrapper recipe:

---
tags: "#dualrecipe"
pattern: "Yo show me a thing which (.*)"
---

Here ya go:

'''dual
ASK "Find a note which \1"
'''

Cool, you just made your Dual a bit edgier.
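One way to picture the ASK mechanism is as a re-dispatch: the wrapper's body is filled in and any ASK line is fed back into the recipe lookup as a new command. The sketch below assumes an in-memory list of recipes and leaves GET lines unresolved; `RECIPES`, `fill` and `ask` are illustrative names only.

```python
import re

# Hypothetical in-memory recipes: (pattern, body lines).
RECIPES = [
    (r"Find a note which (.*)", [r'GET "/extract/This text \1"']),
    (r"Yo show me a thing which (.*)", ["Here ya go:", r'ASK "Find a note which \1"']),
]


def fill(text, match):
    """Replace \\1, \\2, ... with the groups captured from the command."""
    return re.sub(r"\\(\d+)", lambda m: match.group(int(m.group(1))), text)


def ask(command, depth=0):
    """Answer a command with the first matching recipe, following ASK lines."""
    if depth > 10:
        raise RecursionError("recipes keep asking each other")
    for pattern, body in RECIPES:
        match = re.fullmatch(pattern, command)
        if match is None:
            continue
        out = []
        for line in body:
            line = fill(line, match)
            inner = re.fullmatch(r'ASK "(.*)"', line)
            # An ASK line is replaced by the answer to the inner command.
            result = ask(inner.group(1), depth + 1) if inner else None
            out.append(result if result is not None else line)
        return "\n".join(out)
    return None


print(ask("Yo show me a thing which mentions owls"))
```

The depth limit is a design choice added here to guard against recipes that wrap each other in a loop.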

So this is how you can express good old descriptive search and fluid search as recipes. What about good old open dialogue?

---
tags: "#dualrecipe"
pattern: "^(([Ww]hy|[Ww]hat|[Ww]hen|[Ww]here|[Ww]ho|[Hh]ow).*)"
---

'''dual
GET "/extract/This text is about \1"
'''

Q: \1
A:

'''dual
GET "/generate/"
'''

Now, when you ask it a question with that structure, Dual assembles the relevant notes in there, composes the prompt further with your query, and then generates the response. Good old open dialogue, but expressed as a recipe. Every command becomes a customizable recipe.
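The order of resolution matters here: everything above the generate block becomes the prompt. A minimal sketch of that pipeline follows, with the two endpoints replaced by stand-in functions. The segment representation and the names `run_recipe`, `fake_extract` and `fake_generate` are assumptions made for the example.

```python
def run_recipe(segments, extract, generate):
    """Resolve segments in order and return the list of resolved pieces.

    Each segment is ("text", string), ("extract", query) or ("generate", None).
    """
    resolved = []
    for kind, value in segments:
        if kind == "text":
            resolved.append(value)
        elif kind == "extract":
            resolved.append(extract(value))
        elif kind == "generate":
            # Everything resolved so far is piped in as the prompt.
            resolved.append(generate("\n\n".join(resolved)))
        else:
            raise ValueError(f"unknown segment kind: {kind}")
    return resolved


prompts_seen = []


def fake_extract(query):
    # Stand-in for the /extract endpoint.
    return "Note: Sleep consolidates memories."


def fake_generate(prompt):
    # Stand-in for the /generate endpoint; records the prompt it received.
    prompts_seen.append(prompt)
    return "(generated answer)"


segments = [
    ("extract", "This text is about Why do we sleep?"),
    ("text", "Q: Why do we sleep?\nA:"),
    ("generate", None),
]
pieces = run_recipe(segments, fake_extract, fake_generate)
print(prompts_seen[0])
```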

Now say you want to teach your Dual to come up with writing prompts. You create this recipe:

---
tags: "#dualrecipe"
pattern: "^[Cc]ome up with a writing prompt\.?"
---

prompt: A sentient being has landed on your planet and your civilization's military has confronted it at the landing site of its ship. You are sent closer as a mediator and encounter a mass of energy that has no form but communicates with you in your language.

prompt: Your spaceship has landed on an unknown planet and there is data showing lifeforms who have created artistic structures. There is an artist in your group who wants to make first contact with the beings through art.

prompt: We discover that beneath its seemingly uninhabitable appearance, Mars has an entire race of subterranean alien lifeforms living on it. You are part of the team sent to explore this civilization.

prompt: 

'''dual
GET "/generate/"
'''

You ask it "Come up with a writing prompt" and you get one in return.

Sure, there are technicalities. The note contents up to the generate call should be piped into it as the prompt. The endpoints are shorthand for localhost:5000/..., but you could perhaps change them to refer to a hosted instance at some point in the future. You could make calls to other people's instances through recipes. You could tap into any API through a recipe, turning Dual into a sort of conversational hub. Regex groups have to be filled in when making calls. URLs have to be encoded properly because they contain free text. Extract calls should know whether to return filenames or contents, probably through parameters. What should a recipe return: the entire contents or the result of the last call? Perhaps that's a metadata setting. There are a bunch of things still to settle on.

paulbricman commented 3 years ago

An incremental update of the idea above.

  1. The #dualrecipe tag can be omitted, all recipes could be placed in a recipe folder to avoid this overhead.
  2. pattern could be replaced with input, as it defines the signature of the recipe as a function, how information gets in.
  3. Additionally, to better reflect what gets out of a recipe, an output front matter field could also be provided.
  4. Just like \1 points to regex group 1 from the input command, so should #1 point to the output of code block 1 etc. #0 should stand for the entire note once all blocks have been resolved (up to a point).
  5. \0 could stand for the contents of the conversation before the call of the recipe. So backslash numbers refer to user input and the conversation, while tag numbers refer to block output and the note.
  6. The Dual code block only knows how to call another recipe.
  7. However, the recipe interpreter also supports javascript and python code blocks, interpreted on the fly. Their outputs can similarly be referenced with a tag number reference. Heavily inspired by this

With those modifications, a recipe might look something like this:

---
input: "Come up with a writing prompt about (.*)"
output: "Here's a writing prompt about \1: #1"
---

- robots: You live in a world where human beings are forbidden to work. Every job imaginable has been taken over by robots, even flying airplanes and writing books. You are not allowed to pursue any work-related tasks, just as robots cannot have a life of their own. When both sides realize they want to make a change, they rise up together and rebel against their governments to prove the power of the people—and bots.
- space: Interplanetary war has broken out and you're a mercenary who will fight for the highest bidder. Suddenly, news that your childhood sweetheart has been captured and is being tried as a traitor on Earth changes your mind, and you decide to rescue her instead.
- \1: 

'''dual
Generate a paragraph based on #0.
'''

'''js
alert("This is a useless pop-up window containing a writing prompt about" + " \1: " + "#1")
'''

Low-level recipes elegantly use js blocks to actually fetch stuff from the local server.
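The two reference families could be resolved with a single substitution pass, sketched below. The function `resolve_references` and its parameters are hypothetical; in particular, how #0 is cut off "up to a point" is left out here and the note text is simply passed in.

```python
import re


def resolve_references(template, groups, block_outputs, conversation="", note=""):
    """Fill backslash references from the input and tag references from blocks."""
    def replace(m):
        if m.group(1) is not None:
            # Backslash number: 0 is the conversation, N is input group N.
            n = int(m.group(1))
            return conversation if n == 0 else groups[n - 1]
        # Tag number: 0 is the resolved note, N is the output of block N.
        n = int(m.group(2))
        return note if n == 0 else block_outputs[n - 1]

    # A single pass, so substituted text is never scanned for references again.
    return re.sub(r"\\(\d+)|#(\d+)", replace, template)


result = resolve_references(
    r"Here's a writing prompt about \1: #1",
    groups=["robots"],
    block_outputs=["A robot learns to paint."],
)
print(result)
```

Doing both substitutions in one pass avoids a captured group that happens to contain something like #1 being expanded a second time.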

paulbricman commented 3 years ago

@stuhlmueller highlighted the contrast between the rigor and rigidity of regular expressions and the flexibility and fuzziness of much of natural language processing, which the design above combines in an awkward way. With regex input patterns for recipes, if you miss even one character, the recipe won't trigger. Suggesting related (regex-based) recipes based on semantic similarity between the tentative command and the one that would work might be a way to circumvent this issue, but a more interesting approach might be to do away with regex entirely.

Here's a new design portraying the same recipe as above:

---
examples:
  - "Come up with a writing prompt about aliens."
  - "Suggest me a writing prompt about space!"
  - "Tell me what would be a nice writing prompt about robots."
output: "Here's a writing prompt about *topic*: #1"
---

- robots: You live in a world where human beings are forbidden to work. Every job imaginable has been taken over by robots, even flying airplanes and writing books. You are not allowed to pursue any work-related tasks, just as robots cannot have a life of their own. When both sides realize they want to make a change, they rise up together and rebel against their governments to prove the power of the people—and bots.
- space: Interplanetary war has broken out and you're a mercenary who will fight for the highest bidder. Suddenly, news that your childhood sweetheart has been captured and is being tried as a traitor on Earth changes your mind, and you decide to rescue her instead.
- *topic*:

'''dual
Generate a paragraph based on #0.
'''

'''js
alert("This is a useless pop-up window containing a writing prompt about" + " *topic*: " + "#1")
'''

The strings in the examples field are used to route new commands to the most suitable recipe through an aggregate semantic distance. Alternatively, a description could be provided and used to find the best recipe for a given command through something similar to descriptive search, with a query such as: This command requests that one "comes up with a writing prompt", where the quoted part would be the description of the recipe. Still, fuzzy matching based on examples seems the most straightforward way of relaying commands to recipes without regex.

The string *topic* is not used here as a stand-in placeholder for something I was too lazy to exemplify. The author of the recipe would explicitly write *topic* in their recipe, and this "variable" or "argument" would automatically be derived from the command, in a fuzzy and flexible way, without a rigid regex structure, using text generation. In other recipes, one could try to use *description*, *person*, *discipline* or whatever is known to be specified in the command, one way or another.
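To illustrate example-based routing, here is a small sketch that averages a similarity score over each recipe's examples and picks the best recipe. The bag-of-words cosine similarity is only a stand-in for a real semantic distance (such as one computed from sentence embeddings), and the "find note" examples are invented for the demonstration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def route(command, recipes):
    """Return the name of the recipe whose examples are closest on average."""
    query = embed(command)

    def score(examples):
        return sum(cosine(query, embed(e)) for e in examples) / len(examples)

    return max(recipes, key=lambda name: score(recipes[name]))


RECIPES = {
    "writing prompt": [
        "Come up with a writing prompt about aliens.",
        "Suggest me a writing prompt about space!",
        "Tell me what would be a nice writing prompt about robots.",
    ],
    # Hypothetical second recipe, added only to give the router a choice.
    "find note": [
        "Find a note which describes a metaphor.",
        "Show me a note which mentions sociology.",
    ],
}

print(route("Give me a writing prompt about dragons", RECIPES))
```

Averaging is one possible aggregate; taking the maximum over examples would favor recipes with one very close example instead.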


This would enable a truly flexible framework for defining commands, by preserving the ability to pick up "arguments" from the commands with decent accuracy while not requiring a strict regex match.
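How the argument would be derived through text generation is not spelled out, so the following is only one possible reading: a few-shot prompt that pairs example commands with their argument values and leaves the last value blank for the generator to complete. It assumes the recipe author (or some earlier step) supplies the values for the examples; `build_extraction_prompt` is a hypothetical helper.

```python
def build_extraction_prompt(argument, solved_examples, command):
    """Assemble a few-shot prompt asking a text generator to fill in one argument."""
    parts = []
    for example_command, value in solved_examples:
        parts.append(f"command: {example_command}\n{argument}: {value}")
    # Leave the final value empty so the generator completes it.
    parts.append(f"command: {command}\n{argument}:")
    return "\n\n".join(parts)


prompt = build_extraction_prompt(
    "topic",
    [
        ("Come up with a writing prompt about aliens.", "aliens"),
        ("Suggest me a writing prompt about space!", "space"),
    ],
    "Tell me what would be a nice writing prompt about robots.",
)
print(prompt)
```

The generated continuation of this prompt would then be substituted wherever *topic* appears in the recipe.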