Simibrum / therapy_bot

Proof of concept for a GPT-4 powered chatbot. Exploring how GPT models work with mental health dialogues and playing with chat memory and user profiling.
GNU General Public License v3.0

Parse Chat Messages to Build Social Graph #10

Open benhoyle opened 7 months ago

benhoyle commented 7 months ago

The idea is we use spaCy to parse chat messages as they are received.

We can extract named entities and noun chunks from the chat messages. We can then use these to build a graph representation of the user's social environment (friends and family). The graph can then be retrieved and used as chat context.
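As a rough sketch of the extraction step, the snippet below collects named entities and noun chunks from a parsed spaCy `Doc` using `doc.ents` and `doc.noun_chunks`. The `Mention` class, the function names, and the choice of the `en_core_web_sm` model are assumptions for illustration and are not taken from the repo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """One span of interest in a chat message (hypothetical helper type)."""
    text: str
    kind: str        # "entity" or "noun_chunk"
    label: str       # spaCy entity label such as "PERSON"; empty for noun chunks
    start_char: int
    end_char: int


def extract_mentions(doc):
    """Collect named entities and noun chunks from a parsed spaCy Doc."""
    mentions = [
        Mention(ent.text, "entity", ent.label_, ent.start_char, ent.end_char)
        for ent in doc.ents
    ]
    entity_spans = {(m.start_char, m.end_char) for m in mentions}
    for chunk in doc.noun_chunks:
        # Skip noun chunks that cover exactly the same span as an entity.
        if (chunk.start_char, chunk.end_char) not in entity_spans:
            mentions.append(
                Mention(chunk.text, "noun_chunk", "", chunk.start_char, chunk.end_char)
            )
    # Return mentions in the order they appear in the message.
    return sorted(mentions, key=lambda m: m.start_char)


def load_pipeline(model_name="en_core_web_sm"):
    """Load a spaCy pipeline once; the model name is an assumption."""
    import spacy  # imported lazily; needs spaCy and the model installed

    return spacy.load(model_name)


def parse_message(text, nlp):
    """Parse one chat message with an already-loaded pipeline."""
    return extract_mentions(nlp(text))
```

Typical use would be `nlp = load_pipeline()` once at start-up, then `parse_message(message_text, nlp)` for each incoming message. Noun chunks need a pipeline with a dependency parser, which the small English model includes.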

Our general graph model has Nodes and Edges. Nodes could have the following types:

There will also be a time element: the nodes in the user's life will change over time, e.g. through marriage/re-marriage, death, or birth.
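One possible way to model nodes, edges, and the time element is sketched below with plain dataclasses. Everything here is an assumption for illustration: the class and field names are made up, the node types (person, place, event) are borrowed from a later comment in this thread, and putting validity dates on edges rather than nodes is just one design option.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional, Tuple

# Assumed node types, taken from the event/person/place comment later on.
NODE_TYPES = {"person", "place", "event"}


@dataclass
class Node:
    id: int
    type: str
    name: str

    def __post_init__(self) -> None:
        if self.type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {self.type!r}")


@dataclass
class Edge:
    source: int
    target: int
    relation: str
    valid_from: Optional[date] = None  # None means "start unknown or open"
    valid_to: Optional[date] = None    # None means "still current"

    def active_on(self, day: date) -> bool:
        """True if the relationship holds on the given day."""
        started = self.valid_from is None or self.valid_from <= day
        not_ended = self.valid_to is None or day < self.valid_to
        return started and not_ended


@dataclass
class SocialGraph:
    nodes: Dict[int, Node] = field(default_factory=dict)
    edges: List[Edge] = field(default_factory=list)

    def add_node(self, node: Node) -> Node:
        self.nodes[node.id] = node
        return node

    def add_edge(self, edge: Edge) -> Edge:
        if edge.source not in self.nodes or edge.target not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append(edge)
        return edge

    def neighbours(self, node_id: int, day: date) -> List[Tuple[str, Node]]:
        """Return (relation, node) pairs for edges leaving node_id active on day."""
        return [
            (e.relation, self.nodes[e.target])
            for e in self.edges
            if e.source == node_id and e.active_on(day)
        ]
```

With this shape, a re-marriage is two "spouse" edges from the user with different date ranges, and `neighbours(user_id, some_date)` returns whichever one was current on that date.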

To do:

benhoyle commented 4 months ago

We want to work on the chat text. But the parsing methods should be relatively independent of the chat creation logic.

We have a ChatReference model in models.chats.

image

I think the aim was to use this to model each of the three different types of knowledge - event, person, and place.

But it's a bit confusing coming back to it and forgetting what it is meant to be doing.

benhoyle commented 4 months ago

What is the non-confusing way?

What's the doc index?

Is it the event/person/place that needs the chat reference or the graph Node?

At the moment we use HasChatReferences as a mixin on the individual Event, Person, and Place models.

One issue is we need a many-to-many relationship between Node and Chat: a node can be linked with many chats and each chat can be linked with many nodes.
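The usual way to get a many-to-many relationship is an association table with one row per (node, chat) link. The sketch below shows the idea in plain SQL through the standard-library `sqlite3` module rather than through the project's own models; the table and column names are assumptions, not the actual schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE node (id INTEGER PRIMARY KEY, type TEXT NOT NULL, name TEXT NOT NULL);
CREATE TABLE chat (id INTEGER PRIMARY KEY, text TEXT NOT NULL);
-- Association table: one row per (node, chat) link.
CREATE TABLE node_chat (
    node_id INTEGER NOT NULL REFERENCES node(id),
    chat_id INTEGER NOT NULL REFERENCES chat(id),
    PRIMARY KEY (node_id, chat_id)
);
"""


def make_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(SCHEMA)
    return conn


def link(conn, node_id, chat_id):
    """Link a node to a chat; repeating the same link is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO node_chat (node_id, chat_id) VALUES (?, ?)",
        (node_id, chat_id),
    )


def chats_for_node(conn, node_id):
    """Return ids of all chats that mention the node."""
    rows = conn.execute(
        "SELECT chat_id FROM node_chat WHERE node_id = ? ORDER BY chat_id",
        (node_id,),
    )
    return [row[0] for row in rows]


def nodes_for_chat(conn, chat_id):
    """Return names of all nodes mentioned in the chat."""
    rows = conn.execute(
        "SELECT n.name FROM node n "
        "JOIN node_chat nc ON nc.node_id = n.id "
        "WHERE nc.chat_id = ? ORDER BY n.id",
        (chat_id,),
    )
    return [row[0] for row in rows]
```

Putting the link on the graph node rather than on each Event, Person, and Place model would mean a single association table instead of one per model, which may be part of the answer to the question above.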

benhoyle commented 4 months ago

A decent write-up of some explorations into graphs

benhoyle commented 4 months ago

Parsing Function

So back to the parsing function.

Ideally we want something independent of our data model. So the core logic would be:

Then we want to parse with an LLM to complement the entity extraction (e.g., to pick up non-named entities).

Then we want to reconcile with existing nodes and edges in the DB.
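For the reconcile step, a minimal sketch that stays independent of the data model might take plain tuples and dicts in and out, as below. The function names and the exact-match rule on a normalised name plus type are assumptions; this naive rule would merge two different people who share a name, so a real version would need something smarter.

```python
def normalise(name):
    """Lower-case and collapse whitespace so 'Karen ' and 'karen' compare equal."""
    return " ".join(name.lower().split())


def reconcile(mentions, existing_nodes):
    """Split extracted mentions into matches against known nodes and new candidates.

    mentions: list of (name, node_type) tuples from the parsing step.
    existing_nodes: list of dicts with "id", "name" and "type" keys, as they
        might be loaded from the DB (hypothetical shape).
    Returns (matched, new): matched is a list of (mention_name, node_id);
    new is a list of {"name", "type"} dicts for nodes that need creating.
    """
    index = {(normalise(n["name"]), n["type"]): n["id"] for n in existing_nodes}
    matched, new = [], []
    seen_new = set()
    for name, node_type in mentions:
        key = (normalise(name), node_type)
        if key in index:
            matched.append((name, index[key]))
        elif key not in seen_new:
            # Only propose each unseen (name, type) pair once per message.
            seen_new.add(key)
            new.append({"name": name, "type": node_type})
    return matched, new
```

Because it only sees plain Python values, this kind of function can be tested without a database and without the chat creation logic.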

Generating a Knowledge Graph with Spacy

benhoyle commented 4 months ago

GPT-4 Turbo is actually pretty good at this at a sentence level: image

benhoyle commented 4 months ago

You counter-intuitively get better performance when you ask for the nodes and edges at the same time?

image

No, you get the same variation with nodes and edges... image
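For reference, asking for nodes and edges in a single request could look roughly like the sketch below. The prompt wording and the JSON shape are hypothetical (the actual prompts are only in the screenshots), and no API call is made here: the code only builds the prompt text and checks the reply.

```python
import json

# Hypothetical prompt; braces are doubled so str.format leaves them in place.
PROMPT_TEMPLATE = (
    "Extract the people, places and events mentioned in the sentence below, "
    "and the relationships between them.\n"
    'Reply with JSON only, shaped as {{"nodes": [{{"id": 0, "type": "person", "name": "..."}}], '
    '"edges": [{{"source": 0, "target": 1, "relation": "..."}}]}}.\n'
    "Sentence: {sentence}"
)


def build_prompt(sentence):
    """Fill the sentence into the single nodes-and-edges prompt."""
    return PROMPT_TEMPLATE.format(sentence=sentence)


def parse_graph_response(raw):
    """Parse the model's JSON reply and drop edges that point at unknown nodes."""
    data = json.loads(raw)
    nodes = data.get("nodes", [])
    known_ids = {node["id"] for node in nodes}
    edges = [
        edge
        for edge in data.get("edges", [])
        if edge["source"] in known_ids and edge["target"] in known_ids
    ]
    return nodes, edges
```

Given the run-to-run variation noted above, validating the reply like this (and possibly comparing several runs) seems safer than trusting a single response.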

benhoyle commented 4 months ago

Interestingly - it doesn't get the two "Karens" if we just give it the list of tokens and indexes: image

benhoyle commented 4 months ago

But it doesn't get all the "he"s etc. out of the box: image

benhoyle commented 4 months ago

GPT-4o is quite good at this: image

benhoyle commented 4 months ago

To do:

benhoyle commented 4 months ago

Observations