[AI] Identify similar BertTopics

When a message produces a topic that closely matches one already tracked in the channel, we want to avoid creating a duplicate. A topic similarity threshold can decide whether the new topic merges into an existing topic or is created as a new one. Here’s a proposed approach:

Proposed Steps

Compute Topic Similarity:
- When BERT identifies a new topic in a message, compare the new topic’s semantic_vector with the semantic_vector of each existing topic linked to the channel through an ASSOCIATED_WITH relationship.
- Use a similarity metric, such as cosine similarity, between the new topic’s vector and each existing topic’s vector.
Set a Similarity Threshold:
- Define a similarity threshold, e.g., 0.8, above which the new topic is considered “similar enough” to an existing topic. This threshold can be adjusted based on testing.
Merge or Create Logic:
- If Similarity is Above Threshold:
  - Merge the new topic with the existing topic that has the highest similarity score.
  - Update the existing topic’s overall_score using the amplify_score function based on the relevance of the new topic in the message.
- If Similarity is Below Threshold for All Existing Topics:
  - Treat the new topic as distinct, create a new Topic node, and establish the ASSOCIATED_WITH relationship for tracking in this channel.
Optional: Store Relatedness Data:
- For transparency and future adjustments, record the similarity scores on the RELATED_TO relationship between topics. If similar topics keep emerging, we can use these relationships for later reorganization or clustering.
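
The comparison and threshold decision described above can be sketched in a few lines of Python. This is a minimal illustration, not Concord's implementation: `match_topic`, `MatchResult`, and the plain-dict representation of existing topics are hypothetical names chosen for the example, and 0.8 is only the starting threshold suggested in this issue.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional

SIMILARITY_THRESHOLD = 0.8  # starting value from the proposal; adjust after testing


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class MatchResult:  # hypothetical container for the decision
    action: str                      # "merge" or "create"
    target: Optional[str]            # existing topic to merge into, if any
    similarities: Dict[str, float]   # per-topic scores, e.g. for RELATED_TO


def match_topic(
    new_vector: List[float],
    existing_vectors: Dict[str, List[float]],
    threshold: float = SIMILARITY_THRESHOLD,
) -> MatchResult:
    """Decide whether a new topic merges into an existing one or is created."""
    sims = {
        name: cosine_similarity(new_vector, vec)
        for name, vec in existing_vectors.items()
    }
    if not sims:
        # No existing topics in the channel: nothing to merge with.
        return MatchResult("create", None, sims)
    best = max(sims, key=sims.get)
    if sims[best] > threshold:
        return MatchResult("merge", best, sims)
    return MatchResult("create", None, sims)
```

Returning every similarity score alongside the decision keeps the optional RELATED_TO bookkeeping cheap, since the scores are already computed for the threshold check.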

Example Flow:

Analyze New Message:
- A new topic appears in the message with a semantic_vector.
Similarity Comparison:
- Compute cosine similarity between the new topic’s semantic_vector and the semantic_vector of each existing topic in the channel.
Apply Threshold Decision:
- Above Threshold (e.g., 0.8): Update the most similar existing topic’s score using amplify_score.
- Below Threshold: Create a new topic entry and start tracking it as a distinct topic.
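
To make the flow concrete, here is a small in-memory walk-through. It rests on several assumptions: the channel's topics are held in a Python list rather than in the graph, `Topic` and `process_topic` are hypothetical names, a new topic starts with its message relevance as its overall_score, and the `amplify_score` body is only a placeholder because the real function is not shown in this issue.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Topic:  # hypothetical in-memory stand-in for a Topic node
    name: str
    semantic_vector: List[float]
    overall_score: float


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.hypot(*a) * math.hypot(*b)
    return dot / norms if norms else 0.0


def amplify_score(current: float, relevance: float) -> float:
    # Placeholder only: the real amplify_score is not defined in this issue.
    # This version moves the score toward 1.0 in proportion to relevance.
    return current + relevance * (1.0 - current)


def process_topic(
    channel_topics: List[Topic],
    name: str,
    vector: List[float],
    relevance: float,
    threshold: float = 0.8,
) -> Tuple[str, Topic]:
    """Merge into the most similar existing topic, or add a new one."""
    best = max(
        channel_topics,
        key=lambda t: cosine(vector, t.semantic_vector),
        default=None,
    )
    if best is not None and cosine(vector, best.semantic_vector) > threshold:
        best.overall_score = amplify_score(best.overall_score, relevance)
        return "merged", best
    # Assumption: a new topic starts with its message relevance as its score.
    topic = Topic(name, vector, relevance)
    channel_topics.append(topic)
    return "created", topic


channel = [Topic("deployment", [1.0, 0.0, 0.0], 0.5)]
print(process_topic(channel, "deploys", [0.9, 0.1, 0.0], relevance=0.4)[0])
print(process_topic(channel, "hiring", [0.0, 0.0, 1.0], relevance=0.6)[0])
```

With these toy vectors, the first call lands above the threshold and updates the existing topic, while the second is orthogonal to it and is tracked as a distinct topic.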

BoredLabsHQ / Concord

[AI] Identify similar BertTopics #29
