(EAI-177): Implement FAQ Finder

cbush commented 6 months ago

Jira: https://jira.mongodb.org/browse/EAI-177

Changes

Adds FAQ finding to the scripts package.
Uses the DBSCAN algorithm to cluster similar questions.
The script takes an optional epsilon argument (default: 0.05). Epsilon is the maximum distance between two questions for them to count as neighbors in the same cluster.
Stores FAQ entries in a 'faq' collection in the database for further processing. Each question and response is stored, even though many are similar.
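
For readers unfamiliar with DBSCAN, here is a minimal, self-contained sketch of how epsilon drives the clustering. It is illustrative only and is not the code in this PR: it is written in Python for brevity, and the cosine distance metric, the `min_points` threshold, the function names, and the toy 2-D "embeddings" are all assumptions made for the example.

```python
import math
from typing import List, Optional, Sequence

NOISE = -1  # label for points that belong to no cluster


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity (assumed metric; the PR does not name one)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def region_query(points: Sequence[Sequence[float]], i: int, epsilon: float) -> List[int]:
    """Indices of all points within epsilon of points[i], including i itself."""
    return [j for j, q in enumerate(points) if cosine_distance(points[i], q) <= epsilon]


def dbscan(points: Sequence[Sequence[float]], epsilon: float = 0.05,
           min_points: int = 2) -> List[int]:
    """Assign a cluster label to each point; NOISE marks unclustered points.

    min_points (hypothetical default) counts the point itself.
    """
    labels: List[Optional[int]] = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already processed
        neighbors = region_query(points, i, epsilon)
        if len(neighbors) < min_points:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, epsilon)
            if len(j_neighbors) >= min_points:
                queue.extend(j_neighbors)  # core point: keep growing the cluster
    return [NOISE if label is None else label for label in labels]


# Toy vectors standing in for question embeddings (made up for illustration).
example_embeddings = [[1.0, 0.0], [0.999, 0.02], [0.0, 1.0], [0.01, 1.0], [-1.0, 0.0]]
labels = dbscan(example_embeddings, epsilon=0.05)
print(labels)  # [0, 0, 1, 1, -1]
```

With a smaller epsilon, fewer questions qualify as neighbors, so clusters shrink or dissolve into noise; a larger epsilon merges more loosely related questions into the same FAQ.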

Example FAQ entry document:

Notes

The intention is to run this on a cron schedule, perhaps daily or every five days. Because each run stores a timestamp, we'll be able to look at trends in FAQs over time.

For verified answers, the next step is to script a transfer from the faq collection to a YAML document that a human can fill out, per the spec.

mongodben commented 6 months ago

I notice CI is not running on this PR. Do you know why that might be?

cbush commented 6 months ago

I haven't pushed upstreammmm

mongodb / chatbot

(EAI-177): Implement FAQ Finder #292