Open Marigold opened 1 month ago
On the cost side, I don't see the cost increasing that much. Even if it were 10x, it'd still cost less than $5 per day. But maybe I'm missing something, and it could be a larger increase?
> But maybe I'm missing something, and it could be a larger increase?
I have no idea, to be honest. I was just surprised that GitHub PRs alone cost $0.35. I expect that Slack contains much more information (though that doesn't have to translate into size).
@Marigold From what I quickly inspected, the current system prompt includes much of our documentation (e.g. docs for `Table`, `Variable`, `Dataset`, etc. objects). I haven't checked the exact number of tokens, but I could imagine it being 95% documentation + 5% GitHub PRs. Knowing this, I think that the additional cost of Slack messages wouldn't be that high.
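To make that reasoning concrete, here is a rough back-of-the-envelope sketch. It assumes the cost scales linearly with the number of input tokens (ignoring output tokens and any caching), and it takes the 95% / 5% split as a guess rather than a measured number. The function name `estimated_cost` and its parameters are hypothetical, purely for illustration.

```python
def estimated_cost(base_cost: float, pr_share: float, slack_to_pr_ratio: float) -> float:
    """Estimate the cost of one summary after adding Slack messages to the prompt.

    Assumption: cost is proportional to input tokens, so adding Slack only
    grows the variable part of the prompt, not the documentation part.

    base_cost:          current cost per summary (e.g. 0.35 USD)
    pr_share:           guessed fraction of prompt tokens that are GitHub PRs (e.g. 0.05)
    slack_to_pr_ratio:  Slack tokens relative to PR tokens (e.g. 20 means 20x as many)
    """
    if not 0.0 <= pr_share <= 1.0:
        raise ValueError("pr_share must be between 0 and 1")
    if slack_to_pr_ratio < 0:
        raise ValueError("slack_to_pr_ratio must be non-negative")
    # The existing prompt (docs + PRs) stays as is; Slack adds
    # pr_share * slack_to_pr_ratio of the current prompt size on top.
    return base_cost * (1.0 + pr_share * slack_to_pr_ratio)


if __name__ == "__main__":
    # Guessed inputs: $0.35 per summary today, PRs are ~5% of the prompt.
    for ratio in (1, 10, 20, 100):
        cost = estimated_cost(0.35, 0.05, ratio)
        print(f"Slack = {ratio}x PR tokens -> ~${cost:.2f} per summary")
```

Under these assumptions, Slack contributing 20 times as many tokens as the PRs would roughly double the cost to about $0.70 per summary. If the PR share turns out to be much larger than 5%, the increase would be correspondingly steeper, so it would be worth measuring the actual token counts first.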
Could be a good cooldown project, or something for over Xmas.
The News app in Wizard provides a summary of key events in the ETL repository over the past 7 days. This summary is generated from all PRs stored in MySQL.
It would be interesting to apply this concept to all communication in Slack and see if we can generate useful summaries. For example, we could track dataset updates, article or insight changes, projects, bugs, and more. Since many of our apps already send data to Slack (e.g., GitHub updates, bug reports), Slack essentially functions as our communication "data warehouse."
Risks
This could become pretty expensive. We pay $0.35 per News summary, which includes just GitHub activity. We should think carefully about what to include and what to leave out.
If we store Slack data in MySQL, we have to make sure it's excluded from Datasette public.