MarcusOtter / discord-needle

Needle is a Discord bot that creates Discord threads automatically.
https://needle.gg
GNU Affero General Public License v3.0

💡 Switch to Keyv as storage interface #128

Open n1ckoates opened 2 years ago

n1ckoates commented 2 years ago

Describe the improvement

Needle should switch from JSON to Keyv as its storage interface, to allow more choice for hosters. This is a somewhat big undertaking, as almost every piece of Needle relies on the storage interface somehow.

Keyv requires a separate package to be installed for each storage adapter. While not perfect, I think the best solution is to install all the below packages, in addition to Keyv itself. Otherwise, self-hosters would have to fork Needle just to use a different backend, which defeats one of the main purposes of switching.

In addition to the switch, a migration tool would have to be created. For my own bot, I created a CLI tool that loads each configuration (using the legacy storage adapter), transforms it as necessary, then writes it (using the new storage adapter). I'm not sure if this is the best way to do it, though.
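
As a rough illustration of that load, transform, write loop, here is a minimal sketch. Everything in it is hypothetical: the config fields, the `LegacyJsonStore` and `MemoryStore` classes, and the `guild:` key prefix are invented for the example and are not Needle's actual types. The `KeyValueStore` interface only mimics the async get/set style that Keyv exposes, so the sketch runs without any database or extra package.

```typescript
// Hypothetical config shapes; Needle's real config types will differ.
interface LegacyGuildConfig {
  threadChannels?: string[];
  emojisEnabled?: boolean;
}

interface GuildConfig {
  schemaVersion: number;
  threadChannels: string[];
  emojisEnabled: boolean;
}

// Minimal async key-value contract, modeled on a Keyv-style get/set API.
interface KeyValueStore<T> {
  get(key: string): Promise<T | undefined>;
  set(key: string, value: T): Promise<void>;
}

// Stand-in for the legacy JSON adapter: one object holding every guild's config.
class LegacyJsonStore {
  private readonly data: Record<string, LegacyGuildConfig>;

  constructor(data: Record<string, LegacyGuildConfig>) {
    this.data = data;
  }

  keys(): string[] {
    return Object.keys(this.data);
  }

  load(guildId: string): LegacyGuildConfig | undefined {
    return this.data[guildId];
  }
}

// In-memory stand-in for the new backend, so the sketch runs on its own.
class MemoryStore<T> implements KeyValueStore<T> {
  private readonly map = new Map<string, T>();

  async get(key: string): Promise<T | undefined> {
    return this.map.get(key);
  }

  async set(key: string, value: T): Promise<void> {
    this.map.set(key, value);
  }
}

// Fill in defaults and stamp a schema version on each legacy config.
function transform(legacy: LegacyGuildConfig): GuildConfig {
  return {
    schemaVersion: 1,
    threadChannels: legacy.threadChannels ?? [],
    emojisEnabled: legacy.emojisEnabled ?? true,
  };
}

// Load each config from the old store, transform it, and write it to the new one.
// Returns the number of configs migrated.
async function migrate(
  source: LegacyJsonStore,
  target: KeyValueStore<GuildConfig>,
): Promise<number> {
  let migrated = 0;
  for (const guildId of source.keys()) {
    const legacy = source.load(guildId);
    if (legacy === undefined) continue;
    await target.set(`guild:${guildId}`, transform(legacy));
    migrated++;
  }
  return migrated;
}
```

Because `migrate` only depends on the `KeyValueStore` interface, the same loop would work against whichever backend a hoster picks, as long as a thin wrapper exposes `get` and `set`.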

Tasks

Problems this improvement solves

Switching to Keyv would allow Needle to run in environments without persistent storage, such as some cloud providers (e.g. Replit and Heroku). Hosters could connect to external databases to eliminate the need for storage to be on the same machine; this would also be useful for scaling across multiple machines.

This also solves some issues with using JSON itself; see the "Sticking with JSON" section below.

Alternative solutions

Prisma

Prisma was discussed as an alternative to Keyv in Needle's Discord. The general consensus was that we didn't need a lot of the complex features that Prisma provides, and that Keyv was far simpler. Additionally, Prisma doesn't support migrations for multiple database backends, which is crucial in allowing hosters to upgrade Needle whilst still using a different backend.

Sticking with JSON

While the risk is low, sticking with JSON carries a risk of data corruption. Currently, Needle reads the entire file into memory, edits it, then writes the entire file back. Rewriting everything this way is risky, and unnecessary when only one or two config values change. Frequent backups (as the public Needle instance has) lower the risk further, but a proper database almost eliminates it.
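
To make the difference concrete, here is a small in-memory sketch comparing the two write patterns. It is not Needle's actual storage code: `WholeFileStore` and `PerKeyStore` are hypothetical classes, the "file" is just a string, and `charsWritten` is a rough stand-in for how much data each approach has to rewrite per update.

```typescript
type Config = Record<string, unknown>;

// Whole-file approach: every update re-serializes every guild's config.
class WholeFileStore {
  private file = "{}"; // stands in for the JSON file on disk
  charsWritten = 0;

  update(guildId: string, config: Config): void {
    const all = JSON.parse(this.file) as Record<string, Config>; // read everything
    all[guildId] = config; // edit one entry
    this.file = JSON.stringify(all); // write everything back
    this.charsWritten += this.file.length;
  }

  read(guildId: string): Config | undefined {
    const all = JSON.parse(this.file) as Record<string, Config>;
    return all[guildId];
  }
}

// Per-key approach: an update only touches the entry being changed.
class PerKeyStore {
  private readonly entries = new Map<string, string>();
  charsWritten = 0;

  update(guildId: string, config: Config): void {
    const serialized = JSON.stringify(config);
    this.entries.set(guildId, serialized);
    this.charsWritten += serialized.length;
  }

  read(guildId: string): Config | undefined {
    const serialized = this.entries.get(guildId);
    return serialized === undefined ? undefined : (JSON.parse(serialized) as Config);
  }
}

// Apply the same updates to both stores and compare how much each one rewrote.
function compare(guildCount: number): { wholeFile: number; perKey: number } {
  const wholeFile = new WholeFileStore();
  const perKey = new PerKeyStore();
  for (let i = 0; i < guildCount; i++) {
    const config = { threadChannels: [`channel-${i}`] };
    wholeFile.update(`guild-${i}`, config);
    perKey.update(`guild-${i}`, config);
  }
  return { wholeFile: wholeFile.charsWritten, perKey: perKey.charsWritten };
}
```

In the whole-file version, each write grows with the total number of stored guilds, so an interrupted write can affect every config at once. In the per-key version, a write only covers the entry being changed.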

The obvious downside is choice; Needle is supposed to be self-hostable, but with JSON storage it simply doesn't work in a lot of environments.

MarcusOtter commented 2 years ago

Thanks! See this discussion for more context to Nick's proposal.