Closed DefenderOfBasic closed 2 months ago
easiest way might be a straight up dump of schema + data: https://supabase.com/docs/reference/cli/supabase-db-dump
and then we can have some little scripts to post-process this, or just examples of how to query basic things. Like if someone just knows how to write Python or JS, they could take a function that lists all tweets from the DB dump and do whatever they want with that?
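A rough sketch of what such a "list all tweets" helper could look like, assuming the dump is plain-text SQL with `COPY ... FROM stdin;` blocks (the format pg_dump writes when it uses COPY). The table name `public.tweets` and the column names are hypothetical, and a real dump may quote identifiers or use INSERT statements instead, in which case this would need adjusting:

```python
from typing import Dict, Iterator, Optional, TextIO

# Subset of the backslash escapes used by PostgreSQL's COPY text format.
_ESCAPES = {"t": "\t", "n": "\n", "r": "\r", "\\": "\\"}


def _unescape(field: str) -> Optional[str]:
    """Decode one COPY field; the literal \\N marks SQL NULL."""
    if field == r"\N":
        return None
    out = []
    chars = iter(field)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            out.append(_ESCAPES.get(nxt, nxt))
        else:
            out.append(ch)
    return "".join(out)


def iter_rows(dump: TextIO, table: str) -> Iterator[Dict[str, Optional[str]]]:
    """Yield each row of `table` as a dict of column name -> text value.

    `table` must match the name exactly as written in the dump,
    e.g. "public.tweets" (hypothetical name).
    """
    prefix = f"COPY {table} ("
    suffix = ") FROM stdin;"
    columns = None
    for line in dump:
        line = line.rstrip("\n")
        if columns is None:
            # Still looking for the header line of the table's COPY block.
            if line.startswith(prefix) and line.endswith(suffix):
                names = line[len(prefix):-len(suffix)].split(",")
                columns = [name.strip().strip('"') for name in names]
        elif line == r"\.":
            # End-of-data marker for the COPY block.
            return
        else:
            fields = [_unescape(f) for f in line.split("\t")]
            yield dict(zip(columns, fields))
```

Usage would be something like `for tweet in iter_rows(open("dump.sql", encoding="utf-8"), "public.tweets"): ...`. It streams line by line, so it should cope with a multi-GB dump without loading it all into memory.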
yup I feel like this would dovetail well with setting up the local dev environment bc you need it to do the dump
https://github.com/open-birdsite-db/open-birdsite-db/issues/26
I also wonder if it makes sense to run that process every time someone requests it, or if there are more cost-effective ways of serving massive files like that. Maybe we should have some service that does the dump on a schedule and uploads it to cheap file hosting
Here's a supabase doc I found about automating a backup with github actions. I wonder if the most sustainable thing is a weekly github action that pushes the dump to an S3 bucket or something?
That seems right. Could have it in supabase storage too. It just seems wild to have the archives mirrored 3 times (tables, json, full dump), but I'm ok with dumb solutions
oh I wonder if the github action will be okay with running for hours and holding GBs of data lol
I kind of want to try cloudflare's R2 storage (S3-compatible API but cheaper). They have a very generous free tier:
https://developers.cloudflare.com/r2/pricing/
(I want to make it so anyone can download this without us having to pay a lot for it)
I mean we're paying for supabase and already using supabase storage but I see R2 is free
last thing we need to figure out is automating the dumping and upload
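For the dumping half, here's a minimal sketch of a script a scheduled job could run. It assumes the `supabase` CLI is installed and already pointed at the project, and that `db dump` accepts `--file` and `--data-only` as described in the CLI reference linked above. The `dump-YYYY-MM-DD.sql` naming is made up for illustration, and the upload step is deliberately left out since we haven't picked a storage target yet:

```python
import datetime as dt
import gzip
import shutil
import subprocess
from pathlib import Path
from typing import List


def snapshot_name(day: dt.date) -> str:
    """Hypothetical naming scheme: one dated file per snapshot."""
    return f"dump-{day.isoformat()}.sql"


def build_dump_command(out_path: Path, data_only: bool = False) -> List[str]:
    """Build the CLI call; flags assumed from the supabase CLI reference."""
    cmd = ["supabase", "db", "dump", "--file", str(out_path)]
    if data_only:
        cmd.append("--data-only")
    return cmd


def gzip_file(src: Path) -> Path:
    """Compress `src` to `src.gz` in streaming fashion and return the new path."""
    dst = src.with_name(src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


def make_snapshot(out_dir: Path, day: dt.date) -> Path:
    """Run the dump, then compress it. Uploading is left to the caller."""
    out_dir.mkdir(parents=True, exist_ok=True)
    sql_path = out_dir / snapshot_name(day)
    # check=True raises if the CLI exits with a non-zero status.
    subprocess.run(build_dump_command(sql_path, data_only=True), check=True)
    return gzip_file(sql_path)
```

A weekly job would call `make_snapshot(Path("snapshots"), dt.date.today())` and then hand the resulting `.gz` file to whatever uploader we settle on. Compressing first should cut down both the upload time and the download size for anyone grabbing the snapshot.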
I want to run some silly analytics (like, what are the top 10 most used words or phrases in my archive, sort everyone else by who has the most of these phrases, help me find my soul mates)
I want to run this over all tweets without breaking the bank/spamming the supabase DB with requests. What would be the best way to do this, and to let others do it too? It'd be nice if we had like, a weekly snapshot or something that can be downloaded and used offline?
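To make the kind of analysis concrete, here's a small sketch of the "top words, then rank everyone else" idea running over plain tweet text from an offline snapshot. The tokenizer is a naive lowercase regex, the function names are made up, and it counts single words only (phrases would need n-grams):

```python
import re
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

# Naive tokenizer: runs of lowercase letters, digits, and apostrophes.
WORD_RE = re.compile(r"[a-z0-9']+")


def top_words(texts: Iterable[str], n: int = 10) -> List[Tuple[str, int]]:
    """Return the n most frequent words across all texts, with counts."""
    counts: Counter = Counter()
    for text in texts:
        counts.update(WORD_RE.findall(text.lower()))
    return counts.most_common(n)


def rank_accounts(
    tweets_by_account: Dict[str, List[str]], words: Set[str]
) -> List[Tuple[str, int]]:
    """Sort accounts by how often they use any of `words`, highest first."""
    scores = {}
    for account, texts in tweets_by_account.items():
        scores[account] = sum(
            1
            for text in texts
            for token in WORD_RE.findall(text.lower())
            if token in words
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

So you'd call `top_words(my_tweets)` on your own archive, turn the result into a set of words, and pass that to `rank_accounts` along with everyone else's tweets. In practice you'd probably want to filter out stop words like "the" first, otherwise everyone is your soul mate.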