TheExGenesis / community-archive

An open tweet database and API anyone can build on.
https://www.community-archive.org
MIT License

Seed database with data from the downloaded archives #159

Closed: ri72miieop closed this 2 weeks ago

ri72miieop commented 4 weeks ago

I removed the previous script that imported data from the GitHub releases.

The new method uses data downloaded with scripts\download_storage_anon.mts. Some records fail to import (this can be verified with pnpm dev:validateimport), but from what I've seen it affects <0.1% of records, so it should not have a major impact in dev scenarios.

vercel[bot] commented 4 weeks ago

@ri72miieop is attempting to deploy a commit to the theexgenesis' projects Team on Vercel.

A member of the Team first needs to authorize it.

vercel[bot] commented 3 weeks ago

The latest updates on your projects:

Name: community-archive
Status: ✅ Ready
Updated (UTC): Nov 12, 2024 3:43pm

ri72miieop commented 3 weeks ago

Created a new migration to grant service_role permissions on the private schema and the private.job_queue table. This PR now fixes #156.

ri72miieop commented 3 weeks ago

These changes solve issue #160. The user can download data with pnpm dev:downloadarchive and then run pnpm dev:importfiles to import it into the db. The environment variable ARCHIVE_PATH needs to be set to the root path where the archive was downloaded.
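As a rough illustration of the ARCHIVE_PATH requirement, here is a minimal sketch of how an import script could resolve the variable and fail early when it is missing. The helper names resolveArchiveRoot and listArchiveFiles are hypothetical and not taken from this PR, and the assumption that the downloaded archive consists of .json files is mine; the actual import script may work differently.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper (not from the PR): resolve the archive root from
// ARCHIVE_PATH and fail early with a readable message if it is unusable.
function resolveArchiveRoot(
  env: Record<string, string | undefined> = process.env,
): string {
  const raw = env.ARCHIVE_PATH;
  if (!raw || raw.trim() === "") {
    throw new Error(
      "ARCHIVE_PATH is not set; point it at the root of the downloaded archive",
    );
  }
  const root = path.resolve(raw);
  if (!fs.existsSync(root) || !fs.statSync(root).isDirectory()) {
    throw new Error(`ARCHIVE_PATH does not point to a directory: ${root}`);
  }
  return root;
}

// Hypothetical helper: recursively collect .json files under the root.
// Assumption: the downloaded archives are stored as JSON files.
function listArchiveFiles(root: string): string[] {
  const found: string[] = [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    const full = path.join(root, entry.name);
    if (entry.isDirectory()) {
      found.push(...listArchiveFiles(full));
    } else if (entry.name.endsWith(".json")) {
      found.push(full);
    }
  }
  return found.sort();
}
```

Checking the variable up front means a missing or mistyped path produces one clear error instead of a failure partway through the import.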

TheExGenesis commented 2 weeks ago

@ri72miieop remember to run pnpm build to make sure the deployment will build

ri72miieop commented 2 weeks ago

My bad. I will look into how to set up a pre-push hook so it double-checks the build before pushing (in case I forget!)

I fixed the errors now.
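
For reference, Git runs an executable .git/hooks/pre-push script before each push and aborts the push if it exits with a non-zero status. The following is one possible sketch of the check such a hook could call, not the setup adopted in this repository. The function names runCheck and runPrePushChecks are hypothetical, and the only check assumed here is pnpm build, as requested above.

```typescript
import { spawnSync } from "node:child_process";

// Run one command and return its exit code (1 if it could not be started).
// Note: on Windows, spawning pnpm may additionally require { shell: true }.
function runCheck(command: string, args: string[]): number {
  const result = spawnSync(command, args, { stdio: "inherit" });
  if (result.error) return 1;
  return result.status ?? 1;
}

// Run the checks in order and stop at the first failure, returning its
// exit code. The runner is injectable so the logic can be tested without
// actually invoking pnpm.
function runPrePushChecks(
  checks: Array<[string, string[]]>,
  run: (command: string, args: string[]) => number = runCheck,
): number {
  for (const [command, args] of checks) {
    const code = run(command, args);
    if (code !== 0) return code;
  }
  return 0;
}

// In the hook's entry point (hypothetical), the build check would be:
//   process.exit(runPrePushChecks([["pnpm", ["build"]]]));
```

The hook file itself would then only need to invoke this script with whatever TypeScript runner the project already uses, so a failed build stops the push before it reaches Vercel.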