adambechtold / taste-explorer


Task - Create Clear Workflow to Populate Dev Environment with Example Data #23

Open adambechtold opened 8 months ago

adambechtold commented 8 months ago

Background

We're opening this project up to more collaborators.

Frictions

Goals

adambechtold commented 8 months ago

Approach - Write Instructions for How to Add Users from Scratch

1) Create a database
2) Create API keys for Last.fm and Spotify
3) Add users using the admin API
4) Run the cron jobs to backfill listening history

▼ Con - Requires collaborators to have their own Last.fm account and Spotify developer account
▼ Con - Complicated and time-consuming
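If we go this route, a small preflight check could at least tell a new collaborator which prerequisites are still missing before they try to add users. This is only a sketch: the variable names below are hypothetical and not taken from this repo, so they would need to match whatever the project actually reads.

```python
from typing import List, Mapping

# Hypothetical setting names (assumptions, not the project's real config keys).
REQUIRED_VARS = (
    "DATABASE_URL",
    "LASTFM_API_KEY",
    "SPOTIFY_CLIENT_ID",
    "SPOTIFY_CLIENT_SECRET",
)


def missing_settings(env: Mapping[str, str]) -> List[str]:
    """Return the required names that are absent or blank in env."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_settings(os.environ)` at the top of a setup script would return an empty list once everything is configured.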

adambechtold commented 8 months ago

Approach - Host a Small, Read-only Database

Create a very small database on AWS with example data and make read-only access to it public.

▲ Pro - Very easy for collaborators to get started
▼ Con - Performance could degrade as more people use it
▼ Con - Cost - This isn't free

Variant - Create separate database on existing prod infrastructure

▲ Pro - Free
▼ Con - Use by collaborators could degrade prod's performance

adambechtold commented 8 months ago

Approach - Provide Database Dump(s)

Provide .dump files with example data.

▲ Pro - Collaborators can choose their favorite database platform/approach
▲ Pro - Easy to provide various dataset sizes

Variant - Host on GitHub

▲ Pro - Version control
▼ Con - Some of the dumps could be very large, making the repo unnecessarily large

Variant - Host on S3

▲ Pro - Dumps can be very large
▼ Con - Version control is harder
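One possible way to soften the version-control drawback of the S3 variant, offered only as a sketch: keep a small manifest of SHA-256 checksums in the repo and verify each downloaded dump against it. The manifest shape (file name mapped to hex digest) and the function names are assumptions made for illustration.

```python
import hashlib
from pathlib import Path
from typing import Mapping


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large dumps need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_manifest(path: Path, manifest: Mapping[str, str]) -> bool:
    """True if the file's hash equals the digest recorded under its name."""
    expected = manifest.get(path.name)
    return expected is not None and sha256_of(path) == expected
```

With this, the repo would track which dump version is expected while the large files themselves stay on S3.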

adambechtold commented 8 months ago

Approach - Host Docker Container Instance

Collaborators could simply pull down this repo, spin it up, and start going.

▲ Pro - Very easy to get started
▼ Con - More work to create than the .dump option

Variant - Have the container download a database .dump during start up

Host a .dump file on S3. Host a container image on AWS's container registry. Give the container an entrypoint script that checks whether the database is already populated and, if not, downloads the dump and loads it into the database.
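The check-then-populate logic of that entrypoint could look roughly like the sketch below. It uses SQLite from the Python standard library purely as a stand-in, since the real database engine and dump format are not specified here, and it treats "populated" as "has at least one table", which is an assumption. `fetch_dump` is a hypothetical callable that would download the dump from S3 and return its SQL text.

```python
import sqlite3
from typing import Callable


def is_populated(conn: sqlite3.Connection) -> bool:
    """Return True if the database already contains at least one table."""
    row = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table'"
    ).fetchone()
    return row[0] > 0


def populate_if_empty(
    conn: sqlite3.Connection, fetch_dump: Callable[[], str]
) -> bool:
    """Load the dump only when the database is empty.

    Returns True if a load happened, False if it was skipped.
    """
    if is_populated(conn):
        return False  # Skip the download entirely on later startups.
    conn.executescript(fetch_dump())
    conn.commit()
    return True
```

Because the download only happens inside the empty-database branch, restarting the container would not fetch or reload the dump a second time.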

(See chat)

adambechtold commented 8 months ago

Update - .dump files are available on S3

I was able to create dump files and put them on S3, but I'm running into some issues creating a Docker container that can download them and pre-populate the database.

I'll keep working on it, but here are the .dump files in the meantime:

cc: @tusharwebd