geeksforsocialchange / PlaceCal

Bring your community together
https://placecal.org
GNU Affero General Public License v3.0

Define seeding parameters #1946

Closed ivan-kocienski-gfsc closed 1 year ago

ivan-kocienski-gfsc commented 1 year ago

Description

"Seeding" is the act of populating an empty database with sample records so local developers have something to work with on their local machines.

But there are many considerations that affect later work, so the whole problem should be looked at from a high level.

Put together a document that defines the scope of seeding, considering the various approaches and their pros and cons. Have a discussion and reach agreement on the options available and the solution selected.

Expected output

A document explaining the choices available, the option selected, and its utility and limitations.

Acceptance Criteria

ivan-kocienski-gfsc commented 1 year ago

Developer onboarding notes

How to seed the database from empty.

The ultimate problem here is that if we allow external services to be called, we have no control over their responses (unless we run our own local copies). Alternatively we can stub the external calls, but then we must be vigilant to keep our stubs in step with how the services actually work.

Okay, so the challenge of PlaceCal (PC) is that so much of the service depends on external parties.
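To make the stubbing trade-off concrete, here is a minimal sketch in Python. It is not PlaceCal code: the calendar feed URL, the canned response and every function name are hypothetical, chosen only to show how a seeding routine can take its fetcher as a parameter so that a stub can stand in for the real service.

```python
from typing import Callable
import urllib.request


def fetch_live(url: str) -> str:
    """Call the real external service (hypothetical; no control over the response)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


# Canned responses keyed by URL. These are assumptions for illustration and
# must be kept in step with what the real service actually returns.
STUB_RESPONSES = {
    "https://example.org/calendar.ics": (
        "BEGIN:VCALENDAR\n"
        "BEGIN:VEVENT\nSUMMARY:Coffee morning\nEND:VEVENT\n"
        "END:VCALENDAR\n"
    ),
}


def fetch_stubbed(url: str) -> str:
    """Return a canned response instead of touching the network."""
    return STUB_RESPONSES[url]


def count_events(url: str, fetch: Callable[[str], str] = fetch_live) -> int:
    """Count events in a feed, using whichever fetcher is passed in."""
    return fetch(url).count("BEGIN:VEVENT")
```

Calling `count_events("https://example.org/calendar.ics", fetch=fetch_stubbed)` never leaves the machine, which is the isolation we want when seeding. The cost is the one described above: the canned text goes stale whenever the real service changes.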

The only things we have that don't touch the outside world are

But external factors come in with

Potential solutions

1. Import from production snapshot

Pros:

Cons:

2. Pure artificial fake data

Fake data where we populate all the tables with hard-coded values, explicitly setting up scenarios and bypassing the need to call out to any service. It's just poked in by hand (think of it like a test factory).
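As a rough illustration of this option, the sketch below seeds an SQLite database with hard-coded rows. The `partners` and `events` tables and their columns are hypothetical stand-ins, not the real PlaceCal schema, and SQLite is used only so the example is self-contained.

```python
import sqlite3

# Hard-coded scenario data (hypothetical names and values).
FAKE_PARTNERS = [
    (1, "Example Community Centre"),
    (2, "Example Library"),
]
FAKE_EVENTS = [
    (1, 1, "Coffee morning", "2024-01-08T10:00:00"),
    (2, 1, "Yoga class", "2024-01-09T18:30:00"),
    (3, 2, "Book club", "2024-01-10T19:00:00"),
]


def seed_fake_data(conn: sqlite3.Connection) -> None:
    """Create the tables if needed and insert the fixed rows.

    INSERT OR REPLACE keyed on the primary key makes the function
    safe to run more than once without duplicating rows.
    """
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS partners (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS events (
            id INTEGER PRIMARY KEY,
            partner_id INTEGER NOT NULL,
            summary TEXT NOT NULL,
            starts_at TEXT NOT NULL
        );
        """
    )
    conn.executemany("INSERT OR REPLACE INTO partners VALUES (?, ?)", FAKE_PARTNERS)
    conn.executemany("INSERT OR REPLACE INTO events VALUES (?, ?, ?, ?)", FAKE_EVENTS)
    conn.commit()
```

Nothing here calls an external service, so the result is the same on every machine, but the data only ever covers the scenarios someone thought to write down.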

Pros:

Cons:

3. Limited subset of production data

Fake data where we build a subset of live production data.

Pros:

Cons:

Choice

For the database seeding task I will pick option 3, as it isolates us from external services while still capturing something of production. I would also say that option 1 should be an established procedure, as there are many times when checking against live data is the only way to replicate a problem.

Next steps

  1. Draw up the data that needs to be present post-run.
  2. Select information that can be imported (from production).
  3. Put together code to safely and cleanly import data in a replicable way.

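The import step could look something like the following Python sketch. It assumes production records have already been exported as a list of dictionaries; the field names, the allow-list and the fixture format are all hypothetical and would need to be agreed as part of the first two steps.

```python
import json

# Only these fields are copied into the seed fixture (hypothetical allow-list).
# Anything else, such as contact details, is dropped.
ALLOWED_FIELDS = ("id", "name", "summary")


def select_subset(records: list[dict], limit: int) -> list[dict]:
    """Pick up to `limit` records in a fixed order, keeping only allowed fields.

    Sorting by id before slicing means the same snapshot always yields the
    same subset, whatever order the export happened to be in.
    """
    chosen = sorted(records, key=lambda record: record["id"])[:limit]
    return [
        {field: record[field] for field in ALLOWED_FIELDS if field in record}
        for record in chosen
    ]


def to_fixture(records: list[dict]) -> str:
    """Serialise records as stable JSON so fixture diffs stay readable."""
    return json.dumps(records, indent=2, sort_keys=True)
```

For example, `to_fixture(select_subset(snapshot, 50))` would produce a fixture file that can be committed and loaded without any call to production or to an external service.
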
katjam commented 1 year ago

Closing as report complete.