Closed — fyliu closed this issue 10 months ago.
I've looked into options 1 and 3ii. Both are viable, and I think the fixture option is the better one since we can also use it for testing. That said, the django_db decorator automatically applies all the migrations, so any of the options would work in that case. We don't have any non-django_db tests yet, but we might eventually, so the fixture option is still preferable if other factors are equal.
In a call with @ExperimentsInHonesty, I learned about the possibility of generating JSON from Google Sheets using an app script. I later tried option 3i, and it also works well. Its advantage is that it may be a more direct path from the initial spreadsheet to the final migration file. The options that dump data from existing tables have the extra step of first inserting the data into the system through the admin site, which is a potential opportunity to introduce user errors.
To compare the work involved between starting from the spreadsheet and starting from table data:

- Both
- From sheet
- From table data
Now that it's laid out, starting from sheets data has more opportunities for automation, so it feels more future-proof.
The next step will be to evaluate if the option 3i route I've done will be adequate for our needs.
Another possibility
This one imports the data from a CSV file. The code is custom to each model but it can likely be made more generic.
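A sketch of what the per-model CSV import could look like. The parsing half is plain stdlib; the ORM half is shown only in comments, and the model name, field names, and file path there are hypothetical, not from the actual PR:

```python
import csv
import io


def rows_from_csv(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical usage inside a Django management command or seed script:
#   for row in rows_from_csv(open("data/soc_major.csv").read()):
#       SocMajor.objects.get_or_create(
#           occ_code=row["occ_code"],
#           defaults={"title": row["title"]},
#       )

sample = "occ_code,title\n11-0000,Management Occupations\n"
rows = rows_from_csv(sample)
```

Making it generic would mostly mean parameterizing the model class and the column-to-field mapping, with the CSV parsing staying identical across models.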
To answer the null issue from two comments ago: loading from a fixture is intentionally designed not to trigger the auto-timestamp behavior defined in the model. The fixture functionality is meant for dumping and reloading data that was already in the database at some point, not for loading initial data. See this Django ticket for details on why they won't change this behavior.
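The distinction can be illustrated outside Django with a toy loader (the function names here are illustrative, not Django's API): a normal save path stamps the timestamp itself, while a fixture-style load trusts the serialized value rather than regenerating it:

```python
from datetime import datetime, timezone


def save_new(record):
    """Normal save path: the app stamps created_at automatically."""
    record = dict(record)
    record["created_at"] = datetime.now(timezone.utc).isoformat()
    return record


def load_from_fixture(record):
    """Fixture load path: keep the serialized value, don't re-stamp it."""
    return dict(record)


dumped = {"name": "web dev", "created_at": "2020-01-01T00:00:00+00:00"}
restored = load_from_fixture(dumped)
```

Re-stamping on load would corrupt round-tripped data, which is why fixtures behave this way and why they're a poor fit for freshly authored initial data whose timestamps don't exist yet.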
So we now have to choose:
This is documented here and linked from the wiki.
The strategy is to export the JSON from the spreadsheet, convert it into a python script that can insert the data, then create a migration file that will call the script when it's processed.
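A minimal sketch of the middle step: rendering exported JSON records into the body of an insert function that a `migrations.RunPython` operation could call. The app label, model name, and field values here are made up for illustration:

```python
import json

TEMPLATE = """\
def insert_data(apps, schema_editor):
    Model = apps.get_model("app_name", "{model}")
    Model.objects.bulk_create([
{rows}
    ])
"""


def migration_body(model, json_text):
    """Render exported JSON records as a RunPython-style insert function."""
    records = json.loads(json_text)
    rows = "\n".join(
        "        Model(**{!r}),".format(rec) for rec in records
    )
    return TEMPLATE.format(model=model, rows=rows)


exported = '[{"name": "web dev"}, {"name": "data sci"}]'
script = migration_body("ProgramArea", exported)
```

The generated function would then be dropped into a migration file and wired up with `migrations.RunPython(insert_data)`.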
The advantages of using a Python script, as opposed to importing the data in JSON form with loaddata:
I finished this and then realized there's a better way. If this project is supposed to be generic, the initial data shouldn't be baked into migrations. It should be left as individual scripts that we run on initial database setup, and that other organizations can customize for their needs.
I have a draft PR #141 for this. It's a draft because it's dependent on #140 to be merged.
I modified the PR to not depend on sphinx so we can move this forward.
There's no model yet that needs initial data, which makes it more difficult to review the database-insertion part of #141. I will go implement #24.
Looks like #24 has the same structure as #35, and we don't have a direction for working on #35 yet, so I can't work on #24.
The other thing I can do is pull the SOC_Major table issue out of the ice box, but I would rather not do that since it's going to be prioritized in the far future.
Blocker: I need to implement a table with initial data, and there's no good one to do. This is blocking on #35.
This issue was implemented by PR #226.
Overview
Since some of our database tables need to be pre-populated with data, we need to decide on a way to do it and document it.
Action Items
Resources/Instructions
django-seed only generates random data, so it's not useful to us. Whatever solution we pick needs to hook into Django's migration system, or into a script, so the data can be auto-inserted on a clean build.
<app>/sql/<app>.sql
<app>/fixtures/<app>.json