scaife-viewer / sv-mini-atlas

ATLAS implementation for the Scaife "SV Mini" prototype
https://scaife-viewer.org/
MIT License

Speed up ingestion using bulk insert / updates #10

Closed jacobwegner closed 4 years ago

jacobwegner commented 4 years ago

Ingestion with django-treebeard is currently pretty slow, because each node is written with its own INSERT and UPDATE statements, and there are a lot of them.

I took a couple of passes at the ingestion scripts today and have some promising early results. There is more to test and document, but I see the performance improvements below:

On master:

time python manage.py prepare_db
<snip>
--[Loading versions]--
Iliad: 15710 nodes.
Odyssey: 12132 nodes.
Crito: 483 nodes.
Enchiridion: 157 nodes.
The Epistle to Diognetus: 113 nodes.
The Epistle of Barnabas: 218 nodes.
The First Epistle of Clement: 469 nodes.
The Second Epistle of Clement: 139 nodes.
The Didache: 119 nodes.
The Shepherd of Hermas: 923 nodes.
Ignatius to the Ephesians: 71 nodes.
Ignatius to the Magnesians: 43 nodes.
Ignatius to the Trallians: 45 nodes.
Ignatius to the Romans: 41 nodes.
Ignatius to the Philadelphians: 37 nodes.
Ignatius to the Smyrnaeans: 44 nodes.
Ignatius to Polycarp: 32 nodes.
The Martyrdom of Polycarp: 85 nodes.
Polycarp to the Philippians: 54 nodes.
30947 total nodes on the tree.
python manage.py prepare_db  48.03s user 26.32s system 77% cpu 1:35.94 total

On spike/speedy-ingestion:

time python manage.py prepare_db
<snip>
--[Loading versions]--
Iliad: 15710 nodes.
Odyssey: 12132 nodes.
Crito: 483 nodes.
Enchiridion: 157 nodes.
The Epistle to Diognetus: 113 nodes.
The Epistle of Barnabas: 218 nodes.
The First Epistle of Clement: 469 nodes.
The Second Epistle of Clement: 139 nodes.
The Didache: 119 nodes.
The Shepherd of Hermas: 923 nodes.
Ignatius to the Ephesians: 71 nodes.
Ignatius to the Magnesians: 43 nodes.
Ignatius to the Trallians: 45 nodes.
Ignatius to the Romans: 41 nodes.
Ignatius to the Philadelphians: 37 nodes.
Ignatius to the Smyrnaeans: 44 nodes.
Ignatius to Polycarp: 32 nodes.
The Martyrdom of Polycarp: 85 nodes.
Polycarp to the Philippians: 54 nodes.
30947 total nodes on the tree.
python manage.py prepare_db  5.04s user 0.33s system 90% cpu 5.935 total
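
For readers who want to see the general idea behind the two timings above, here is a minimal sketch of bulk ingestion for a materialized-path tree. It is not the code on the branch. It assumes treebeard's documented `MP_Node` defaults (a `path` built from 4-character base-36 steps, plus `depth` and `numchild` columns), and it uses the standard-library `sqlite3` module in place of the Django ORM so it runs on its own. The names `encode_step`, `build_rows`, `bulk_load`, the `node` table, and the toy tree are all hypothetical.

```python
import sqlite3

# Assumed to match treebeard's MP_Node defaults; adjust if the model overrides them.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
STEPLEN = 4


def encode_step(position):
    """Encode a 1-based sibling position as a fixed-width base-36 string."""
    digits = ""
    while position:
        position, rem = divmod(position, len(ALPHABET))
        digits = ALPHABET[rem] + digits
    return digits.rjust(STEPLEN, "0")


def build_rows(tree, parent_path="", depth=1):
    """Yield (path, depth, numchild, label) for every node, parents first.

    `tree` is a list of (label, children) pairs. Everything is computed
    in memory, so no per-node UPDATE of the parent is needed later.
    """
    for position, (label, children) in enumerate(tree, start=1):
        path = parent_path + encode_step(position)
        yield (path, depth, len(children), label)
        yield from build_rows(children, path, depth + 1)


def bulk_load(conn, tree):
    """Create the hypothetical node table and insert all rows in one batch."""
    rows = list(build_rows(tree))
    conn.execute(
        "CREATE TABLE node ("
        "path TEXT PRIMARY KEY, depth INTEGER, numchild INTEGER, label TEXT)"
    )
    with conn:  # single transaction for the whole batch
        conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", rows)
    return len(rows)


# Toy data standing in for a parsed version; not taken from the real corpus.
TREE = [
    ("Iliad", [
        ("Book 1", [("1.1", []), ("1.2", [])]),
        ("Book 2", [("2.1", [])]),
    ]),
]

conn = sqlite3.connect(":memory:")
total = bulk_load(conn, TREE)
```

In a Django project the batch step would more likely be a single `Model.objects.bulk_create(...)` call over unsaved instances. The point of the sketch is only that `path`, `depth`, and `numchild` can be derived before anything touches the database.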

Refs Improve ingestion speed

TODOS:

jacobwegner commented 4 years ago

Deployed to https://mini-stack-a-spike-spee-wkno11.herokuapp.com/graphql/

jacobwegner commented 4 years ago

Improve ingestion speed

jacobwegner commented 4 years ago

@jhrr thanks for your work here; I want to add some docstrings but then will flag for a final review.

I just pushed a small change set where I removed some dead code and added simple assertions for numchild counts.
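
As an illustration of what a `numchild` assertion can look like, here is a small sketch that compares each stored `numchild` against the number of direct children implied by the paths. It assumes 4-character path steps (treebeard's default) and works on plain `(path, numchild)` tuples. The helper name `numchild_mismatches` and the sample rows are hypothetical, not the assertions from the change set.

```python
from collections import Counter

STEPLEN = 4  # assumed path step length


def numchild_mismatches(rows, steplen=STEPLEN):
    """Return (path, stored, actual) for every node whose numchild is wrong.

    `rows` is an iterable of (path, numchild) tuples. A node's parent path
    is its own path minus the last step, so counting parent paths gives
    the real number of direct children per node.
    """
    rows = list(rows)
    actual = Counter(path[:-steplen] for path, _ in rows if len(path) > steplen)
    return [
        (path, stored, actual[path])
        for path, stored in rows
        if stored != actual[path]
    ]


good_rows = [("0001", 2), ("00010001", 0), ("00010002", 0)]
bad_rows = [("0001", 1), ("00010001", 0), ("00010002", 0)]

# After ingestion, the check would be a plain assertion on an empty result.
assert numchild_mismatches(good_rows) == []
```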