Closed nhoffman closed 1 year ago
version 1.10.0 will contain the following optimizations:
After all of this, some experiments on my laptop (2022 M1 pro)
sqlite
taxit -v new_database -z taxdmp.zip 57.15s user 3.09s system 93% cpu 1:04.41 total
indexes, psycopg2
taxit -v new_database postgresql://$(whoami)@localhost:5432/taxonomy -z 74.32s user 68.86s system 17% cpu 13:55.58 total
indexes, psycopg3
taxit -v new_database postgresql+psycopg://$(whoami)@localhost:5432/taxonomy 98.06s user 36.20s system 21% cpu 10:26.93 total
no indexes, psycopg2
taxit -v new_database postgresql://$(whoami)@localhost:5432/taxonomy -z 74.96s user 61.38s system 25% cpu 8:44.74 total
no indexes, psycopg3
taxit -v new_database postgresql+psycopg://$(whoami)@localhost:5432/taxonomy 77.77s user 58.15s system 38% cpu 5:49.32 total
So about a 2.4x speedup by dropping indexes and using psycopg3, but still considerably slower then sqlite.
I think we've addressed this to the best of our ability in v0.10.0
Creation of an sqlite database completes in just a few minutes; I suspect that the bottleneck for the postgres load is the foreign key indexes. See if we can create the indexes after the initial data load.