OpenDRR / opendrr-api

REST API for OpenDRR data / API REST pour les données OpenDRR
MIT License
4 stars 7 forks source link

Speed up database writes with synchronous_commit=off (and full_page_write=off and fsync=off?) #77

Open anthonyfok opened 3 years ago

anthonyfok commented 3 years ago

2021-11-22 update: This issue was essentially fixed in PR #105 (Pipeline optimization, Sprint 34–36) merged in June 2021; see [https://github.com/OpenDRR/opendrr-api/issues/77#issuecomment-975802148](reasons why I am keeping this issue open for the time being]


docker-compose up --build currently takes several hours to run. In particular, importing data into PostgreSQL database seems somewhat slow to me, apparently synchronizing to the physical disk for each single operation.

It turns out there are PostgreSQL settings to optimize its speed, and some users report like 10x speed improvement for e.g. CI test runs.

anthonyfok commented 3 years ago

Hi @drotheram, please assign me to this issue (and add labels, link to project, etc.) when you have time. Many thanks!

Tasks:

Relevant links:


Related links (to be sorted):

eatmydata (no, it doesn't work with PostgreSQL

Misc:

anthonyfok commented 3 years ago

One place to test this during psql: Running Create_all_tables_update.sql...:

Without synchronous_commit=off etc.:

python-opendrr_1         | 2021-05-20T18:13:20.046554364Z UPDATE exposure.canada_exposure SET geom = st_setsrid(st_makepoint(SauidLon,SauidLat),4326);
python-opendrr_1         | 2021-05-20T18:22:10.956967658Z UPDATE 2237515

With synchronous_commit=off etc.:

[TODO]

anthonyfok commented 3 years ago

Reminder (to self): Of the three parameters, only synchronous_commit can be set with ALTER DATABASE, and that was done in add_data.sh in PR #105:

$ psql opendrr -a
psql (13.3 (Debian 13.3-1))
Type "help" for help.

opendrr=> ALTER DATABASE opendrr SET fsync TO off;
ERROR:  parameter "fsync" cannot be changed now
opendrr=> ALTER DATABASE opendrr SET full_page_writes TO off;
ERROR:  parameter "full_page_writes" cannot be changed now
opendrr=> ALTER DATABASE opendrr SET synchronous_commit TO off;
ALTER DATABASE
opendrr=> 

So, if we want to set fsync and full_page_writes to off too, that will need to be done via #94, pending more benchmark and ways to ensure the database is fully written to the disk at the end of the docker-compose run.

anthonyfok commented 2 years ago

Issue status

synchronous_commit = off has been implemented in the set_synchronous_commit() function in from add_data.sh:

Why am I keeping this issue open:

To revisit in year 2022. :-)