Open visch opened 2 years ago
I have run the specified tests. The first few rows from the dummy database I used are provided below.
last_name|first_name |street_address |email |index|_sdc_file_name |_sdc_line_number|_sdc_last_modified |_sdc_extracted_at |_sdc_received_at |_sdc_batched_at |_sdc_deleted_at|_sdc_sequence|_sdc_table_version|_sdc_sync_started_at|
---------+-----------+------------------------------+---------------------------+-----+---------------------------------+----------------+-----------------------+-----------------------+-----------------------+-----------------------+---------------+-------------+------------------+--------------------+
Butler |Chelsea |914 Davis Street |laura47@example.com |0 |/git/jsonl2postgres/data/1M.jsonl| 1|2023-11-24 21:44:54.000|2023-11-24 21:46:52.687|2023-11-24 21:46:52.904|2023-11-24 21:46:52.904| |1700862412905| | 1700862412850|
Garrison |Raymond |66496 Pierce Burgs Apt. 300 |peterwoods@example.net |1 |/git/jsonl2postgres/data/1M.jsonl| 2|2023-11-24 21:44:54.000|2023-11-24 21:46:52.687|2023-11-24 21:46:52.905|2023-11-24 21:46:52.904| |1700862412906| | 1700862412850|
Daniel |Nancy |276 Michael Oval |williamsmichele@example.net|2 |/git/jsonl2postgres/data/1M.jsonl| 3|2023-11-24 21:44:54.000|2023-11-24 21:46:52.687|2023-11-24 21:46:52.906|2023-11-24 21:46:52.904| |1700862412906| | 1700862412850|
Combs |Christopher|314 Griffin Loaf |bharding@example.com |3 |/git/jsonl2postgres/data/1M.jsonl| 4|2023-11-24 21:44:54.000|2023-11-24 21:46:52.687|2023-11-24 21:46:52.906|2023-11-24 21:46:52.904| |1700862412907| | 1700862412850|
Perry |Richard |9258 Dawson Manors Apt. 575 |xmoore@example.com |4 |/git/jsonl2postgres/data/1M.jsonl| 5|2023-11-24 21:44:54.000|2023-11-24 21:46:52.687|2023-11-24 21:46:52.907|2023-11-24 21:46:52.904| |1700862412907| | 1700862412850|
Brennan |Erica |94620 Nicholas Avenue Apt. 600|amy59@example.net |5 |/git/jsonl2postgres/data/1M.jsonl| 6|2023-11-24 21:44:54.000|2023-11-24 21:46:52.687|2023-11-24 21:46:52.907|2023-11-24 21:46:52.904| |1700862412907| | 1700862412850|
Molina |Brandon |9311 Juan Mission Suite 872 |veronica29@example.com |6 |/git/jsonl2postgres/data/1M.jsonl| 7|2023-11-24 21:44:54.000|2023-11-24 21:46:52.687|2023-11-24 21:46:52.907|2023-11-24 21:46:52.904| |1700862412908| | 1700862412850|
Pratt |Alyssa |7118 Michael Hollow Suite 103 |snydertiffany@example.net |7 |/git/jsonl2postgres/data/1M.jsonl| 8|2023-11-24 21:44:54.000|2023-11-24 21:46:52.687|2023-11-24 21:46:52.907|2023-11-24 21:46:52.904| |1700862412908| | 1700862412850|
Wood |Robert |884 Gaines Estates |suzanne29@example.org |8 |/git/jsonl2postgres/data/1M.jsonl| 9|2023-11-24 21:44:54.000|2023-11-24 21:46:52.687|2023-11-24 21:46:52.908|2023-11-24 21:46:52.904| |1700862412908| | 1700862412850|
Berry |Kara |20916 Paul Lights Apt. 284 |millergina@example.com |9 |/git/jsonl2postgres/data/1M.jsonl| 10|2023-11-24 21:44:54.000|2023-11-24 21:46:52.688|2023-11-24 21:46:52.908|2023-11-24 21:46:52.904| |1700862412909| | 1700862412850|
time pg_dump -U postgres -t melty.users | psql -U postgres postgres2
10,000
real 0m0.081s
user 0m0.025s
sys 0m0.029s
100,000
real 0m0.294s
user 0m0.064s
sys 0m0.021s
1,000,000
real 0m2.242s
user 0m0.110s
sys 0m0.270s
10,000
real 0m6.899s
user 0m6.167s
sys 0m0.392s
100,000
real 0m37.223s
user 0m42.277s
sys 0m1.705s
1,000,000
real 5m43.368s
user 6m57.207s
sys 0m13.646s
10,000
real 0m3.147s
user 0m1.524s
sys 0m0.099s
100,000
real 0m12.502s
user 0m15.688s
sys 0m1.751s
1,000,000
real 1m38.532s
user 2m24.671s
sys 0m16.911s
I like what you did here, I think we should make some kind of script that we can run to test this. Ideally we'd probably make a whole performance infrastructure to test over time and track but that seems a bit overkill.
Could you make a script python or something, drop it in the ./scripts dir here maybe performance_testing.py
or something that we can run that does all of these tests for us and outputs the results in a similar format.
I want to see what happens with the new performance update you made, and then it'd be nice to be able to "objectively" have folks run the test to see if we improved things or not.
Setup a test for how long it takes to load
Compare these to:
Programmatically would be best here