Open divega opened 7 years ago
@damageboy @roji are you cool with moving the conversation here? If yes, could you please summarize the current state of things?
Hey @divega, great to see things moving along.
Optimization efforts are currently taking place in Npgsql's dev branch, which will be released as 3.3.
Note that there are several ADO.NET-level API issues that also hurt Npgsql performance somewhat (e.g. https://github.com/dotnet/corefx/issues/25093), I'll post these here as well.
Npgsql builds on appveyor and travis, and CI nuget packages for dev are published to myget. The build is currently a bit unstable, I'll be working to stabilize things more in the coming weeks, but you should already be able to grab packages and test (it's best to check what commit they correspond to though).
@damageboy, maybe it's a good idea to post the latest performance run figures you got.
Hi all,
Here's where we are performance on the latest iteration of the benchmarks:
I ran the benchmarks on two physical machines:
I ran this ugly thing to get an overall glimpse of these machines:
uname -a; cat /proc/cpuinfo | grep "model name" | sort -u; lspci -d 1924: ; ethtool sfn0 | grep Speed; sysctl net.core.rmem_max net.core.wmem_max; echo dotnet --version: $(dotnet --version)
Linux client 4.4.0-96-generic #119~14.04.1-Ubuntu SMP Wed Sep 13 08:40:48 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
model name : Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
03:00.0 Ethernet controller: Solarflare Communications Device 0a03 (rev 02)
03:00.1 Ethernet controller: Solarflare Communications Device 0a03 (rev 02)
Cannot get wake-on-lan settings: Operation not permitted
Speed: 10000Mb/s
net.core.rmem_max = 8388608
net.core.wmem_max = 8388608
dotnet --version: 2.0.0
Linux server 4.4.0-96-generic #119~14.04.1-Ubuntu SMP Wed Sep 13 08:40:48 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
model name : Intel(R) Core(TM) i7-5960X CPU @ 3.00GHz
03:00.0 Ethernet controller: Solarflare Communications Device 0a03 (rev 02)
03:00.1 Ethernet controller: Solarflare Communications Device 0a03 (rev 02)
Cannot get wake-on-lan settings: Operation not permitted
Speed: 10000Mb/s
net.core.rmem_max = 8388608
net.core.wmem_max = 8388608
dotnet --version: 2.0.0
dpkg -l | grep postgres
:
ii pgdg-keyring 2017.1 all keyring for apt.postgresql.org
ii postgresql-9.6 9.6.5-1.pgdg14.04+1 amd64 object-relational SQL database, version 9.6 server
ii postgresql-client 9.6+185.pgdg14.04+1 all front-end programs for PostgreSQL (supported version)
ii postgresql-client-9.6 9.6.5-1.pgdg14.04+1 amd64 front-end programs for PostgreSQL 9.6
ii postgresql-client-common 185.pgdg14.04+1 all manager for multiple PostgreSQL client versions
ii postgresql-common 185.pgdg14.04+1 all PostgreSQL database-cluster manager
ii postgresql-contrib-9.6 9.6.5-1.pgdg14.04+1 amd64 additional facilities for PostgreSQL
I took @sebastienros benchmark code and made some small changes to make it more reporting friendly and added a bunch of benchmarks.
The code is here
Basically there are 4 sync + 4 async benchmark variants in my fork:
Variant | Sync bench name | Async bench name |
---|---|---|
Plain Vanilla | sync | async |
Keep Connection Obj | sync+conn | async+conn |
Keep Connection + Command Objs | sync+conn+cmd | async+conn+cmd |
Keep Connection + Command Objs + Pre-allocate results | sync+conn+cmd+prealloc | async+conn+cmd+prealloc |
What the each benchamrk does per "Transaction" is:
Benchmark | Description |
---|---|
Plain vanilla | Creates a new ADO.NET Connection object, sets the ConnectionString , calls Open{,Async}() the connection, Creates a Command, Sets the CommandText , Calls Prepare() , Calls ExecuteReader{,Async}() , iterates over the results with Read{,Async}() , calling ReadInt32(0) and ReadString(1) on each row |
Keep Connection Obj | Same as above, but the DbConnection obj is allocated outside of the benchmark loop, with its ConnectionString set and Open{,Async}() is called once for the entire benchmark duration |
Keep Connection + Command Objs | Same as above, but in addition the DbCommand obj is allocated outside of the benchmark loop with it's CommandText set and Prepare() called once for the entire benchmark duration |
Keep Connection + Command Objs + Pre-allocate results | Same as a above, but the results list + Poco objects are pre-allocated outside of the main benchmark loop. |
The connection string for the benchmark is pretty straight forward: https://github.com/damageboy/benchmarkdb/blob/b2ec07c7a48781df5772fa60d75562a094d5516d/BenchmarkDb/runme.sh#L11
But worth noting that:
Basically while there are not docker images for these benchmarks for now to run them you roughly need:
wget https://raw.githubusercontent.com/aspnet/benchmarks/dev/docker/postgres-techempower/create-postgres.sql
cat create-postgres.sql | psql -U benchmarkdbuser -H localhost bench
./runme.sh <number-of-seconds>
This will clone Npgsql, compile it, compile the benchmark, hotpatch Npgsql binaries, run the full suite of benchmarks on 1,4,8,16,32,64,128 threads and produce a results.md
+ results.csv
file which I used to produce the charts below...
So if we run roughly everything (Npgsql 3.2.5, 3.3 dev) on 1,4,8,16,32,64,128 threads with all benchmark variants we roughly get the following picture:
Things that stand out:
If we focus more on the latest Npgsql (dev branch) and async: Things that stand out:
Things not in the benchmarks but that I have personally noticed:
GetString(1)
adds another 10%+ to the TPS numbersdo it:
cd benchmarkdb/BenchmarkDb
PGPASSWORD=bench for t in 1 4 8 16 32; do pgbench -h 1.1.1.2 -U benchmarkdbuser -M prepared -j $t -c $t -T 60 -n -f pgbench.sql bench; done
Threads | TPS |
---|---|
1 | 18425.11 |
4 | 74685.26 |
8 | 156070.63 |
16 | 243891.20 |
32 | 268168.44 |
Here are my results on a controlled environment (Azure D3_v2 instances matching TechEmpower cloud environment). The baseline is ran using pgbench. Measures have been done on Windows Server 2016 DC. Load is generated by a separate machine on the same network.
Here I compared the current Nuget package with the Myget version (as of 11/17/2017)
Then after doubling the performance of the database server to see if it is a bottleneck.
@damageboy, @roji and @anpete have been discussing and looking possible changes to Npgsql as well as modifying the benchmark code to try to get better performance(see https://github.com/npgsql/npgsql/commit/c153d2772f9392851922e87b3f989031a152410b, [insert a reference to "whole command caching"], and https://gist.github.com/damageboy/556ece30cf2368212773381ec01a8c06).
The results so far seem very promising.
Creating this issue here so that we can hopefully centralize the conversation about them here (some of it has been going over email, but it would be better for them to be in the open).
It would also be great if we can incorporate any modified benchmark code in one repository (e.g. this one) so anyone can play with it and try new changes. I believe the last code from @damageboy is at https://github.com/damageboy/benchmarkdb.