TechEmpower / FrameworkBenchmarks

Source for the TechEmpower Framework Benchmarks project
https://www.techempower.com/benchmarks/
Other
7.62k stars 1.94k forks source link

Drogon and Postgres Batch Mode? #5502

Open cdbattags opened 4 years ago

cdbattags commented 4 years ago

Looks like there's something fishy going on with drogon's build process seen in line https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/C%2B%2B/drogon/drogon-core.dockerfile#L34 and below?

It contains https://github.com/an-tao/postgres/releases/tag/batch_mode_ubuntu which is by the same author as the drogon framework...? I can't quite trace how exactly this differs but seems questionable?

@an-tao, I don't want to go behind your back bringing this up. I only just want a conversation surrounding what you've done here.

an-tao commented 4 years ago

@cdbattags I use the batch mode patch (optionally) of libpq in drogon. I'm not the auther of it, the repository of postgres is forked from here Batch mode has a higher utilization of connections, so it can provide higher concurrent performance. I think there are many other frameworks that use this mode.

matt-42 commented 4 years ago

It's interessting to see that at these levels of performances, you need to patch the official sql drivers to go further. Did anybody know why they never merged this patch ? and why mysql did not implement it ? I'd like to use it for lithium but recompiling libpq from an outdated fork from 2016 to enable batch mode is tough. I'd prefer the user of my framework to use an up to date libpq that they don't have to recompile.

matt-42 commented 4 years ago

Just to link the libpq batch mode thread. Lots of information about it there:

https://www.postgresql.org/message-id/flat/CAMsr+YFUjJytRyV4J-16bEoiZyH=4nj+sQ7JP9ajwz=B4dMMZw@mail.gmail.com

volyrique commented 4 years ago

Keep in mind that batching is expressly disallowed by the TFB rules (except for the database updates test and only for the updates themselves). However, pipelining is permitted (and used by actix and others) and indeed I firmly believe that implementations that do not use it are simply not competitive (for the top spots) in the database tests. It just so happens that this unofficial fork of libpq exposes both functionalities behind the same interface, but I believe that using batches of one query (in order to leverage pipelining) would not be a violation of the rules, and I hope that's precisely what drogon is doing.

I am also really interested in using this for h2o, but coding against what is essentially a work-in-progress patch for libpq is not my idea of providing a production-grade implementation (another one of the TFB guidelines). However, I wouldn't hold anyone else to the same standard.

matt-42 commented 4 years ago

I won't do it for lithium neither. @volyrique Do you have info about how to pipeline multiple sql requests ? Do you mean we can just call PQsendPrepare several time before starting to call PQgetResult ?

volyrique commented 4 years ago

It's impossible to do it with the upstream libpq - PQsendQueryPrepared() (or any other function) is going to return an error if it's called a second time while the first command hasn't completed. As for the fork - it should be possible according to the documentation, but I haven't bothered figuring out how because until those patches are merged, the interface may change significantly, so I don't see the point.

matt-42 commented 4 years ago

So do you mean that actix is also based on this fork ?

volyrique commented 4 years ago

No, actix, Vert.x and other frameworks that support pipelining don't use libpq (or any variant of) at all, which is expected because if your implementation language is not C, then most probably you would like to avoid calls to a C interface such as libpq; heck, some people feel the same way when coding in C++.

matt-42 commented 4 years ago

So at the end, the only solution to boost c++ frameworks is to start a postgres c++ driver from scratch that support pipelining (I may be too lasy for this...), or just wait that next versions of libpq supports it...

volyrique commented 4 years ago

If you are willing to write a database driver from scratch, you might as well just use the libpq fork - you are going to encounter the same issues. That is unless you really want a proper C++ PostgreSQL interface, in which case starting from scratch is justified (assuming that wrapping the libpq interfaces in classes and so on is not acceptable). The only real solution for me is to wait for the upstream. Since pipelining support has been on the TODO list for a while, I am hopeful that we are going to get it eventually. In the meantime I will just have to live with unimpressive performance numbers.

matt-42 commented 4 years ago

Agree. Anyway without pipelining, perfs are still pretty good.

matt-42 commented 4 years ago

For info I just pinged pgsql-hackers@postgresql.org about this

matt-42 commented 4 years ago

Mariadb may also work on it someday, this may close the gab between mysql and postgresql: https://jira.mariadb.org/browse/CONC-478

matt-42 commented 4 years ago

Seems like the "outdated fork problem" just become a "up to date fork" problem: https://github.com/an-tao/drogon/issues/526 Hope this will land in master soon. I've implemented a pipelined version of lithium, so at least it will be ready when the libpq team merges this patch.

@an-tao you can check my script if you want to upgrade your libpq fork (note 1: that they changed the API function names so you'll have to update your framework, note 2: it gives a nice performance boost :) ): https://github.com/matt-42/FrameworkBenchmarks/blob/lithium-upgrade8/frameworks/C%2B%2B/lithium/compile_clang-pipeline.sh#L11-L18

an-tao commented 4 years ago

@matt-42 Thanks so much. I'll try.

matt-42 commented 4 years ago

I took over the libpq batch mode patch and fixed the remaining issues. The patch is still waiting for review but hopefully it will get released with postgres 14...

https://commitfest.postgresql.org/29/2724/

dantti commented 3 years ago

The biggest issue I've found with pipelining and the reason I gave up caring for the time being is that you must take extra care with transactions. Two HTTP clients cannot share the same connection if one is to send a BEGIN command, but since my API above libpq creates a db pool that won't be hard to keep the db object if one started a transcation. Hoping for this to get in soon :)

matt-42 commented 3 years ago

It's true that it can cause very cryptic bugs if there is a yield (allowing another http connection to send an overlapping transaction) between a begin and a commit. In the current lithium implementation , if the user read the transaction results after committing the transaction, it's not possible that another coroutine send an overlapping transaction.

I finished the patch few weeks ago so it should be merged before the next pgsql release

matt-42 commented 3 years ago

@dantti @volyrique @an-tao @cdbattags the patch seems ok but the postgres team need some insight from potential users of the postgresql batch API. Could any of you have a look and give some feedback ? Thanks!

Here is the last patch: https://www.postgresql.org/message-id/attachment/115511/v23-0001-libpq-batch.patch The page where you can post your feedback (if you post it here I'll forward it): https://commitfest.postgresql.org/30/2724/

LifeIsStrange commented 3 years ago

WOW lipq batch/pipelining support is finally going to be merged this month!! my

feeling is that this is committable, so I'll be looking at getting this pushed early next weeks, barring opinions from others. https://www.postgresql.org/message-id/flat/CAMsr+YFUjJytRyV4J-16bEoiZyH=4nj+sQ7JP9ajwz=B4dMMZw@mail.gmail.com

LifeIsStrange commented 3 years ago

@volyrique you'll be able to use it for h2o!

matt-42 commented 3 years ago

@LifeIsStrange it is already merged in master :) https://github.com/postgres/postgres/commit/acb7e4eb6b1c614c68a62fb3a6a5bba1af0a2659 pinging @an-tao

volyrique commented 9 months ago

Recently I have worked on a further improvement to libpq's pipeline mode, which now has been merged upstream. You might be able to use it in your own code.

cc @pavelmash