Collect and report metrics from the driver side

PeterCsalaHbo commented 2 years ago

Is this library expose some metrics like

execution time in milliseconds
number of records out
number of bytes out
number of success requests
etc. ?

davecramer commented 2 years ago

Currently there is not

PeterCsalaHbo commented 2 years ago

@davecramer Is there any plan for the near future?

davecramer commented 2 years ago

I have been contemplating it, but time is a challenge.

@PeterCsalaHbo can you provide a wishlist here ?

PeterCsalaHbo commented 2 years ago

Basically the above four would be enough I think.

We can monitor the incoming traffic from the PostgreSQL cluster perspective but we can't monitor it from the application server perspective.

davecramer commented 2 years ago

So the challenge is that while you can get some numbers from the client, you need to know the server numbers to figure out the round trip overhead and the client overhead

PeterCsalaHbo commented 2 years ago

Yes exactly, if we know both the postgres client's and the server's numbers then we can better locate where the problem could reside.

davecramer commented 2 years ago

FYI Postgres is never referred to as Postgre .

PeterCsalaHbo commented 2 years ago

Apologize

PeterCsalaHbo commented 2 years ago

I think I killed this feature request with a typo ...

vlsi commented 2 years ago

Let me reopen this, as I think the feature will be useful.

vlsi commented 2 years ago

There are driver-managed structures (e.g. number of server-prepared statements, number of SQL statement cache misses) that might be interesting to monitor.

The tricky question is what do we use for collecting and reporting the metrics.

Relevant libraries are:

I'm not fond of reimplementing our own metric facade, and I'm not fond of breaking tons of applications with cases like "micrometer version from pgjdbc 42.x is not compatible with the one that comes with Spring Boot 3.y".

Any thoughts on the way we could collect metrics, expose them and avoid classpath clash? Should we shade a metric collection library and expose the values via JMX?

davecramer commented 2 years ago

If we wanted to remain agnostic I would expose listeners. This however would be PostgreSQL specific as the JDBC API has no such facility

sehrope commented 2 years ago

+1 to exposing our own interfaces and let the end user bridge that to their own collector. That would give us the ability to make it match whatever level of low level detail we'd like to provide.

Perhaps we could even mark the interface itself as "experimental" to have the first couple releases maintain the flexibility / luxury of changing the interface if real world usage changes. Breaking things after they're in the field is never a good thing, but for something entirely new like this keeping that option might be worthwhile.

svendiedrichsen commented 2 years ago

IMHO it would be great if the listeners would be notified asynchronously.

davecramer commented 2 years ago

IMHO it would be great if the listeners would be notified asynchronously.

That would mean we would have to be opinionated on what we sent it. The listener can choose to acquire whatever data it wants and then send it on asynchronously to avoid blocking if this is a requirement.

const commented 2 years ago

Please change time units to microseconds. java.util.time.Instant provides that time.

Also, it makes sense to organize measurement as listeners registered with driver or connection, that target output to needed data target: JFR special file or standard api, logging, or some custom output.

I think the following events are needed (time and thread when happened for each event, only additional parameters specified, id should be used to correlate events):

Connection
- open
- close (duration)
- First statement in transaction
- transaction commit success or fail (duration of transaction)
- transaction explicit rollback (duration of transaction)
- autocommit changed
- XA
- Connection joined transaction
- Connection left transaction
Update preparead statement
- prepare (statement, number of aguments)
- Batch execution (batch size, batch number of elements)
- Update started (size and number of input parameters, statement)
- Update finished (size and number of input parameters, number of records, duration)
Call statement
- parepare (statement, number of arguments)
- call started (size and number of input parameters, statement)
- call finished (size and number of input/output parameters, duration)
Query statement
- create (statement, number of agruments)
- query started (size and number of input parameters, statement)
- record received (offset in result, size, number of elements, time of reading record)
- query finished (size and number of input/output parameters, total result set amount of records, total result size, duration)

For non-prepared statement, prepare and start event need to be joined. Size of result set record, call result, and arguments calculated as it is serialized to output + overheads of driver format. So int8 + string of 20 ascii chars will be calculated as 8 (size of long) + 4 (length of text) + 20 (UTF8 byte size).

And it would nice to have a standard custom JFR output stream that records each statement with unique text only once. So log will be small and could be picked from servers later.

davecramer commented 2 years ago

Is there a reason you wouldn't use https://github.com/p6spy/p6spy for this ?

koriit-kontakt commented 9 months ago

Hi, if I could add to the wishlist, I would like to monitor the prepared statement cache - current size and possibly how many prepared statements are being created and/or evicted. The goal is to detect cache trashing, which was the root cause of severe performance issues in one of our databases.

pgjdbc / pgjdbc

Collect and report metrics from the driver side #2478