Open JerrySievert opened 2 years ago
Hi @JerrySievert .. Thank you for the kind words..
I think it's a great idea to add some hooks so this can be achieved. I'll give some thoughts to how that could be achieved best.
fantastic! I look forward to seeing the addition, thanks for considering it.
Hey @JerrySievert @porsager
I ended up taking a stab at instrumenting this library over here: https://github.com/tomsanbear/opentelemetry-instrumentation-postgres
This is really just a result of a few hours trying to grasp this library's internals and finding a hacky way to get the query statement, along with start and end span markers. I'd love some recommendations @porsager if you have better thoughts on how to hook into this library. So far it works, but I wouldn't put it anywhere close to production without some more testing.
@tomsanbear that's awesome, but is there some trick to getting it to work?
const {
PostgresInstrumentation
} = require('opentelemetry-instrumentation-postgres');
and then in my instrumentations when instantiating the sdk:
instrumentations: [
getNodeAutoInstrumentations(),
new HapiInstrumentation(),
new PostgresInstrumentation(),
new NetInstrumentation()
]
is there something else I need to do?
@porsager checking to see if there is any updates on supporting OpenTelemetry?
I'm thinking the following to cover various use cases:
options.stats: true or sql.stats() gives you the following properties on the query result
result.duration // the complete duration of the query
result.waiting // the time the query waited to be sent
result.execution // the actual execution time (query sent to pg until response)
Now that's fine for hooking up or logging if required, but for the opentelemetry case, I'm thinking an option like this:
const sql = postgres({
...,
onrequest() {
// query queued
return () => {
// query sent
return () => // query finished
}
}
})
I've got a branch with that setup here if you want to test it out https://github.com/porsager/postgres/tree/query-stats .. Let me know what you think:
What might be interesting would be to also provide hooks for different lifecycle events of the query;
i.e., for instance, there might be a high "time to connect to postgres" compared to "time of query execution"
That's exactly what the above is? π
Oh, haha, yes, sorry about that; I didn't pay enough attention π€¦ββοΈ Thanks for your library, by the way!
I'm thinking the following to cover various use cases:
options.stats: true or sql.stats() gives you the following properties on the query result
result.duration // the complete duration of the query result.waiting // the time the query waited to be sent result.execution // the actual execution time (query sent to pg until response)
Now that's fine for hooking up or logging if required, but for the opentelemetry case, I'm thinking an option like this:
const sql = postgres({ ..., onrequest() { // query queued return () => { // query sent return () => // query finished } } })
I've got a branch with that setup here if you want to test it out https://github.com/porsager/postgres/tree/query-stats .. Let me know what you think:
@porsager this is perfect! I think these 3 stats are exactly what is needed. Is there any plan to implement this feature?
My approach for manual instrumentation would be to wrap the sql
function returned by postgres()
in another function that first calls tracer.startSpan()
, then awaits the original sql
function, and then calls span.end()
with optional error handling.
Then the trace exporter would receive startTimeUnixNano
and endTimeUnixNano
(as defined in otlp-transformer/src/trace/types.ts).
I have tried to wrap the sql function but there is something I am missing. The code works great as long as the query does not use an embedded ${sql(...)}
in the tmeplate.
Essentially I would like to wrap the function to instrument to AWS Xray, but if I do anything to resolve the promise sql returns then I loosethe ability to return the correct "thing" from my wrapped funciton, requring a rewrite of all the code which tries to call the function.
If anyone has an example wrapper function to be able to instrument the query and time taken that would be amazing!
Did you try the https://github.com/porsager/postgres/tree/query-stats branch mentioned right above instead of trying to create a wrapper @mattsoftware ?
I have attempted to use it to no avail. I added this in my package.json dependencies...
"postgres": "https://github.com/porsager/postgres.git#query-stats"
and then updated my sql options to this...
const sql = postgres({
host,
port: databaseSecrets.port,
database: databaseSecrets.dbname,
username: databaseSecrets.username,
password: databaseSecrets.password,
ssl,
idle_timeout: 3,
max_lifetime: 60 * 15,
onrequest: () => {
console.log(
"======================= SQL REQUEST queued ======================="
);
return () => {
console.log(
"======================= SQL REQUEST sent ======================="
);
return () => {
console.log(
"======================= SQL REQUEST finished ======================="
);
};
};
}
});
expecting to see those console.logs happen during and after the sql call. No errors, but nothing logged either. If you can hint as to what I am missing that would be fantastic as I think this would be perfect for me to instrument my sql calls!
Thanks.
Hi @porsager First thanks for the work on this library, it's awesome!
Do you have plans to merge the query-stats branch? We would love to also try to instrument our DB calls with it.
One of the nice things about the postgres
package is that it has 0 dependencies π but if you don't mind adding a dependency on opentelemetry/api
package (which has 0 dependencies too) I can create a PR that adds instrumentation directly to the library exporting not only tracing but also exception events and prometheus-style metrics (like query duration histograms, table read/write counters, connection stats, etc...).
opentelemetry is the de-facto standard in instrumentation and is vendor neutral. the opentelemetry/api
package allows to instrument code that only gets activated when the hosting application registers concrete telemetry collectors by using the opentelemetry/sdk
package and is a no-op otherwise.
I believe this can bring a lot of value to users of the library since they will get meaningful instrumentation by default. Telemetry is often a component of an application that feature-developers are less involved with as they are building the system and in many cases it is added without changing anything in the application's code, see for example the auto-instrumentation of node.js applications. By supporting opentelemetry in the library we can allow these teams to receive great instrumentation out of the box.
Below is an image of one of our traces, I'm attaching it here to give some visual context that will explain how valuable this is. In the image is a request traveling through our different services and modules, as of now we cannot see the DB requests since we had no way to instrument postgres
but we imagine how insightful it would be if we cloud π
@neerfri What a generous offer. I hope it will be accepted by the library author. In the meantime, are you willing to maintain a fork of this library to get tracing implemented? I would most certainly use the fork! πββοΈ
@tobiasmuehl Thanks for your kind words and support β€οΈ Let's wait a few days for @porsager to respond to see if the general direction he wishes to rake is to integrate opentelemetry in the library or expose an instrumentation hook and integrate with the library using a wrapper, I would not want us to be working of a fork if the direction is not to integrate it via the library.
@neerfri Thanks for the kind words βΒ Yes, the onrequest
feature will be added once I find time to do more work on Postgres.js, and I hope the design is specifically so that it's very easy to add something like opentelemetry simply by having an eg postgres-opentelemetry
module that you pass to onrequest.
I see absolutely no need for including it as a dependency here if a standalone module can work fine using onrequest?
Would be cool to see how it's working for you using the query-stats branch.
I took at stab for our internal use at this. TL;DR is the onquery
API looks fine to me, and works as intended :+1:
After that, I added the following very crude code to our connection options:
import { SpanKind, SpanOptions, SpanStatusCode, Tracer, context } from "@opentelemetry/api";
import { hrTime, hrTimeDuration, hrTimeToMilliseconds } from "@opentelemetry/core";
import { SEMATTRS_DB_STATEMENT, SEMATTRS_NET_PEER_NAME } from "@opentelemetry/semantic-conventions";
type QueuedQuery = unknown;
interface Query extends PromiseLike<unknown> {
string: string;
}
type QueryResults = unknown;
type QueryQueuedHandler = (q: QueuedQuery) => QueryStartedHandler | void;
type QueryStartedHandler = (q: Query) => QueryResultHandler | void;
type QueryResultHandler = (results: QueryResults) => void;
function tracingQueryHandler(tracer: Tracer, serverName: string): QueryQueuedHandler {
return () => {
const startTime = hrTime();
return (q) => {
const queuedTimeMs = hrTimeToMilliseconds(hrTimeDuration(startTime, hrTime()));
const spanOptions: SpanOptions = {
kind: SpanKind.CLIENT,
attributes: {
[SEMATTRS_NET_PEER_NAME]: serverName,
[SEMATTRS_DB_STATEMENT]: q.string,
"db.postgresjs.queued_time_ms": queuedTimeMs,
},
};
const ctx = context.active();
const span = tracer.startSpan("query", spanOptions, ctx);
q.then(
() => {
span.setStatus({ code: SpanStatusCode.OK });
span.end();
},
(e) => {
span.recordException(e);
span.setStatus({ message: e.message, code: SpanStatusCode.ERROR });
span.end();
}
);
};
};
}
my DB connection options now look like this:
import { trace } from "@opentelemetry/api";
const tracer = trace.getTracer("postgres"); // change the name to whatever makes sense for you
const opts = {
// other options...
onquery: tracingQueryHandler(tracer, `${hostname}:${port}`),
}
Here's an example of trace produced by the above, including only the relevant fields:
{
"parent_span_id": "3cb06a353a1e27db",
"scope_name": "postgres",
"service_name": "v2",
"span_attributes": {
"db.postgresjs.queued_time_ms": 0.054407,
"db.statement": "select \"space_id\", \"id\", \"parent_id\", \"name\", \"creator_id\", \"create_time\", \"update_time\", \"delete_time\" from \"space_folders\" where (\"space_folders\".\"space_id\" = $1 and \"space_folders\".\"parent_id\" = $2 and \"space_folders\".\"delete_time\" is null) order by \"space_folders\".\"name\"",
"net.peer.name": "127.0.0.1:25432"
},
"span_duration_millis": 11,
"span_end_timestamp_nanos": 1711451792875317000,
"span_fingerprint": "v2\u00003\u0000query",
"span_id": "b2a78b4cea9fc7fc",
"span_kind": 3,
"span_name": "query",
"span_start_timestamp_nanos": 1711451792864000000,
"span_status": {
"code": "ok"
},
"trace_id": "6fe69188c84b104b6229d78b679da79b"
}
Disclaimer: my understanding of OpenTelemetry is quite limited, so expect nonsense in all of the above.
Patch that can be applied using pnpm patch or similar:
From b1b1c03b46350b667286dc1f77a7ee104176881e Mon Sep 17 00:00:00 2001
From: Rasmus Porsager <rasmus@porsager.com>
Date: Mon, 26 Jun 2023 00:39:02 +0200
Subject: [PATCH] Add onquery handler
---
src/connection.js | 1 +
src/index.js | 2 ++
src/query.js | 30 +++++++++++++++++++++++++-----
3 files changed, 28 insertions(+), 5 deletions(-)
diff --git a/src/connection.js b/src/connection.js
index 7d97a4b..9f1989f 100644
--- a/src/connection.js
+++ b/src/connection.js
@@ -165,6 +165,7 @@ function Connection(options, queues = {}, { onopen = noop, onend = noop, onclose
: (query = q, query.active = true)
build(q)
+ q.onquery && (q.onquery = q.onquery(q))
return write(toBuffer(q))
&& !q.describeFirst
&& !q.cursorFn
diff --git a/src/index.js b/src/index.js
index 0573e2b..97de5a3 100644
--- a/src/index.js
+++ b/src/index.js
@@ -85,6 +85,7 @@ function Postgres(a, b) {
function Sql(handler) {
handler.debug = options.debug
+ handler.onquery = options.onquery
Object.entries(options.types).reduce((acc, [name, type]) => {
acc[name] = (x) => new Parameter(x, type.to)
@@ -491,6 +492,7 @@ function parseOptions(a, b) {
onclose : o.onclose,
onparameter : o.onparameter,
socket : o.socket,
+ onquery : o.onquery,
transform : parseTransform(o.transform || { undefined: undefined }),
parameters : {},
shared : { retries: 0, typeArrayMap: {} },
diff --git a/src/query.js b/src/query.js
index 0d44a15..bf1c087 100644
--- a/src/query.js
+++ b/src/query.js
@@ -13,6 +13,9 @@ export class Query extends Promise {
reject = b
})
+ this.resolver = resolve
+ this.rejecter = reject
+
this.tagged = Array.isArray(strings.raw)
this.strings = strings
this.args = args
@@ -23,19 +26,29 @@ export class Query extends Promise {
this.state = null
this.statement = null
- this.resolve = x => (this.active = false, resolve(x))
- this.reject = x => (this.active = false, reject(x))
-
this.active = false
this.cancelled = null
this.executed = false
this.signature = ''
+ this.onquery = handler.onquery;
- this[originError] = this.handler.debug
+ this[originError] = handler.debug
? new Error()
: this.tagged && cachedError(this.strings)
}
+ resolve(x) {
+ this.active = false
+ this.onquery && (this.onquery = this.onquery(x))
+ this.resolver(x)
+ }
+
+ reject(x) {
+ this.active = false
+ this.onquery && (this.onquery = this.onquery(x))
+ this.rejecter(x)
+ }
+
get origin() {
return (this.handler.debug
? this[originError].stack
@@ -137,7 +150,13 @@ export class Query extends Promise {
}
async handle() {
- !this.executed && (this.executed = true) && await 1 && this.handler(this)
+ if (this.executed)
+ return
+
+ this.executed = true
+ await 1
+ this.onquery && (this.onquery = this.onquery(this))
+ this.handler(this)
}
execute() {
@@ -171,3 +190,4 @@ function cachedError(xs) {
Error.stackTraceLimit = x
return originCache.get(xs)
}
+
--
2.44.0
hmm it seems though that in a "real" setup not all queries are logged... Not sure yet where the issue is :thinking:
we overwrite handler.onquery
, but the handler is shared among many queries
I updated the patch above to copy the onquery
handler on each Query
instance, that query can then do whatever it wants with it.
Dear @porsager,
Are there any plans to merge the https://github.com/porsager/postgres/tree/query-stats soon?
Thanks, Anton
sorry no plans right now
thanks @abustany
the patch, is working great (used yarn patch from yarn 4), applied on postgres
3.4.4, I had to change q.string
to q.strings.join("?")
here is my yarn patch: https://codeberg.org/devthefuture/modjo/src/branch/master/.yarn/patches/postgres-npm-3.4.4-90e4c53e1e.patch
and here is the implementation: https://codeberg.org/devthefuture/modjo/src/branch/master/plugins/postgres/tracing.js
Anything we can do to help move this forward?
hi,
wondering if you've considered either adding opentelemetry support directly, or adding hooks for pre/post query execution?
just looking for a good way to get telemetry into your awesome library - thanks!