vitaly-t / pg-promise

PostgreSQL interface for Node.js
https://vitaly-t.github.io/pg-promise
MIT License
3.45k stars 217 forks source link

Prepend/append a statement before/after each query #100

Closed jmealo closed 8 years ago

jmealo commented 8 years ago

I would like to prepend/append a statement to each query issued that sets and then clears a GUC variable containing the session_id of the logged in web user.

If the use case is unclear, it's to enable the following pattern: https://gist.github.com/jmealo/4dc1c59b8d31009f9262

vitaly-t commented 8 years ago

If it is ok for you to throw in an extra variable into each query, you can add your own Custom Formatting Type that would simplify the query addition, plus do the extra stuff you want.

Or else, if you do not want to modify queries and want it to happen for every single query, you can use event extend to introduce your own query method that can pre-format each query.

And this wouldn't be a feature request, you can accomplish it easily within the existing version of the library.

vitaly-t commented 8 years ago

@jmealo Note to you, since we last discussed this, a new feature was added to the library, which may also come of help, depending on which way you will decide to do it: as.format has been extended with parameter options.

P.S. Anyhow, I'm done for today, till tomorrow now)

vitaly-t commented 8 years ago

@jmealo No feedback?

jmealo commented 8 years ago

I'll check this out tomorrow... Thanks!

vitaly-t commented 8 years ago

@jmealo Anything? You can re-open it when you have something.

jmealo commented 8 years ago

@vitaly-t I do not want to add a query parameter to each query.

My issue is, on each incoming request to my API, I attach a pg-promise insance. I will need each request to prepend/append session data related to the request object. I need to have that in scope and continue to share the connection pool.

Is this possible?

Just know that this is a pattern in Koa, not something exotic that I'm doing in express or with bare the core http server.

jmealo commented 8 years ago

@vitaly-t I cannot re-open an issue in this repo; I can just comment.

Additionally, I do recall that I tried to "fix" this by wrapping all the pg-promise methods to perform this and ran into an issue that the object did not allow modifications to be made.

For performance reasons, I'd really prefer not to wrap this in a Proxy, but it would be a work around.

vitaly-t commented 8 years ago

I gave you 2 possible approaches earlier. Have you tried them?

jmealo commented 8 years ago

Yes, I tried them last time that we discussed this, they lack the necessary scope to access the session object corresponding to the request. I sunk a few days into this the last time we discussed it couldn't find a way to do it via monkey patching; I would need to fork the library and make changes directly.

I was hoping now that I introduced a concrete use case that shows the state-fulness and needed context.

vitaly-t commented 8 years ago

they lack the necessary scope to access the session object corresponding to the request

When you add your own method via event extend it can take the session parameters and append/pre-prend the extra request you need.

And such methods always execute within the current connection session.

jmealo commented 8 years ago

extend is global though.

My node app has multiple pg-promise instances which maintain their own connection pool. Each instance connects as a different db role with a SEARCH_PATH and permissions clamped down to that tenant/schema.

Incoming HTTP requests are handled asyncronously, so even if I could extend and unextend the session context, the context sent to the service would always be the most recent request, not necessarily the one making the queries.

Does that make sense?

It looks like it may be possible within a task or transaction.

vitaly-t commented 8 years ago

Node.js enforces singularity of each modules loaded via require. It doesn't matter how many times you call it, you will end up with the same instance of the module. It is the db instance that would differ, if you are trying to represent different databases. And event extend will be called for each of the db instances that you have separately.

jmealo commented 8 years ago

I use generators that sleep when waiting on IO. Imagine an API endpoint that makes a SQL query, then a REST request to an external HTTP API followed by a second SQL query.

While that external API response is pending, other incoming HTTP requests to my API will use that pg-promise instance while the generator is sleeping because the SQL connection has been returned to the pool. (Correct?)

How I picture working around this: The session data would be SET to a custom GUC variable in the prepend and that GUC would be cleared/reset in the append. This is just for safety, it should be fine to only prepend.

In summary: Multiple requests share the pg-promise instance at the same time. It is not an instance per request. It is an instance per schema/tenant. The same pg-promise instance can be returned to the connection pool multiple times during a request.

That's why I need a way to always pass in the session context with each query, or somehow wrap it, so the context.pgp object which points to the pg-promise instance that request should be using will always set the GUC variables for that user to make sure audit logging, query logs, and row-level security rules match that user's web application session.

I'm not opposed to a hack, but I'm unsure where I can fit it in.

jmealo commented 8 years ago

Some possible solutions:

Eager to understand from someone who understands the internals the best way to accomplish this as it will literally impact every call to pg-promise in my app.

The query event seems to late in the pipeline as the final SQL string is already ready to process, so there would be no pass in the session.

vitaly-t commented 8 years ago

If I could pass in the request context into each query method and then extend it to look for that property and append/prepend to the query that could work.

So add such method via event extend and use it. I don't see the problem or relevance of the HTTP that you describe. As far as pg-promise is concerned, this is all very generic.

jmealo commented 8 years ago

How would I extend the event get the entire request object if I don't want to pass the request object as a query parameter (not a parameter to a query method, that would be fine, I don't want to pass it as an actual value that would go in as $1 in the SQL)?

All pre-defined methods and properties are read-only, so you will get an error, if you try overriding them.

If I can't redefine the extend before every single query that I execute, multiple times per request, I don't see how extend lets me do what I want to do? I don't see how I can keep scope to the request object since multiple requests use the pg-promise instance concurrently.

The lifecycle of my application is not one pg-promise instance per request.

Request middleware supplements each request context with a pre-initialized pg-promise instance. One is not created for each request.

jmealo commented 8 years ago

It looks like I could do it with QueryFile but it's locked down for internal use by the library. That's how most of my ideas/hacks have been thwarted.

vitaly-t commented 8 years ago

Your 2 options:

// with extra parameter:
var options = {
    extend: obj=> {
        obj.myMethodWithParam = (query, values, context)=> {
            // append/pre-pend query parameters, using the context;
            return obj.query(query, values);
        }
    }
};
// without extra parameter:
var options = {
    extend: function(obj) {
        obj.myMethodWithoutParameter = function(query, values) {
            // append/pre-pend query parameters, using either
            // a global context, or 'this' context.
            return obj.query(query, values);
        }
    }
};
jmealo commented 8 years ago

@vitaly-t: Thank you for the example. I don't think that I can do it without the extra parameter because we're re-using the instance and cannot call extend multiple times, but the extra parameter should work if I do this.

I'm going to try wrapping every query method and that should work... not sure if it'll work with tasks and transactions though.

Let me try that and I'll report back.

vitaly-t commented 8 years ago

If you want it to work through tasks and transactions, then perhaps relying on this parameter is better.

Alternatively, you can pass the context into task/transaction easily as either tag or this context.

jmealo commented 8 years ago

Sorry, I don't follow, what would that look like?

vitaly-t commented 8 years ago
// passing context via tag:
db.task(myTag, function (t) {

});

// passing context via this:
db.call.task(myContext, function (t) {
   // t.ctx.context = the calling context;
    var context = t.ctx.context;
});
jmealo commented 8 years ago

@vitaly-t Okay, that last bit makes sense. I'll give this a shot. Thanks

jmealo commented 8 years ago

I ended up just wrapping the basic query methods for now. I'm using my own transactions now (I hope that's sanctioned)... so I haven't had to use any of the built-ins for tx/tasks.

'use strict';

function PgpWrapper(pgp, ctx) {
    this.pgp = pgp;
    this.ctx = ctx;
}

/* This wrapper is verbose and minimalistic in order to optimise for:
    1. Garbage collection [each instance is just two references with methods on prototype, not each instance]
    2. Execution speed / V8 Optimization [ES6/ES5 meta-programming features are slow. V8 can optimise simple objects.
    3. Code auto-completion [explicitly defining the methods should work with most IDEs, looping over an array may not]
*/

PgpWrapper.prototype.none = function(query, values) {
    return this.pgp.none(this.ctx.guc() + query, values);
};

PgpWrapper.prototype.one = function(query, values) {
    return this.pgp.one(this.ctx.guc() + query, values);
};

PgpWrapper.prototype.many = function(query, values) {
    return this.pgp.many(this.ctx.guc() + query, values);
};

PgpWrapper.prototype.oneOrNone = function(query, values) {
    return this.pgp.oneOrNone(this.ctx.guc() + query, values);
};

PgpWrapper.prototype.manyOrNone = function(query, values) {
    return this.pgp.manyOrNone(this.ctx.guc() + query, values);
};

PgpWrapper.prototype.any = function(query, values) {
    return this.pgp.any(this.ctx.guc() + query, values);
};

PgpWrapper.prototype.result = function(query, values) {
    return this.pgp.result(this.ctx.guc() + query, values);
};

module.exports = PgpWrapper;
vitaly-t commented 8 years ago

This looks quite inefficient, and won't work within tasks.

I believe you said that if I use ES6 multi-line strings that I cannot use your custom goodies.

I don't believe I ever said that, as it's not true.

jmealo commented 8 years ago

@vitaly-t: I'm fairly certain that I need to wrap pg-promise for this to work as a shared instance of pg-promise between requests is not concurrency safe (if multiple requests are yielding).

The only way I can see getting around that is to pass state in somehow using custom formatting, which I'm still trying to wrap my head around.

If I want to do this 100% transparently, I believe wrapping is the way to go.

To make this work inside of tasks, could I just need to wrap this inside the task to also use the wrapped methods? I haven't read the code yet.

There's a low-level query method that'd be ideal to wrap but monkey patching isn't concurrency safe.

From a meta-programming approach, you'd use a Proxy that traps all function calls, swaps out query with a patched version that prepends the query no matter how pg-promise uses it (this would work within tasks). I think this could be concurrency safe because we're guaranteed run to completion within the trap but I haven't given it careful consideration as I don't think it would be performant/advisable.

jmealo commented 8 years ago

@vitaly-t:

I was able to wrap the task method, however, as you mentioned, it doesn't look like tx will be wrappable.

My testing included the queries being wrapped. I'm not sure if task behaves as documented when wrapped like this.

It might be a good idea for me to try to modify the unit tests to run with my wrapped version to check for regressions possibly.

'use strict';

require('generator-bind').polyfill();

function PgpWrapper(pgp, ctx) {
    this.pgp = pgp;
    this.ctx = ctx;
}

/* This wrapper is verbose and minimalistic in order to optimise for:
    1. Garbage collection [each instance is just two references with methods on prototype, not each instance]
    2. Execution speed / V8 Optimization [ES6/ES5 meta-programming features are slow. V8 can optimise simple objects.
    3. Code auto-completion [explicitly defining the methods should work with most IDEs, looping over an array may not]
*/

PgpWrapper.prototype.none = function(query, values) {
    return this.pgp.none(this.ctx.guc() + query, values);
};

PgpWrapper.prototype.one = function(query, values) {
    return this.pgp.one(this.ctx.guc() + query, values);
};

PgpWrapper.prototype.many = function(query, values) {
    return this.pgp.many(this.ctx.guc() + query, values);
};

PgpWrapper.prototype.oneOrNone = function(query, values) {
    return this.pgp.oneOrNone(this.ctx.guc() + query, values);
};

PgpWrapper.prototype.manyOrNone = function(query, values) {
    return this.pgp.manyOrNone(this.ctx.guc() + query, values);
};

PgpWrapper.prototype.any = function(query, values) {
    return this.pgp.any(this.ctx.guc() + query, values);
};

PgpWrapper.prototype.result = function(query, values) {
    return this.pgp.result(this.ctx.guc() + query, values);
};

PgpWrapper.prototype.task = function (p1, p2) {
    p1 = p1.bind({}, this);

    if (p2) {
        p2 = p2.bind({}, this);
    }

    return this.pgp.task(p1, p2);
};

module.exports = PgpWrapper;
vitaly-t commented 8 years ago

This approach will not work with tasks and transactions. Event extend is the only correct way to extend the protocol.

I never really understood the exact nature of the problem you were having in using event extend, and why the context didn't work for your there.

If you want my help in this, write a simplified app with a very simple concept of what you are doing and why it doesn't work the way you expected. Then I may be able to point you in the right solution.

Otherwise, we won't make any progress here.

jmealo commented 8 years ago

I'm totally for making a sample application, however, I don't have a way of even moving forward if extend is the only correct way to do this.

I do not want to extend the protocol, I need every query to at least once (more than once is ok) issue a SET statement. The contents of the SET statement is dependent on the incoming HTTP request's session information, which must be in scope. In order to be concurrency safe we cannot mutate a single shared copy of pg-promise.

Questions about extend:

  1. In the documentation, the extend example defines an options object. I assume that this is to pass to the pg-promise constructor on a per-instance basis. In which case, I would need a new pg-promise instance per incoming request because I need to be bound to the context of my request.
  2. How do I use extend to override the built-in methods so that all use of pg-promise? I'm not seeing how this would work.

I also do not see how extend allows me to override the behavior of the built-in methods.

The only thing I see that seems even remotely possible would be to make an SqlQuery type with a String prototype to keep a reference to the request. Then a custom formatter could pull the request off and prepend/prefix the query?

vitaly-t commented 8 years ago

In the documentation, the extend example defines an options object. I assume that this is to pass to the pg-promise constructor on a per-instance basis.

No. This refers to the global, once-off Initialization Options object for the entire library.

How do I use extend to override the built-in methods so that all use of pg-promise?

You do not override anything predefined in the protocol, you add new methods and properties that can extend the existing ones with extra or specific functionality.

I also do not see how extend allows me to override the behavior of the built-in methods.

I doesn't override anything, it extends.

vitaly-t commented 8 years ago

@jmealo I had another look at how this library handles its own global instance, or to put otherwise - how to support more more one instance.

I found some problems here, which may be in part what you were talking about. So, I have started working on it - to make it better, to support multiple instances properly.

This seems like a big change, but an important one.

I have just finished doing it for spex, which is used by pg-promise. That library has (had) the same problems, being unable to properly handle its own multi-instance mode.

vitaly-t commented 8 years ago

Update spex v. 0.4.2 has been released, with proper multi-instance support.

Now I can start doing it for pg-promise, with the purpose to completely isolate separately initialized instances of the library.

vitaly-t commented 8 years ago

@jmealo if you want to follow this up, I have created https://github.com/vitaly-t/pg-promise/issues/114

jmealo commented 8 years ago

@vitaly-t: I'm interested in getting this to work. I'm not sure if multiple instances would help me personally.

I would like to commend you for being so diligent in following up and trying to understand my use case. You're a real standup guy :+1:.

vitaly-t commented 8 years ago

All updates are here: https://github.com/vitaly-t/pg-promise/issues/114

Cheers! :)

vitaly-t commented 8 years ago

@jmealo this may or may not help you in your project, but version 3.8.0 received support for the database context, which further improves multi-database or status-full databases.

extend is global though.

It is not that global anymore, as the database context is now available there, same as everywhere else.