lpsmith / postgresql-libpq

Low-level Haskell bindings for libpq
BSD 3-Clause "New" or "Revised" License

64 bit large object support #10

Open ghost opened 11 years ago

ghost commented 11 years ago

Do you have a plan to update postgresql-libpq to support the upcoming 64 bit large objects?

cf: http://www.postgresql.org/message-id/E1TKeMq-0007FK-J5@gemulon.postgresql.org

lpsmith commented 11 years ago

Thanks for the heads up. Probably, eventually.

Though honestly my feeling is that I'd eventually like to deprecate the current bindings to the lo_ functions, and instead implement large object support in higher-level bindings using execParams instead, especially once execParams is implemented with non-blocking C calls and the IO manager instead of blocking C calls.

lpsmith commented 11 years ago

Here's another relevant reference from pg-hackers, although I had kind of figured it out between my own investigations and a conversation on #postgresql.

ghost commented 11 years ago

Interesting. Will the same seek and read functionality still exist in the hypothetical execParams binding? Also, do you foresee any added overhead or reduced speed for large objects? I ask because I have multi-gigabyte media stored and accessed via the lo interface, and at present I can stream it at remarkable speeds through Yesod. I really can't afford to lose that.

On Tue, 2013-08-20 at 16:57 -0700, Leon P Smith wrote:

    Thanks for the heads up. Probably, eventually.

    Though honestly my feeling is that I'd eventually like to deprecate the current bindings to the lo_ functions, and instead implement large object support in higher-level bindings using execParams instead, especially once execParams is implemented with non-blocking C calls and the IO manager instead of blocking C calls.


lpsmith commented 11 years ago

Yes, the same seek and read functionality will still exist in a hypothetical execParams-based large object binding. Although this doesn't appear to be documented particularly well, the seek pointer and all the lo_* functions are server-side constructs. (The exceptions are lo_import and lo_export, which behave differently on the client side than on the server side.)

Out of curiosity, how are you currently using the lo functionality from Haskell? Are you using postgresql-simple's binding, postgresql-libpq's binding, or some other way?

As for overhead and speed, that's not obvious to me at this point. Using non-blocking C calls and the IO manager should save some context-switching overhead on the client side, and probably scale better in highly concurrent environments, as the GHC runtime would be responsible for scheduling green threads, instead of the kernel being responsible for scheduling kernel threads.
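To make "non-blocking C calls and the IO manager" concrete, here is a rough sketch, not the planned implementation. It assumes postgresql-libpq's `sendQueryParams`, `socket`, `consumeInput`, `isBusy` and `getResult`, plus `threadWaitRead` from base; the name `execParamsNB` is made up for illustration, and output flushing and error reporting are left out.

```haskell
import           Control.Concurrent (threadWaitRead)
import           Data.ByteString (ByteString)
import qualified Database.PostgreSQL.LibPQ as PQ

-- | Hypothetical non-blocking variant of execParams: send the query, then
-- park the green thread in the GHC IO manager until the socket is readable,
-- instead of blocking an OS thread inside a C call.
execParamsNB
  :: PQ.Connection
  -> ByteString
  -> [Maybe (PQ.Oid, ByteString, PQ.Format)]
  -> PQ.Format
  -> IO [PQ.Result]
execParamsNB conn sql params fmt = do
  sent <- PQ.sendQueryParams conn sql params fmt
  if not sent then fail "sendQueryParams failed" else collect
  where
    -- Gather results until getResult signals the end of the query.
    collect = do
      waitUntilReady
      mres <- PQ.getResult conn
      case mres of
        Nothing  -> return []
        Just res -> (res :) <$> collect

    -- Loop until getResult would no longer block.
    waitUntilReady = do
      busy <- PQ.isBusy conn
      if not busy
        then return ()
        else do
          mfd <- PQ.socket conn
          case mfd of
            Nothing -> fail "connection has no socket"
            Just fd -> do
              threadWaitRead fd            -- yields to the GHC scheduler
              ok <- PQ.consumeInput conn   -- pull available bytes into libpq
              if ok then waitUntilReady else fail "consumeInput failed"
```

While one green thread waits in `threadWaitRead`, the runtime can schedule others on the same OS thread, which is where the hoped-for savings in context switching would come from.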

On the other hand, there is the overhead added on the client side by Haskell, which I can't really quantify at this point, and the overhead added on the backend because the server would be parsing the function name each time. In that regard, PQfn just sends a binary function oid, while PQexecParams would be sending select loread($1,$2) each time. However, the fact that the pg folks have all but deprecated the fast-path function call protocol, and haven't added a non-blocking interface to it in libpq, suggests to me that they don't think this is that big of a deal. Whether they are correct in that assessment for your purposes remains to be seen.
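For illustration, reading a large object through execParams rather than PQfn might look roughly like the sketch below. It assumes the server-side functions lo_open, loread and lo_close, and the INV_READ flag value 0x40000 (262144). The helper names `runOne` and `streamLo` and the 64 KiB chunk size are made up for this example, and exception safety (rolling back on failure) is omitted.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Control.Monad (unless)
import           Data.ByteString (ByteString)
import qualified Data.ByteString as B
import qualified Database.PostgreSQL.LibPQ as PQ

-- | Hypothetical helper: run a query that returns exactly one value.
-- Parameters are sent as text with Oid 0, so the server infers their types.
runOne :: PQ.Connection -> ByteString -> [ByteString] -> PQ.Format -> IO ByteString
runOne conn sql args fmt = do
  mres <- PQ.execParams conn sql [Just (PQ.Oid 0, a, PQ.Text) | a <- args] fmt
  case mres of
    Nothing  -> fail "execParams returned no result"
    Just res -> do
      st <- PQ.resultStatus res
      case st of
        PQ.TuplesOk -> do
          mv <- PQ.getvalue' res (PQ.toRow (0 :: Int)) (PQ.toColumn (0 :: Int))
          maybe (fail "unexpected NULL") return mv
        _ -> fail ("query failed: " ++ show st)

-- | Hypothetical streaming read: feed a large object to 'sink' chunk by chunk.
-- 'loid' is the large object's oid in decimal text, e.g. "16405".
streamLo :: PQ.Connection -> ByteString -> (ByteString -> IO ()) -> IO ()
streamLo conn loid sink = do
  _  <- PQ.exec conn "BEGIN"   -- descriptors only live inside a transaction
  fd <- runOne conn "select lo_open($1, $2)" [loid, "262144"] PQ.Text  -- INV_READ
  let loop = do
        -- Binary result format returns the raw bytea payload.
        chunk <- runOne conn "select loread($1, $2)" [fd, "65536"] PQ.Binary
        unless (B.null chunk) (sink chunk >> loop)
  loop
  _ <- runOne conn "select lo_close($1)" [fd] PQ.Text
  _ <- PQ.exec conn "COMMIT"
  return ()
```

Each chunk costs one parsed SQL statement here, which is exactly the backend overhead in question; a larger chunk size would amortize it over more bytes.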

So... would you be willing to help perform performance testing once I get around to working on these issues? (Though be warned, this isn't really a priority for me at the moment.) Also, if you find that the 64-bit functionality immediately important, I'm certainly willing to entertain pull requests.

Though if you end up writing bindings to the libpq functions in postgresql-simple, there are some unresolved issues regarding versioning that I have yet to work out to my satisfaction. Namely, I'd like to follow the PVP, and there is some newer functionality in libpq that I'd really like to make available (like PQescapeIdentifier and PQsetSingleRowMode), but I'd also like to support older versions of libpq. There is a little bit of discussion in #8 relating to this issue.

ghost commented 11 years ago

On Wed, 2013-08-21 at 15:07 -0700, Leon P Smith wrote:

    Out of curiosity, how are you currently using the lo functionality from Haskell? Are you using postgresql-simple's binding, postgresql-libpq's binding, or some other way?

Straight up postgresql-libpq. Speedy bugger that one is. I wrap multiple connections using Data.Pool, and then the actual transactions are wrapped in conduits to ensure that all the toys get put away if an exception occurs. It's Yesod's idiom for such things, and it works nicely.

    So... would you be willing to help perform performance testing once I get around to working on these issues? (Though be warned, this isn't really a priority for me at the moment.) Also, if you find that the 64-bit functionality immediately important, I'm certainly willing to entertain pull requests.

Cool. Will do. I also need to do some poking in postgresql-simple to fix the problem of schema names getting mangled into table names during the quoting process. Yesod uses postgresql-simple and it's rather unpleasant to discover that schema designations are mishandled.

    Though if you end up writing bindings to the libpq functions in postgresql-simple, there are some unresolved issues regarding versioning that I have yet to work out to my satisfaction. Namely, I'd like to follow the PVP, and there is some newer functionality in libpq that I'd really like to make available (like PQescapeIdentifier and PQsetSingleRowMode), but I'd also like to support older versions of libpq. There is a little bit of discussion in #8 relating to this issue.

Gotcha!

lpsmith commented 11 years ago

    Cool. Will do. I also need to do some poking in postgresql-simple to fix the problem of schema names getting mangled into table names during the quoting process. Yesod uses postgresql-simple and it's rather unpleasant to discover that schema designations are mishandled.

Of course, I don't exactly understand what you are trying to do here, but this doesn't particularly sound like a postgresql-simple issue. Are you using persistent on top of postgresql-simple?

ghost commented 11 years ago

I believe it was related to this: https://github.com/lpsmith/postgresql-simple/issues/65

It could also be a Persistent problem.

On Thu, 2013-08-22 at 07:00 -0700, Leon P Smith wrote:

    Cool. Will do. I also need to do some poking in
    postgresql-simple to fix
    the problem of schema names getting mangled into table names
    during the
    quoting process. Yesod uses postgresql-simple and it's rather
    unpleasant
    to discover that schema designations are mishandled.

    Of course, I don't exactly understand what you are trying to do here, but this doesn't particularly sound like a postgresql-simple issue. Are you using persistent on top of postgresql-simple?


lpsmith commented 11 years ago

Well, postgresql-simple doesn't have any particular support for parameterizing table/schema names. That requires dynamic SQL generation of a sort you'd have to perform yourself.

This is something I'd like to fix, but doing it properly does require PQescapeIdentifier. The issue you linked to is not correct, and the workaround I suggested was just a quick (and insecure) hack.
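As a minimal illustration of the quoting problem, the sketch below quotes each component of a qualified name separately. It is pure Haskell that only mimics the basic rule (wrap in double quotes, double any embedded double quote) and is not a substitute for PQescapeIdentifier, which also takes the connection's encoding into account. The names `quoteIdent` and `quoteQualified` are made up for this example.

```haskell
import Data.List (intercalate)

-- | Quote one SQL identifier: wrap it in double quotes and double any
-- embedded double quote. Returns Nothing for an empty name or one that
-- contains a NUL character, neither of which is a valid identifier.
quoteIdent :: String -> Maybe String
quoteIdent s
  | null s || '\0' `elem` s = Nothing
  | otherwise               = Just ('"' : concatMap esc s ++ "\"")
  where
    esc '"' = "\"\""
    esc c   = [c]

-- | Quote a qualified name given as separate components, for example
-- ["schema", "table"], and join the quoted parts with dots.
quoteQualified :: [String] -> Maybe String
quoteQualified []    = Nothing
quoteQualified parts = intercalate "." <$> traverse quoteIdent parts
```

Passing the string "public.users" to `quoteIdent` yields the single identifier `"public.users"`, which is the mangling described above; `quoteQualified ["public", "users"]` yields `"public"."users"` instead.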

ghost commented 11 years ago

Got it. With all of the modules stacked on top of one another in Yesod, it can be quite an adventure unraveling the source.

On Thu, 2013-08-22 at 20:09 -0700, Leon P Smith wrote:

    Well, postgresql-simple doesn't have any particular support for parameterizing table/schema names. That's something you'd have to do yourself as part of dynamic sql generation.

    This is something I'd like to fix, but doing it properly does require PQescapeIdentifier.
