jtv / libpqxx

The official C++ client API for PostgreSQL.
http://pqxx.org/libpqxx/
BSD 3-Clause "New" or "Revised" License
1.01k stars 239 forks source link

`query()` for prepared statements #644

Closed maresc-g closed 1 month ago

maresc-g commented 1 year ago

Hi,

We're looking to upgrade our version of pqxx in our application so I just came across the release note of the 7.7.4 in which you ask for feedback on if we still use pqxx::result.
I'd like to use the feature to automatically convert types like with query() but we only use prepared queries so we call exec_prepared() which still returns a pqxx::result. Is it planned to add this for prepared queries as well in future releases ?

Thanks

jtv commented 1 year ago

Thank you for letting me know!

Sounds like it would be helpful to have an equivalent to query but for prepared statements. And once we had that, the result class would become less relevant to you.

The result class is not going away, but I'd like to shift the focus of the API and documentation away from it, so that most users would never have to learn about it. I think it will simplify the most common path for learning and using libpqxx.

HackettJP commented 1 year ago

jtv,

Hi, i was so my yearly refresh and noticed the comment on the pqxx::result I use it on pqxx::work exec_prepared and pqxx::work exec statements , alot :) thks J.

jtv commented 1 year ago

jtv,

Hi, i was so my yearly refresh and noticed the comment on the pqxx::result I use it on pqxx::work exec_prepared and pqxx::work exec statements , alot :) thks J.

Thank you.

alexolog commented 1 year ago

Sounds like it would be helpful to have an equivalent to query but for prepared statements. And once we had that, the result class would become less relevant to you.

I'd like to add my vote to that. We are using prepared statements as well, and support for prepared/parametrized queries will be much appreciated. Until then, we are forced to use exec.

jtv commented 1 year ago

OK, I'll keep it in mind as a nice project to do next.

Let's rename this ticket from "pqxx::result feedback" to "query for prepared statements" so I can keep it open and be reminded.

alexolog commented 1 year ago

Non-prepared parameterized too please, for completeness :)

github-actions[bot] commented 1 year ago

There has been no activity on this ticket. Consider closing it.

alexolog commented 1 year ago

Bad bot!

jtv commented 1 year ago

Indeed. I have removed the label.

And who knows, I may find some time today to do actual work on this issue.

github-actions[bot] commented 1 year ago

There has been no activity on this ticket. Consider closing it.

github-actions[bot] commented 11 months ago

There has been no activity on this ticket. Consider closing it.

github-actions[bot] commented 9 months ago

There has been no activity on this ticket. Consider closing it.

alexolog commented 9 months ago

Don't know if the question is still relevant, but we're using pqxx::result quite a lot in our code.

jtv commented 9 months ago

Thank you @alexolog . Do you have any specific features that make pqxx::result useful to your use-cases than streaming queries? Is it mainly about knowing the result size, or perhaps about random access to the rows?

HackettJP commented 9 months ago

Hi, Mine is a loads stored procs of non streamed data, "prepared statements" , create GIS datasets
that return specfic data fields, also validate exectution of queries, multiple progs to build up GIS dataset maps alot of dynamic queries, return summary to capture in my Transaction tables of the execution of the quries.

I do also use exec as well, when i do not expect a result DDL commands like oracle/

J.

alexolog commented 9 months ago

Thank you @alexolog . Do you have any specific features that make pqxx::result useful to your use-cases than streaming queries? Is it mainly about knowing the result size, or perhaps about random access to the rows?

Knowing the result size, as well as the limitations that you documented:

They're less flexible than SQL queries, and there's the risk of losing your connection while you're in mid-stream

jtv commented 9 months ago

Thank you @alexolog . Do you have any specific features that make pqxx::result useful to your use-cases than streaming queries? Is it mainly about knowing the result size, or perhaps about random access to the rows?

Knowing the result size, as well as the limitations that you documented:

They're less flexible than SQL queries, and there's the risk of losing your connection while you're in mid-stream

Ahh right. Thanks. That says that there really is a place for both.

jtv commented 9 months ago

Mine is a loads stored procs of non streamed data, "prepared statements" , create GIS datasets
that return specfic data fields, also validate exectution of queries, multiple progs to build up GIS dataset maps alot of dynamic queries, return summary to capture in my Transaction tables of the execution of the quries.

Thank you. I really should support streaming of prepared statements as well... Though they aren't necessarily great for performance (because the database can't optimise for the parameter values).

I do also use exec as well, when i do not expect a result DDL commands like oracle/

That makes sense - in libpqxx there's even an exec0() function which reports an error if the query returns any data.

alexolog commented 9 months ago

It can be interesting to to look into the streaming. Can you document the advantages and limitations of that approach a bit better?

For example, does it work as well (or better) for queries that return a relatively small number or rows (single, tens, hundreds)? What kinds of queries it is not suitable for? (a phrase like "may not work form some queries, without elaborating, does not inspire confidence). Et cetera.

HackettJP commented 9 months ago

I pretty much use all your features :) streams as well for ingesting large data sets. into stageing tables

The DB's application design is very much a tradtional DWH setup, ELT or ETL etc with GIS + Graphs within the datamarts, years on OWB within oracle created a similair applicance using libpqxx as its glue :)

J.

jtv commented 9 months ago

@alexolog In #773 I extend the documentation for streaming. Hope that answers your questions.

jtv commented 9 months ago

@HackettJP that's cool to hear. I hope some day somebody will publish a library for handling PostGIS data types in libpqxx.

jtv commented 9 months ago

Mine is a loads stored procs of non streamed data, "prepared statements" , create GIS datasets that return specfic data fields, also validate exectution of queries, multiple progs to build up GIS dataset maps alot of dynamic queries, return summary to capture in my Transaction tables of the execution of the quries.

Thank you. I really should support streaming of prepared statements as well... Though they aren't necessarily great for performance (because the database can't optimise for the parameter values).

On second thoughts I'm not so sure that streaming prepared statements makes a lot of sense! The purpose of prepared statements is to reduce the overhead of parsing and planning a query. That overhead is usually tiny. A prepared query's actual run time is normally equal or worse to that of just executing the query ad hoc. So prepared queries make sense when the query planning cost is greater than the data handling cost: the query is complex but does not return a lot of data.

Streaming optimises in exactly the opposite direction. It increases the query's startup overhead, but speeds up data handling at scale. So it shines when the overwhelming cost is in data handling.

I'd wait for evidence that it's worthwhile before supporting a tradeoff where you save on planning overhead but add probably much greater streaming overhead, and then speed up large data transfers but risk slowing down execution.

github-actions[bot] commented 2 months ago

There has been no activity on this ticket. Consider closing it.

jtv commented 2 months ago

It's getting a little awkward making up all these different names for very similar functions. What if we had a marker type for prepared functions, and that would be the only difference between executing a prepared statement (where you pass the statement's name) and executing a regular statement (where you pass the SQL statement itself)?

Thinking of something like auto data = tx.exec(pqxx::prepared{"list_users"});

That would perhaps make it a bit easier to add prepared-statement versions of exec0(), exec1(), query(), stream(), and so on.

LucidApparition commented 2 months ago

Hello, I am new to libpqxx/pqxx and have hit a wall with executing a prepared query. This thread seemed to be the only relevant one I could find.

I am trying to check if a product ID exists in my table. When I run exec_prepared with the prepared query and parameter, it seems to crash. However, it runs fine if not given a placeholder.

cout << "Preparing statement." << endl;
C.prepare("check_product_id", "SELECT 1 FROM product_runs WHERE product_id = $1");
C.prepare("given_product_id", "SELECT 1 FROM product_runs WHERE product_id = 2024071101");

cout << "Running exec_prepared to check product_id: " << product_id << endl;

auto given_result = W.exec_prepared("given_product_id");
cout << "Given result: " << given_result.empty() << endl;

auto check_result = W.exec_prepared("check_product_id", product_id);
cout << "Check result: " << check_result.empty() << endl;

Terminal output:

Preparing statement.
Running exec_prepared to check product_id: 2024071101
Given result: 1

I'm not sure what is causing it to just stop. Any feedback is helpful. Thank you.

tt4g commented 2 months ago

@LucidApparition If the exception is throw inside pqxx, you can use try-catch to handle the exception, and you should be able to determine the cause from the exception message.

jtv commented 2 months ago

Strange... What's the type of the variable you're passing as the parameter?

HackettJP commented 2 months ago

something like this to think about , code_page or unicode stuff as well.

you may need to convert the variable or c.str() it. std::string product_id = pqxx::to_string(given_product_id_incoming);

a good approach to the query logs on postgresql backend logs check to see if it matchs up to what you sent, be mindfall and check if their any single quotes ' ' or double quotes

failing that is use the quote or escape depending etc..to see what you actualy sent in c++ land

std::cout <<)""" fooo barrr """+worker.quote(startDate)+""" foo barr ; """ << std::endl; or with table names """+worker.esc(targettempTable)+""" std::cout << """ UPDATE """+worker.esc(targetfinaltable)+""" fooor barr "") << std::endl;

pqxx::result response {};

try { .... C.prepare("check_product_id", "SELECT 1 FROM product_runs WHERE product_id = $1"); response= W.exec_prepared("given_product_id"); ..... ..... worker.commit(); .. catch (const pqxx::sql_error &e) { std::cerr << e.what() << std::endl; worker.abort(); exit(1);

    }
LucidApparition commented 2 months ago

@tt4g @jtv @HackettJP

Sorry about not providing that information. The code is in a try/catch and didn't catch anything. The product ID is an int And I ran it in debug mode and got this:

Exception thrown at 0x00007FFCC5BFFABC in AlliumCS.exe: Microsoft C++ exception: std::length_error at memory location 0x000000DB1378E8E0.
Exception thrown at 0x00007FFD54540003 in AlliumCS.exe: 0xC0000005: Access violation executing location 0x00007FFD5454003.

I believe this is just saying there is a pointer issue but unsure of how to fix it.

Here is the full code block of my function up to the issue:

// Function to check which faces have no value at product_id
vector<string> checkFaces(pqxx::connection &C, int product_id, int numSensors)
{
    vector<string> faces = {"top", "right", "bottom", "left"};
    vector<string> faces_with_no_value;

    try
    {
        // Begin a transaction
        pqxx::work W(C);

        cout << "Checking Faces..." << endl;
        lock_guard<mutex> guard(mtx); // Lock the mutex

        // Check if the product_id exists in the table using prepared statement
        C.prepare("check_productID", "SELECT 1 FROM product_runs WHERE product_id = $1");

        auto result_exists = W.exec_params("check_productID", product_id);
        int check_value = result_exists.empty() ? 0 : 1;
        cout << "Product ID exists: " << check_value << endl;
tt4g commented 2 months ago

I can't tell where the problem is from just the code you provided, but there are a few things I'm curious about.

  1. there is a breaking change in std::mutex and it crashes unless using Visual Studio Code runtime version 17.10.0 or later (microsoft/STL#4730), can you avoid using mutex?
  2. you are using using namespace std;, but this can break the behavior of the library, so please try not to use it.
LucidApparition commented 2 months ago

@tt4g I removed the mutex and namespace. And now only have this error regarding memory access.

Unhandled exception at 0x00007FFD01381043 in AlliumCS.exe: 0xC0000005: Access violation executing location 0x00007FFD01381043.

I also tested the data type to be sure and passed it as a string. Which allowed the program to continue through.

Starting program...
Starting measurement...
Device opened...
Measuring...
Reading Data...
Checking Faces...
Product ID: 2024071201
numSensors: 2
Checking if product ID exists..
Error checking faces with no value: bad array new length
Failed to insert/update attributes: bad array new length
Timestamp: Mon Jul 15 10:14:19 2024, Current Values: -1.745 -1.745 , Current Statuses: running running
Product runs updated successfully.

Yielding a different output but essentially the same problem.

This is my table for reference

CREATE TABLE product_runs(
    product_id integer NOT NULL,
    ...
    PRIMARY KEY(product_id)

The question I am having now, is, is it a problem with how I have my code structured or a problem with the libraries?

tt4g commented 2 months ago

The code provided does not appear to have any problems with the use of the library. Try commenting out the code a bit as you did with the two pieces of code and look for the cause of the crash.

LucidApparition commented 2 months ago

@tt4g As far as isolation goes, I have isolated it to being that one line of code not functioning with the parameter.

auto result = W.exec_params("SELECT 1 FROM product_runs WHERE product_id = $1", product_id);

I ran a test by bringing everything into my main script so that I wasn't passing any variables through functions or the connection string.

Entire main.cpp:

// PM-512 Main Function Entry Point
#include "measurementControl.h"
#include "dataProcessing.h"
#include <iostream>
#include <chrono>
#include <thread>
#include <pqxx/pqxx>
#include <sstream>

int main()
{
    std::cout << "Starting program..." << std::endl;

    // Set variables
    char *dbname, *dbuser, *dbpassword, *dbhostaddr, *dbport;

    // Check and retrieve environment variables
    if (_dupenv_s(&dbname, nullptr, "DB_NAME") != 0 ||
        _dupenv_s(&dbuser, nullptr, "DB_USER") != 0 ||
        _dupenv_s(&dbpassword, nullptr, "DB_PASSWORD") != 0 ||
        _dupenv_s(&dbhostaddr, nullptr, "DB_HOSTADDR") != 0 ||
        _dupenv_s(&dbport, nullptr, "DB_PORT") != 0)
    {
        std::cerr << "Error: One or more environment variables are not set." << std::endl;
        return 1;
    }

    // Create connection string using std::stringstream
    std::stringstream ss;
    ss << "dbname=" << dbname
       << " user=" << dbuser
       << " password=" << dbpassword
       << " hostaddr=" << dbhostaddr
       << " port=" << dbport;
    std::string conn_str = ss.str();

    // Set user inputed variables
    int product_id = 2024071601;
    double length = 152.4;
    std::string direction = "north";
    int numSensors = 2;

    try
    {
        pqxx::connection C(conn_str);
        if (!C.is_open())
        {
            std::cerr << "Failed to open database." << std::endl;
        }
        pqxx::work W(C);

        std::cout << "Testing query in Main..." << std::endl;

        std::cout << "Selecting from billet_attributes with hardcoded length..." << std::endl;
        auto attribute1 = W.exec_params("SELECT * FROM billet_attributes WHERE length = 152.4");
        std::cout << "Query passed with no error. Result size: " << attribute1.size() << std::endl;

        std::cout << "Selecting from product runs with hardcoded product id..." << std::endl;
        auto product1 = W.exec_params("SELECT 1 FROM product_runs WHERE product_id = 2024071601");
        std::cout << "Query passed with no error. Result size: " << product1.size() << std::endl;

        std::cout << "Selecting from billet_attributes with param length..." << std::endl;
        auto attribute2 = W.exec_params("SELECT * FROM billet_attributes WHERE length = $1", length);
        std::cout << "Query passed with no error. Result size: " << attribute2.size() << std::endl;

        std::cout << "Selecting from product runs with param product id..." << std::endl;
        auto product2 = W.exec_params("SELECT 1 FROM product_runs WHERE product_id = $1", product_id);
        std::cout << "Query passed with no error. Result size: " << product2.size() << std::endl;

        W.commit();
    }
    catch (const std::exception &e)
    {
        std::cerr << "Exception: " << e.what() << std::endl;
    }

    // Free allocated memory for environment variables
    free(dbname);
    free(dbuser);
    free(dbpassword);
    free(dbhostaddr);
    free(dbport);

    std::cout << "Program finished." << std::endl;
    return 0;
}

Terminal Output:

Starting program...                                                                                                 
Testing query in Main...
Selecting from billet_attributes with hardcoded length...
Query passed with no error. Result size: 11
Selecting from product runs with hardcoded product id...
Query passed with no error. Result size: 0
Selecting from billet_attributes with param length...

And it still crashes at the first line with a parameter.

public:
    _CONSTEXPR20 void reserve(_CRT_GUARDOVERFLOW size_type _Newcapacity) {
        // increase capacity to _Newcapacity (without geometric growth), provide strong guarantee
        if (_Newcapacity > capacity()) { // something to do (reserve() never shrinks)
            if (_Newcapacity > max_size()) {
                _Xlength();
            }

            _Reallocate<_Reallocation_policy::_At_least>(_Newcapacity);
        }
    }

Exception has occurred: W32/0xC0000005
Unhandled exception at 0x00007FF9FD508348 in AlliumCS.exe: 0xC0000005: Access violation executing location 0x00007FF9FD508348.
tt4g commented 2 months ago

That is very strange. What version of libpqxx are you using? Also, did you build it yourself?


By the way, did the following code show the crashed code? It does not appear to be libpqxx source code.

public:
    _CONSTEXPR20 void reserve(_CRT_GUARDOVERFLOW size_type _Newcapacity) {
        // increase capacity to _Newcapacity (without geometric growth), provide strong guarantee
        if (_Newcapacity > capacity()) { // something to do (reserve() never shrinks)
            if (_Newcapacity > max_size()) {
                _Xlength();
            }

            _Reallocate<_Reallocation_policy::_At_least>(_Newcapacity);
        }
    }
LucidApparition commented 2 months ago

@tt4g I built it using vcpkg and cmake. Version 7.9.0.

When I ran in debug mode it threw

Exception has occurred: W32/0xC0000005
Unhandled exception at 0x00007FF9FD508348 in AlliumCS.exe: 0xC0000005: Access violation executing location 0x00007FF9FD508348.

on

_Reallocate<_Reallocation_policy::_At_least>(_Newcapacity);

in that code block

public:
    _CONSTEXPR20 void reserve(_CRT_GUARDOVERFLOW size_type _Newcapacity) {
        // increase capacity to _Newcapacity (without geometric growth), provide strong guarantee
        if (_Newcapacity > capacity()) { // something to do (reserve() never shrinks)
            if (_Newcapacity > max_size()) {
                _Xlength();
            }

            _Reallocate<_Reallocation_policy::_At_least>(_Newcapacity);
        }
    }
tt4g commented 2 months ago

@LucidApparition This is the std::vector source code: https://github.com/microsoft/STL/blob/ecbc1efa09936f4ef5af529e770a95a29a4290d3/stl/inc/vector#L1680-L1690 If there is an access violation in that code, it is a bug in MSVC.

LucidApparition commented 2 months ago

Ok, thank you @tt4g I'll look into that.

tt4g commented 2 months ago

@LucidApparition I searched the MSVC STL repository, but could not find any defects that match this case. However, it is possible that it is a bug in the MSVC, so try updating the MSVC and see if the problem recurs.

LucidApparition commented 2 months ago

@tt4g Yes, I am currently updating my libraries and paths.

LucidApparition commented 2 months ago

@tt4g After updating everything, nothing has changed. I did tested a few different use cases for querying.

Using a query string, exec1, and exec0 seemed to work with passing the parameter. However, using a C.prepare with exec_prepared and exec_params both crashed.

std::cout << "Using raw SQL..." << std::endl;
std::string query = "SELECT 1 FROM product_runs WHERE product_id = " + W.quote(product_id);
auto rawSQL = W.exec(query);
std::cout << "Query passed with no error. Result size: " << rawSQL.size() << std::endl;       

Output:
Using raw SQL...
Query passed with no error. Result size: 0
std::cout << "Using exec1..." << std::endl;
W.exec1("SELECT 1 FROM product_runs WHERE product_id = " + W.quote(product_id));
std::cout << "Query passed with no error on exec1 " << std::endl;

Output:
Using exec1...
Exception: Expected 1 row(s) of data from query , got 0.
std::cout << "Using exec0..." << std::endl;
W.exec0("SELECT 1 FROM product_runs WHERE product_id = " + W.quote(product_id));
std::cout << "Query passed with no error on exec0 " << std::endl;

Output:
Using exec0...
Query passed with no error on exec0
C.prepare("check_product_id", "SELECT 1 FROM product_runs WHERE product_id = $1");
std::cout << "Using prepared..." << std::endl;
auto prepared = W.exec_prepared("check_product_id", product_id);
std::cout << "Query passed with no error. Result size: " << prepared.size() << std::endl;

Output:
Using prepared...
std::cout << "Using exec_params..." << std::endl;
auto execp = W.exec_params("SELECT 1 FROM product_runs WHERE product_id = $1", product_id);
std::cout << "Query passed with no error. Result size: " << execp.size() << std::endl;

Output:
Using exec_params...
tt4g commented 2 months ago

@LucidApparition Very strange, exec_params() has been confirmed to work in several tests.

https://github.com/jtv/libpqxx/blob/30d41bcdfa46aa278b034eb82aa078dcb671ce0e/test/unit/test_stream_from.cxx#L206-L237

https://github.com/jtv/libpqxx/blob/30d41bcdfa46aa278b034eb82aa078dcb671ce0e/test/unit/test_transaction_focus.cxx#L40-L49

However, I could not find any pattern in the test cases where double is passed as a parameter. Maybe there is a problem only when passing double. If you change the type of product_id to std::string instead of double, does the problem still occur? If not, you may want to open a new issue to manage this problem.

LucidApparition commented 2 months ago

@tt4g product_id is an int, my length variable is a double. I used that to see if it was an int issue or not. product_id is an int in the db but using the direction, which is a varchar(10), I get an exception.

int product_id = 2024071601;
double length = 152.4;
std::string direction = "North";
int numSensors = 2;
std::cout << "Using exec_params..." << std::endl;
auto execp = W.exec_params("SELECT 1 FROM billet_attributes WHERE top_direction = $1", direction);
std::cout << "Query passed with no error. Result size: " << execp.size() << std::endl;

Output:
Using exec_params...
Exception: bad array new length
Program finished.
tt4g commented 2 months ago

@LucidApparition This error looks like a std::bad_array_new_length message in MSVC STL.

https://github.com/microsoft/STL/blob/ecbc1efa09936f4ef5af529e770a95a29a4290d3/stl/inc/exception#L159-L162

It is strange that this error occurs. I am guessing that your application is doing some undefined behavior. Where did you get the libpqxx library you are using?

LucidApparition commented 2 months ago

@tt4g I installed it using vcpkg. The only thing that stands out with it is that my cmake finds the package but doesn't print out the directory it's in.

# Find libpqxx package
find_package(libpqxx CONFIG REQUIRED)
if(libpqxx_FOUND)
    message(STATUS "libpqxx include directory: ${libpqxx_INCLUDE_DIRS}")
else()
    message(FATAL_ERROR "libpqxx not found")
endif()

Output:
[cmake] -- libpqxx include directory: 
KayEss commented 2 months ago

Just a couple of notes on your design:

The bug you're trying to track down does look a lot like the sort of problem caused by mixing debug & release builds. Can you check that vcpkg does produce both a release and a debug version of the library, and you're linking against the correct one? In any case, can you try to see if the behaviour changes based on whether you're doing a debug or a release build of your code?

LucidApparition commented 2 months ago

@KayEss Thank you, very helpful. The product_id which is an int, atm, is the primary key not the double. To avoid any issues in the future, would you recommend changing it to a varchar? For the mixing builds, I was only creating one Debug build and not both a Debug and Release because of the issues.

For my commands I'm using

cmake .. -G "Visual Studio 17 2022"

msbuild /property:GenerateFullPaths=true /t:build /consoleloggerparameters:NoSummary /p:Configuration=Debug /p:Platform=x64 /p:VcpkgEnableManifest=true /p:PlatformToolset=ClangCL /p:ClCompile_AdditionalOptions="/fsanitize=address" "AlliumCS.vcxproj"
KayEss commented 2 months ago

I see you're using clang-cl. I don't know if that has full binary compatibility with MSVC. So you should check that the vcpkg build and your build:

KayEss commented 2 months ago

As you're using cmake, the simplest way to make sure everything is correct is to use add_subdirectory to include the libpqxx build. That way cmake sets all the build parameters and they're always correctly in sync

LucidApparition commented 2 months ago

@KayEss It looks like you were right on the issue being with the build. After deleting and rebuilding my build file, vcpkg, and cmaketext It looked like there was an issue with reading the libpqxx.lib file, finding the correct triplet, and possibly some other issues overall causing exec_params and other functions of pqxx to not work.

My issue has been solved in regards to using exec_params and exec_prepared.