powa-team / pg_qualstats

A PostgreSQL extension for collecting statistics about predicates, helping find what indices are missing
Other
272 stars 26 forks source link

End trimmed - pg_qualstats_example_query() #24

Closed K12Tech closed 4 years ago

K12Tech commented 4 years ago

Postgres version 11.0

I have a problem with pg_qualstats_example() function. For some queries the end is trimmed. Mostly for the longer queries.

I will try not to change formattation. This should be exactly the same as seen in pg_stat_statements. Prepared statement query -

SELECT 
            (SELECT(get_user_name(su.school_usr_id))) as full_name,
            cps.class_name,
            scs.school_classes_id,
            u.usr_id,
            cps.transfer_type,
            (
                SELECT count(scs1.school_classes_id) FROM school_classes_students scs1 WHERE
                scs.school_classes_id = scs1.school_classes_id
            ) as total_students
        FROM
            users_v u
            INNER JOIN school_users su ON (u.usr_id = su.usr_id)
            INNER JOIN school_classes_students_v scs ON (su.school_usr_id = scs.school_usr_id)
            INNER JOIN close_period_students cps ON (scs.school_classes_students_id = cps.school_classes_students_id)
            INNER JOIN close_period_classes cpc ON (cpc.name = cps.class_name)
        WHERE
            now() BETWEEN u.start_date AND COALESCE(u.end_date,now()) and
            now() BETWEEN su.start_date AND COALESCE(su.end_date,now()) and
            now() BETWEEN scs.start_date AND COALESCE(scs.end_date,now()) and
            su.school_id = $1 and
            cps.transfer_type != $2 and
            cps.transfer_type != $3
            and cps.class_name in ($4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22)

Also the formatation should be the same as function's result. The result I get from the function looks like this -


        SELECT 
            (SELECT(get_user_name(su.school_usr_id))) as full_name,
            cps.class_name,
            scs.school_classes_id,
            u.usr_id,
            cps.transfer_type,
            (
                SELECT count(scs1.school_classes_id) FROM school_classes_students scs1 WHERE
                scs.school_classes_id = scs1.school_classes_id
            ) as total_students
        FROM
            users_v u
            INNER JOIN school_users su ON (u.usr_id = su.usr_id)
            INNER JOIN school_classes_students_v scs ON (su.school_usr_id = scs.school_usr_id)
            INNER JOIN close_period_students cps ON (scs.school_classes_students_id = cps.school_classes_students_id)
            INNER JOIN close_period_classes cpc ON (cpc.name = cps.class_name)
        WHERE
            now() BETWEEN u.start_date AND COALESCE(u.end_date,now()) and
            now() BETWEEN su.start_date AND COALESCE(su.end_date,now()) and
            now() BETWEEN scs.start_date AND COALESCE(scs.end_date,now()) and
            su.school_id = '792' and
            cps.transfer_type != 'STUDENT_DROPPED' and
            cps.transfer_type != 'STUDENT_GRADUATED'
            and cps.class_name in ('10.a','10.b','10.c'

As we can see the end part is missing - where in () function is not complete.

rjuju commented 4 years ago

Hello,

Yes, this is because the sampled query text are for now kept in shared memory, and the amount of shared memory that can be used is fixed at start time. pg_stat_statements removed this limitation some time ago by storing the query text in an external files, which we didn't do.

We added the same limitation as what you see in pg_stat_activity, ie. limit on track_activity_query_size. The only way to get the full query would be to change this setting to something high enough for your queryes, and restart postgres. Keep in mind that pg_qualstats will require track_activity_query_size * pg_qualstats.max bytes of memory.

K12Tech commented 4 years ago

Thank You for the fast response.

Did what You adviced me to do. But now I have other problems.

pg_qualstats_example() function sometimes does not replace prepared statement variable places with examples.

Prepared statement query -

select * from "grade_details" as "gd" inner join "timetable" as "t" on "gd"."timetableid" = "t"."timetableid" where "t"."school_id" = $1 and "gd"."start_date" between $2 and $3 and $5 BETWEEN gd.start_date AND COALESCE (gd.end_date, $7) and "t"."deleted_yn" = $4

Result -

        select * from "grade_details" as "gd" inner join "timetable" as "t" on "gd"."timetableid" = "t"."timetableid" where "t"."school_id" = $1 and "gd"."start_date" between $2 and $3 and '2019-08-21 11:27:11' BETWEEN gd.start_date AND COALESCE (gd.end_date, '2019-08-21 11:27:11') and "t"."deleted_yn" = $4

Preffered/predicatble values -

$1 - int $2 - date $3 - date $4 - boolean

Where does the function take/how it decides what kind of variable to put inside prepared statement query?

rjuju commented 4 years ago

pg_qualstats_example is just a raw copy of the first encountered input string for each different queryid. In case of prepared statement it means that you won't get something usable as-is, as the parameters won't get replaced ad the datatypes are not kept.

Do you have a specific problem you're trying to solve? I'm not sure what exactly could be done by pg_qualstats here, but maybe we can make things better.

K12Tech commented 4 years ago

I am trying to implement this.

https://www.percona.com/blog/2019/07/22/automatic-index-recommendations-in-postgresql-using-pg_qualstats-and-hypopg/

Unde the "Automatic Index Recomendations" is a function that uses pg_qualstats_example_query() (line 18 -

pg_qualstats_example_query(t.queryid) as query

). It put line 50 -

sql execute 'explain (FORMAT JSON) '||l_queries.query INTO l_bef_exp;

in exception catch to catch broken examples.

rjuju commented 4 years ago

I see. Yes, what's described here needs more work for non trivial cases.

We implemented an automatic index suggestion tool that should handle more cases and that work at the database level to suggest better indexes with powa (that also use pg_qualstats). The only limitation for this tool is for query having const in the SELECT part of a query. If you want to give it a try there's a demo at https://dev-powa.anayrat.info/server/1/database/tpc/overview/ (just click login and then "Optimize this database")

K12Tech commented 4 years ago

I guess there is nothing I can do to fix pg_qualstats_example_query() or do atleast something to get the same results...?

rjuju commented 4 years ago

You could try to inject the parameters using pg_qualstats() values, but I now realize that it's actually not possible. It was previously raised (see https://github.com/powa-team/pg_qualstats/pull/12) that for prepared statements, the constant location is dropped by postgres, so pg_qualstats() can't tell you where to inject it back. I proposed a fixed in postgres a couple years ago to keep the constant location for prepared statements / extended protocol, but unfortunately no one got interested.

K12Tech commented 4 years ago

Thank You for the support.

I am quite sure those weren't constants, but still, that is a problem. I guess I will have to improvise somehow. :smile:

rjuju commented 4 years ago

For the record, I now added a global index advisor in pg_qualstats using the same approach as powa, but without the need to collect data and install the UI. It obviously means that the suggestion will be done using what's currently sampled in pg_qualstats() rather than a bigger dataset, but that can still be quite useful in many cases.

You can see a quick example in the README, or refer to https://rjuju.github.io/postgresql/2020/01/06/pg_qualstats-2-global-index-advisor.html for more details on how it's working.