Open xwkuang5 opened 1 year ago
It's probably best to lean on you or other contributors to provide this. I'm sure one of us could come up with a contrived example, but it's not clear to me it would be applicable to the real world.
Thanks for the prompt reply!
I can certainly help with that if you can provide some code pointers for adjacent area that I should look at. Right now I'm somewhat confused about what pgrx type the PG_FUNCTION_ARGS parameter, which the estimator function is supposed to take, should map to
I assume you're looking at a C example? What is that? I can probably illustrate how to port it to rust/pgrx.
yes, I'm looking at the postgres source code implementation for scalarltsel
/*
* scalarltsel - Selectivity of "<" for scalars.
*/
Datum
scalarltsel(PG_FUNCTION_ARGS)
{
return scalarineqsel_wrapper(fcinfo, false, false);
}
PG_FUNCTION_ARGS
seems to just expand to FunctionCallInfo fcinfo
#define PG_FUNCTION_ARGS FunctionCallInfo fcinfo
Link: https://doxygen.postgresql.org/selfuncs_8c_source.html#l01470
hmm. Yeah, okay, that one is a bit tricky. You'd want something like this:
unsafe fn extern "C" scalarltsel(fcinfo: pg_sys::FunctionCallInfo) -> Datum {
use pgrx::fcinfo::pg_getarg;
let root = pg_getarg::<pg_sys::Datum>(fcinfo, 0).cast_mut_ptr::<pg_sys::PlannerInfo>();
let operator = pg_getarg::<pg_sys::Oid>(fcinfo, 1);
let args = pg_getarg::<pg_sys::Datum>(fcinfo, 2).cast_mut_ptr::<pg_sys::List>();
let var_relid = pg_getarg::<i32>(fcinfo, 3);
...
}
This is untested, of course, but I think it starts to point you in the right direction?
I was taking that from src/backend/utils/adt/selfuncs.c
's scalarineqsel_wrapper()
function:
static Datum
scalarineqsel_wrapper(PG_FUNCTION_ARGS, bool isgt, bool iseq)
{
PlannerInfo *root = (PlannerInfo *) PG_GETARG_POINTER(0);
Oid operator = PG_GETARG_OID(1);
List *args = (List *) PG_GETARG_POINTER(2);
int varRelid = PG_GETARG_INT32(3);
...
Thanks for the pointer! I will try this out
I was able to influence the optimizer's estimate of the number of output tuples via the following stub:
#[pg_operator(immutable, parallel_safe)]
#[restrict(fs_scalarltsel)]
#[opname(#<)]
fn fs_lt(lhs: FsValue, rhs: FsValue) -> bool {
not_null_nor_nan(&rhs) && is_same_type(&lhs, &rhs) && lhs.lt(&rhs)
}
#[pg_extern(sql = r#"
CREATE OR REPLACE FUNCTION fs_scalarltsel(internal, oid, internal, integer) RETURNS float8
STRICT
LANGUAGE c /* Rust */
AS '@MODULE_PATHNAME@', '@FUNCTION_NAME@';
"#)]
unsafe fn fs_scalarltsel(fcinfo: pg_sys::FunctionCallInfo) -> f64 {
use pgrx::fcinfo::pg_getarg;
let root = pg_getarg::<pg_sys::Datum>(fcinfo, 0).unwrap().cast_mut_ptr::<pg_sys::PlannerInfo>();
let operator = pg_getarg::<pg_sys::Oid>(fcinfo, 1).unwrap();
let args = pg_getarg::<pg_sys::Datum>(fcinfo, 2).unwrap().cast_mut_ptr::<pg_sys::List>();
let var_relid = pg_getarg::<i32>(fcinfo, 3).unwrap();
::pgrx::info!("operator oid: {:?}", operator);
// Vary this from 0.0 to 1.0 to influence the estimates
0.0
}
The thing that was missing was a custom sql
script to inform postgres about such an estimator function. While the following examples show this working end-to-end, it's difficult at this point to work with the input parameters (PlannerInfo
, List
, etc) given a lack of an easier to use API around them. I don't have a good example in my mind to show how such a custom estimator implementation can be useful (postgres' areasel
and positionsel
for geo data type are also just stub with hardcoded estimate at this point).
Regarding this issue, perhaps it's sufficient to close this as is. At least other people who are interested in such a use case can find the example below from the isslues list? What do you think?
0.0
pgfirestore=# explain select * from fs_documents where properties #< fs_number_from_integer(100);
WARNING: operator oid: Oid(127902)
QUERY PLAN
--------------------------------------------------------------
Seq Scan on fs_documents (cost=0.00..23.20 rows=1 width=64)
Filter: (properties #< fs_number_from_integer(100))
(2 rows)
1.0
pgfirestore=# explain select * from fs_documents where properties #< fs_number_from_integer(100);
WARNING: operator oid: Oid(127972)
QUERY PLAN
----------------------------------------------------------------
Seq Scan on fs_documents (cost=0.00..23.20 rows=880 width=64)
Filter: (properties #< fs_number_from_integer(100))
(2 rows)
Custom operators can provide a restriction selectivity estimator function via the
::pgrx::pgrx_macros::restrict
macro (postgres doc).After browsing the
pgrx
code base, I only found example use of::pgrx::pgrx_macros::restrict
with system-provided selectivity estimate function likeeqsel
,scalarltsel
. It would be great if we can provide an example of custom implementation of this function under thepgrx-examples
directory.For context, I am implementing a custom value type (called
fsvalue
) to model Cloud Firestore's data type. Repo: https://github.com/xwkuang5/pgfirestore. I'd like to experiment with adding selectivity estimator for the different operators (<
,>
, etc).Thanks!