apache / datasketches-postgresql

PostgreSQL extension providing approximate algorithms based on apache/datasketches-cpp
https://datasketches.apache.org
Apache License 2.0
85 stars 11 forks source link

could not find function "pg_theta_intersection_get_result" in file #66

Closed shubhamsachdeva89 closed 1 year ago

shubhamsachdeva89 commented 1 year ago

Hi Alexander & Jon,

After the release of dataskecthes 1.6.0, we are trying to update our existing datasketches from 1.3 to 1.6

We followed the same steps in the documentation https://github.com/apache/datasketches-postgresql. We downloaded apache-datasketches-cpp-4.1.0-src.zip, apache-datasketches-postgresql-1.6.0-src.zip from apache datasketches website and boost 1.75.0 from boost.org.

Once we execute make command, it creates datasketches.so in apache-datasketches-postgresql-1.6.0-src folder. We copy this so file in our folder /pgbin/mbi1d/14.x/lib/postgresql where alter command picks up the so file.

Then, we execute the make install and it installs the dataksketches.so

make install

/bin/mkdir -p '/pgbin/mbi1d/14.x/lib/postgresql' /bin/mkdir -p '/pgbin/mbi1d/14.x/share/postgresql/extension' /bin/mkdir -p '/pgbin/mbi1d/14.x/share/postgresql/extension' /bin/install -c -m 755 datasketches.so '/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so' cat sql/datasketches_cpc_sketch.sql sql/datasketches_kll_float_sketch.sql sql/datasketches_kll_double_sketch.sql sql/datasketches_theta_sketch.sql sql/datasketches_frequent_strings_sketch.sql sql/datasketches_hll_sketch.sql sql/datasketches_aod_sketch.sql sql/datasketches_req_float_sketch.sql sql/datasketches_quantiles_double_sketch.sql > sql/datasketches--1.6.0.sql /bin/install -c -m 644 .//datasketches.control '/pgbin/mbi1d/14.x/share/postgresql/extension/' /bin/install -c -m 644 .//sql/datasketches--1.6.0.sql .//sql/datasketches--1.3.0--1.4.0.sql .//sql/datasketches--1.4.0--1.5.0.sql .//sql/datasketches--1.5.0--1.6.0.sql '/pgbin/mbi1d/14.x/share/postgresql/extension/'

Once it completes, we login to our database using psql and execute alter command as you suggested

mbi1d01=# alter extension datasketches update to '1.6.0'; ERROR: could not find function "pg_theta_intersection_get_result" in file "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"

Please find the logs of make command attached. Could you please let us know if we are doing anything wrong? make_log.txt

AlexanderSaydakov commented 1 year ago

It seems to me that we need a special upgrade path from 1.3.0 to 1.6.0, since the default path 1.3.0 to 1.4.0 to 1.5.0 to 1.6.0 is not working. Perhaps, it might have worked if we versioned .so files, but we replace .so with the latest version, and some intermediate step refers to a function that does not exist in the latest version. I think I know how to fix this, but it will take me a while to set up the old version to test this.

AlexanderSaydakov commented 1 year ago

I managed to reproduce the problem. I will prepare a fix shortly.

shubhamsachdeva89 commented 1 year ago

Hi Alexander,

Thank you for the quick. I was able to upgrade the datasketches from 1.3 to 1.6.

mbi1d01=# SELECT * FROM pg_extension WHERE extname = 'datasketches'; oid | extname | extowner | extnamespace | extrelocatable | extversion | extconfig | extcondition -------+--------------+----------+--------------+----------------+------------+-----------+-------------- 17085 | datasketches | 10 | 17081 | t | 1.6.0 | | (1 row)

I have one doubt. I tried to run below command as suggested in documentation to test the installation, but it failed with error

mbi1d01=# select cpc_sketch_to_string(cpc_sketch_build(1)); ERROR: function cpc_sketch_build(integer) does not exist LINE 1: select cpc_sketch_to_string(cpc_sketch_build(1)); ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts.

cpc_sketch_build is present in /datasketches_cpc_sketch.sql but somehow it is still missing

While installing the so file, datasketches_cpc_sketch.sql was copied to datasketches--1.6.0.sql

make install /bin/mkdir -p '/pgbin/mbi1d/14.x/lib/postgresql' /bin/mkdir -p '/pgbin/mbi1d/14.x/share/postgresql/extension' /bin/mkdir -p '/pgbin/mbi1d/14.x/share/postgresql/extension' /bin/install -c -m 755 datasketches.so '/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so' cat sql/datasketches_cpc_sketch.sql sql/datasketches_kll_float_sketch.sql sql/datasketches_kll_double_sketch.sql sql/datasketches_theta_sketch.sql sql/datasketches_frequent_strings_sketch.sql sql/datasketches_hll_sketch.sql sql/datasketches_aod_sketch.sql sql/datasketches_req_float_sketch.sql sql/datasketches_quantiles_double_sketch.sql > sql/datasketches--1.6.0.sql /bin/install -c -m 644 .//datasketches.control '/pgbin/mbi1d/14.x/share/postgresql/extension/' /bin/install -c -m 644 .//sql/datasketches--1.6.0.sql .//sql/datasketches--1.3.0--1.4.0.sql .//sql/datasketches--1.4.0--1.5.0.sql .//sql/datasketches--1.5.0--1.6.0.sql .//sql/datasketches--1.3.0--1.6.0.sql '/pgbin/mbi1d/14.x/share/postgresql/extension/'

AlexanderSaydakov commented 1 year ago

I don't see how this is possible. This function existed in 1.3.0, and the upgrade script replaces it with a new version. You may want to look at the list of functions like this:

SELECT routine_name FROM information_schema.routines WHERE routine_name LIKE 'cpc_%';
shubhamsachdeva89 commented 1 year ago

I'm sorry I didn't specify schema. Its working.

mbi1d01=# \df+ mbi.cpc_sketch_build List of functions Schema | Name | Result data type | Argument data types | Type | Volatility | Parallel | Owner | Security | Access privileges | Language | Source code | Description --------+------------------+------------------+---------------------+------+------------+----------+-------------+----------+-------------------+----------+-----------------+------------- mbi | cpc_sketch_build | mbi.cpc_sketch | anyelement | agg | immutable | safe | svcdmbi_adm | invoker | | internal | aggregate_dummy | mbi | cpc_sketch_build | mbi.cpc_sketch | anyelement, integer | agg | immutable | safe | svcdmbi_adm | invoker | | internal | aggregate_dummy | (2 rows)

Is there any change in algorithm of HLL from 1.3 to 1.6 ? Is there an improvement in the speed of calculating hll_sketch_union and hll_sketch_get_estimate?

AlexanderSaydakov commented 1 year ago

Yes, there is a substantial improvement to merging HLL sketches in the core library and also parallel query support in the PostgreSQL extension.