apache / datasketches-postgresql

PostgreSQL extension providing approximate algorithms based on apache/datasketches-cpp
https://datasketches.apache.org
Apache License 2.0

About Aod_sketch performance #46

Closed developerwxl closed 1 year ago

developerwxl commented 2 years ago

Hi, I want to use aod_sketch in our company project, but I have run into a performance issue with it. I ran the query below on PostgreSQL 10.9 (package postgresql10-10.9-1PGDG.rhel7.x86_64.rpm). The machine is an AWS EC2 c5.9xlarge running Red Hat (36 vCPUs, 72 GB memory). The data size is very small: just one 4K sketch intersected with one other 4K sketch. Even so, the SQL runs for almost 10 seconds. Are there any known performance issues, or any parameters that can improve performance? Have you run into the same problem? The SQL looks like this:

with
a as (select sk from test_ds_sketch where id = '3113'),  -- just one record
b as (select sk from test_ds_sketch where id = '2662'),  -- just one record
c as (
  select 171213 as network_id,
         aod_sketch_get_estimate(aod_sketch_intersection(a.sk, b.sk)) / aod_sketch_get_estimate(a.sk) as ratio
  from a, b
)
select * from c

AlexanderSaydakov commented 2 years ago

I don't have an answer, but a few suggestions:

developerwxl commented 2 years ago

Thank you for your response. I'm going to try it and will then get back to you.

AlexanderSaydakov commented 1 year ago

@developerwxl did you have a chance to gather some more information about this?