Open PBAtTheRingIo opened 11 months ago
Can you explain your scenario a bit more? What is the data model and the commands that I can use to reproduce what you are observing?
I created a simple table with 1 integer and 1 hll like for instance:
CREATE TABLE views_hll (
item_id integer NOT NULL,
hll hll(14,5,8,1) DEFAULT hll_empty(14, 5, (8)::bigint, 1) NOT NULL
);
then I use a loop to insert a bunch of UUIDs one by one in a loop (all uuids are different) thanks to a SQl request like this one:
UPDATE "views_hll" AS v0 SET "hll" = hll_add(hll, hll_hash_bytea($1)) WHERE (v0."item_id" = $2);
where $1 is a binary UUID and $2 the id (1 for instance and always 1 for each update).
I check the size after 1, 10, 100, 1_000_1000 update in doing for instance:
SELECT pg_total_relation_size('views_hll');
or
SELECT pg_size_pretty(pg_total_relation_size('views_hll'));
And I check also the value directly on my disk, and I see on the disk and thanks to the previous commands the value growing.
Hello, I do not know if it's the way to ask a question on this project, if not sorry about that. I'm using Debian GNU/Linux 12 (bookworm), Kernel Linux debian 6.1.0-9-amd64, PostgreSQL 15 from package 15.3-0+deb12u1, and the package postgresql-15-hll extension from package 2.17-1+b1. We would like to see the result of the cardinality after adding a lot of UUID in using the function "hll_add(hll, hll_hash_bytea(?)) with a binary uuid", and I see that my disk space is consumed by the table. Do you have any idea why, even if the HLL record is small (few kb), the result of "\d+" command and the size on disk can reach several Mb ? Do I have to configure PostgreSQL with dedicated parameters or do something else to not have this issue ? Thanks a lot in advance