Missing possibility to drop timeseries in shared memory

simovesterinen commented 5 years ago

It seems that we are not able to drop the unused timeseries from memory. We would need a new function that would drop the timeseries also from memory side not just in postgresql functions.

Our use case example is shown here:

Starting point (works fine):

create new table called "A1" in postgresql side
Insert data into Table "A1"
Create Imcs SELECT cs_create('A1',timestampid);
load data to imcs SELECT A1_load(FALSE);
Then users are querying from imcs A1

New column to table A1 (works fine):

Create new column to table A1 in postgresql side
Update data to column;
DROP imcs (not dropping the timeseries from memory side) SELECT A1_drop();
create imcs again to create the new column. SELECT cs_create('A1',timestampid);
load data to imcs new column. SELECT a1_load_column('new column here);

Data change of column (NOT WORKING):

Truncate data from memory: SELECT A1_truncate();
DROP imcs (not dropping the timeseries from memory side) SELECT A1_drop();
Change table column data type
create imcs again to create the new column. SELECT cs_create('A1',timestampid);
load data to imcs SELECT A1_load(FALSE); --> ERROR data type has changed --> we would also need a new function forexample: SELECT a1_drop_all() which would also drop the timeseries from memory side. If we restart the server after step 3 then it works.

So we would need a new function which would also drop the timeseries from memory, but we would need to keep the old drop function as it is. This would also help us in some other cases a lot.

We are using the table level locking branch

simovesterinen commented 5 years ago

We have also learned that if we drop many imcs tables using the current drop function, the loading data to new tables is getting really slow eventually, because the unused timeseries are not dropped in memory side. So cleaning would be needed.

knizhnik commented 5 years ago

Sorry, XXX_drop() function only deletes created wrapper functions, it doesn't affect data. To free memory you should use cs_delete_all() to remove all tables or XXX_truncate() function to delete particular table. Please notice that Postgres is not able to actually release hared memory. Once allocated it is been used by application. When you perform truncate of table, memory can be reused for new data.

I have committed patch to able-level-locking which makes XXX_truncate() also to delete entries from hash table, so there should not be a "data format was changed" error any more. I do not remember why I didn't remove entries before, just truncate correspondent B-Tree.

simovesterinen commented 5 years ago

It seems that there is some problem with the changes. We tried to run the install command which has previously worked, but now we got following errors.

sudo ./install.sh gcc -O3 -Wall -Wno-format-security -fPIC -I. -I./ -I/usr/include/postgresql/9.6/server -I/usr/include/postgresql/internal -Wdate-time -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2 -I/usr/include/mit-krb5 -c -o imcs.o imcs.c imcs.c: In function ‘cs_delete_all’: imcs.c:5238:59: error: ‘HASH_DELETE’ undeclared (first use in this function) hash_search(imcs_hash, &entry->key, HASH_DELETE, NULL); ^ imcs.c:5238:59: note: each undeclared identifier is reported only once for each function it appears in imcs.c: In function ‘columnar_store_truncate’: imcs.c:5270:67: error: ‘HASH_DELETE’ undeclared (first use in this function) hash_search(imcs_hash, &entry->key, HASH_DELETE, NULL); ^ imcs.c: In function ‘columnar_store_truncate_column’: imcs.c:5306:67: error: ‘HASH_DELETE’ undeclared (first use in this function) hash_search(imcs_hash, &entry->key, HASH_DELETE, NULL); ^

: recipe for target 'imcs.o' failed make: *** [imcs.o] Error 1

knizhnik commented 5 years ago

Sorry, looks like I haven't committed the latest version. Please try now.

simovesterinen commented 5 years ago

Now after months of experience with the new functionality with the new patch in branch (table_level_locking) I would like to point out and ask one thing.

After hash deletion was added it seems that everything works correctly, but it takes a lot more time to truncate tables/columns. Also cs_delete_all() takes very long time to execute.

Instead of using ”hash_search(imcs_hash, &entry->key, HASH_REMOVE, NULL);" could we use "hash_destroy(imcs_hash);" to gain better performance?

knizhnik commented 5 years ago

hash_destroy function can be applied only fir hash created by hash_create in private process memory. By imcs_hash is created in shared memory using ShmemInitHash I wonder why removing from hash takes so long time in your case. Do you have so larger number of sequences? Can you prepare profile of tables/columns truncation in your case?

simovesterinen commented 5 years ago

I noticed for example that in one of the servers cs_truncate_all() is fast enought on sundays after we have executed maintenance vacuum full on the the database. It is taking more and more time always until vacuum full has been executed.

Just for your information. We use a lot cs_hash_sum function in our queries and i have been wondering that are hash tables cleaned only with vacuum full or because there is not that much queries during saturdays then sunday truncation is faster... just thinking.

So could there be some kind of leaking during the week that could be avoided?

knizhnik commented 5 years ago

Hmmm... It is very strange. Removing items from in-memory hash table has no relation to vacuum activity. Most likely slowdown is caused by some reason, for example lack of memory.

simovesterinen commented 5 years ago

One solution could be that we have a parameter (remove_hash boolean default true) for cs_delete_all(), tablename_truncate() and truncate_column() functions.

When we know that table structure has not changed and we are going to reload the data into table we truncate the table fast using false on the parameter.

What do you think?

simovesterinen commented 5 years ago

Just an though. In our application there are thousands of imcs queries using imcs hash aggregate functions (example: cs_hash_sum(input timeseries, group_by timeseries, out avg timeseries, out groups timeseries)).

When these queries are done will the builded hash table for the query get cleaned? This theory would suit well becouse our sunday morning truncation is fast because little use during saturdays... maybe... So if there are thousands of "orphan" hash tables in the shared memory that could slow down the removing (truncation) also?

knizhnik commented 5 years ago

One solution could be that we have a parameter (remove_hash boolean default true) for cs_delete_all(), tablename_truncate() and truncate_column() functions.

When we know that table structure has not changed and we are going to reload the data into table we truncate the table fast using false on the parameter.

What do you think?

Removing entry from hash can not be a bottleneck, so I do not understand how introducing remove_hash parameter may help to increase performance. I think that we should first monitor/profile the system in order to understand where most of the time is spent.

knizhnik commented 5 years ago

Just an though. In our application there are thousands of imcs queries using imcs hash aggregate functions (example: cs_hash_sum(input timeseries, group_by timeseries, out avg timeseries, out groups timeseries)).

When these queries are done will the builded hash table for the query get cleaned? This theory would suit well becouse our sunday morning truncation is fast because little use during saturdays... maybe... So if there are thousands of "orphan" hash tables in the shared memory that could slow down the removing (truncation) also? Yes, aggregate function using hash are temporary constructing this hash table during query execution and dropping it at end of the query.

simovesterinen commented 5 years ago

I seems you were right. During the night time used imcs memory (shared memory) gets swapped (17 GB of 22 GB gets swapped) during normal maintenance rutine database VACUUM. When cs_delete_all() is called later during the night then the SWAP starts to get smaller. When there is no swap then the cs_delete_all() is fast.

We have set maintenance_work_mem = 128MB, but still its using A LOT of memory during normal VACUUM. It seems that the system buffer area gets really big during the VACUUM and this pushes the imcs to SWAP. Any ideas? Would there be a smart way to warm the imcs memory after vacuum? I tried querying all the columns, but that took very long time.

knizhnik commented 5 years ago

IMCS has no relation with vacuum! IMCS maintains its own data structures in shared memory which are not affected by Postgres code and vacuum command. Most likely the reason of the problem is that shared buffers at your system are configured to the value which is much larger than size of available physical memory. This shared memory is used both by IMCS and Postgres buffers. Vacuum is intensively reading data so it is filling all available shared buffers. As a result all configured shared memory is used. If it is larger than available RAM size, than it will cause terrible swapping and dramatic slow down of IMCS work (just because instead of "in-memory" storage it will have to read data from swap).

simovesterinen commented 5 years ago

You were again totally right! We investigated heavilly how memory is used during VAUUM ANALYZE and noticed that when very large tables are vacuumed it uses a lot of memory so --> imcs gets swapped. When we fixed our maintenance rutine (imcs stays in actual memory) the cs_delete all is very fast :)

Case closed! :) Thanks!

knizhnik / imcs

Missing possibility to drop timeseries in shared memory #52