nordic-institute / X-Road-Metrics

X-Road Metrics is a tool for collecting, storing and analysing reporting data and metrics from an X-Road® ecosystem.
MIT License
6 stars 4 forks source link

Opendata memory usage optimization #82

Closed VitaliStupin closed 1 year ago

VitaliStupin commented 1 year ago

Currently opendata is fetching daily logs from DB without using server side cursors. That means that python application may need to load gigabytes of data info memory before it starts streaming the result to the browser.

The proposed solution is to use server side cursors: https://www.psycopg.org/docs/usage.html#server-side-cursors

After the change application would fetch data from DB in small batches, and would be able to run on a server with a low amount of RAM.

VitaliStupin commented 1 year ago

https://www.postgresql.org/docs/current/sql-declare.html "WITHOUT HOLD specifies that the cursor cannot be used outside of the transaction that created it" It is expected behavior that cursor without HOLD gets closed after COMMIT, because that ends current transaction. If we want to get rid of HOLD we would probably need some major rewrite of functions so that we would not use cursor that was defined in other function. Or maybe there are some other tricks to avoid that COMMIT that I did not find. As we open connection just for one query I see no problems with using HOLD, as cursor gets removed anyway when connection terminates at the end of the query.

wisecrow commented 1 year ago

@VitaliStupin Is it good to merge this PR or we need additional commits?

VitaliStupin commented 1 year ago

It should be good, we already have that change in production and we did not notice any problems.