Open dzhang-fec opened 4 years ago
Thanks @dzhang-fec to create this following ticket based on our 05/21 postmortem: https://docs.google.com/document/d/19vIM1JpwbevQarI5acUlH9O9kFeBZ-it7Rk26Pe-6JM/edit#
about DB long term strategy
update on this: we are fine on this so far on the election. We will keep to observe it when it is necessary (not chnage all or change according to the needs).
Background: There was a discussion on the balance of creating DB index and others (ETL slower on insert/update/delete at the 5/21 outage investigation. Per Jun, I am creating this ticket for discussion and review.
creating DB index advantage: is better for SELECT query to improve the GUI performance index is necessary for our kind of data warehouse application. usually index storage is about 4+ times more than the data itself.
creating DB index disadvantage: ETL slower on insert/update/delete get slower since the index need to be updated Database size increased (index storage is almost same with Db size now). cost$ Index needs to be maintenance and build index needs time (for aurora, large index needs 12mins but it was 40mins+ before. Jean had good way to do it: divide and concur.)
We think we need to discuss and test on index creations. e.g., create index if there is a DB crash risk or > 5mins+ execution time, and work together with sql code in python application, replica, resource sizing, etc. Also, we are checking index type : there are many types besides the tree index and DESC order related. Furthermore. all of the index creation are based production real index usage. We may change some index case by case.
case by case?
New index recently: pg_date index. We need it in case of table scan on x,xxx,xxx,xxx rows tables... couple of composite indexes helped the performance., etc.
Others: