The production server, with "only" 108,728 indexed datasets (many more still haven't been migrated from the passthrough server), currently claims 84.1 GB of PostgreSQL storage just for the IndexMap table. Most of that is a list of every OpenSearch document ID, kept so that we can manage the index with bulk update and delete operations. This is straining the capacity of our RDU2 PostgreSQL server.
As an alternative, this PR removes the document list. Instead of the bulk update and delete operations, it uses _delete_by_query and _update_by_query, searching the appropriate indices (which we still store in the IndexMap) for documents by parent dataset resource ID.
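As a rough illustration of the approach (not the code in this PR), the request for either operation could be assembled as below. The helper name by_query_request, the "run.id" field used to match the parent dataset resource ID, the index names, and the "authorization.access" field in the script are all assumptions made for the sketch; only the _delete_by_query and _update_by_query endpoints and the term query and script body shapes come from the search API itself.

```python
import json


def by_query_request(operation, indices, dataset_id, id_field="run.id", script=None):
    """Build the URL path and JSON body for a by-query request.

    operation:  "_delete_by_query" or "_update_by_query"
    indices:    index names recorded for the dataset (e.g. from the IndexMap)
    dataset_id: the parent dataset resource ID to match
    id_field:   HYPOTHETICAL document field holding the parent dataset ID
    script:     optional script dict, used only with _update_by_query
    """
    if operation not in ("_delete_by_query", "_update_by_query"):
        raise ValueError(f"unsupported operation: {operation}")
    if not indices:
        raise ValueError("at least one index name is required")

    # A term query selects every document belonging to the dataset, so no
    # per-document ID list has to be stored anywhere.
    body = {"query": {"term": {id_field: dataset_id}}}
    if operation == "_update_by_query" and script is not None:
        body["script"] = script

    # Several indices can be targeted at once as a comma-separated list.
    path = "/" + ",".join(indices) + "/" + operation
    return path, body


# Example: delete all documents for one dataset across two (made-up) indices.
delete_path, delete_body = by_query_request(
    "_delete_by_query", ["example-index-a", "example-index-b"], "abc123"
)

# Example: update a (hypothetical) access field on the same documents.
update_path, update_body = by_query_request(
    "_update_by_query",
    ["example-index-a"],
    "abc123",
    script={
        "lang": "painless",
        "source": "ctx._source.authorization.access = params.access",
        "params": {"access": "public"},
    },
)

print(delete_path)
print(json.dumps(update_body, indent=2))
```

The returned path and body would then be sent as a POST to the search server with whatever HTTP or client library the server already uses. The trade-off is that the server has to search for the documents on every update or delete, rather than being handed their IDs directly.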
Along the way, I noticed that (oops) we were missing the "authorization" subdocument in some of our Elasticsearch documents, which would affect the behavior of the authenticated search API. I also acted on a deprecation warning for a camelCase template keyword by replacing it with its snake_case alternative.
NOTE: In the interest of expediently deploying a fix for our SQL bloat in RDU2, this PR lacks unit tests for update and delete. Both are covered (in indexed and non-indexed cases) by functional tests.
PBENCH-1315