loopbackio / loopback-connector-postgresql

PostgreSQL connector for LoopBack.
Other
117 stars 180 forks source link

Multi-Column Database Indexes are Always Dropped on Autoupdate #464

Closed ong-james closed 3 years ago

ong-james commented 3 years ago

Steps to reproduce

  1. Create a multi-column index on a database model's schema. E.g. for a model, person, in model's definition JSON file:
    ...
    "indexes" {
    "unique_name_index": {
    "keys": { "firstName": 1, "lastName": 1 },
    "options": { "unique": true }
    }
    },
    ...
  2. Execute ds.autoupdate() where ds is the database data source
  3. Verify unique_name_index exists on the person table
  4. Execute ds.autoupdate() a second time

    Current Behavior

    Despite the fact that the index definition hasn't changed, loopback-connector-postgresql will still DROP the index for unique_name_index and recreate it. Obviously this is not great for performance as the table's data needs to be re-indexed even though the index hasn't changed.

Expected Behavior

loopback-connector-postgresql detects that unique_name_index exists and it's definition (i.e. keys, type, and unique-ness properties) have not changed, so the connector doesn't DROP the index.

Given that the index is recreated during the autoupdate() call, it's not easy to detect that the index is first dropped and then re-created. We discovered the issue when we externally added a partial index (since loopback-connector-postgresql doesn't support partial indexes) and attempted to have Loopback "skip" dropping and re-creating the index by defining an index on the model's schema in Loopback with the same name, keys, type, and uniqueness. We found that even though our index matched all the properties loopback-connector-postgresql uses to determine if an existing index in the database matches the model schema's index definition, the index was still dropped and then recreated using the model schema's definition (i.e. it dropped our partial index and then created a new index based-on the model schema's definition.) We recognise this is not a supported Loopback method of adding indexes; however, the underlying issue appears to affect all multi-column indexes.

See below for more details for debugging the issue.

Additional information

linux x64 12.17.0

+─ loopback@3.25.1
+─ loopback-boot@2.28.0
+─ loopback-component-explorer@2.7.0
+─ loopback-component-migrate@0.0.0-semantically-released.0 (github:luminlife/loopback-component-migrate#e96bc2ea144b2d6221d82303c0edda96584a9469) +─ loopback-connector-postgresql@5.2.0
+─ loopback-ds-readonly-mixin@2.0.4
+─ loopback-ds-timestamp-mixin@3.4.1
+─ loopback-jwt@1.0.2 (github:luminlife/loopback-jwt#5724ef89725aeb1a9b9704ce7d4ba023bc575913)

The code causing the issue appears to be here: https://github.com/strongloop/loopback-connector-postgresql/blob/ffb08f4ca60a711d544ce13996afebcf09b86712/lib/migration.js#L826

si.keys is an array of arrays built in the normalizeIndexDefintion() method of migration.js (e.g. [["firstName", "ASC"],["lastName", "ASC"]]), so, given how Object.keys operates on arrays, Object.keys(si.keys) will be an array of the position indexes of each of the array property pairs (e.g. siKeys = [0,1]); however, when the comparison for matching existing database indexes is performed (on line 612 of migration.js):

siKeys.forEach(function(propName, iter) {                                
  identical = identical && self.column(model, propName) === i.keys[iter];
});                                                                      

you'll notice the elements of siKeys are expected to be the property names (propName) of the keys (e.g. instead of the array of position index values [0,1], siKeys is expected to be ["firstName", "lastName"]), so the database index being evaluated will never match to any definition on the Loopback model and, therefore, the index will be dropped and recreated.

It looks like the correct code for defining siKeys should be something like:

const siKeys = si.keys.map((key) => {
   return key[0]; // get the property name described by "key"
});

this will create the correct array of property names and allow them to be matched to the column names on the model.

bajtos commented 3 years ago

@ong-james thank you for a detailed bug report ❤️

I am not sure if anybody remembers why we have the current behavior of autoupdate you are observing, you are probably the most knowledgeable person now. Your description and proposal looks reasonable, would you like to submit a pull request to fix the problem?

See https://loopback.io/doc/en/contrib/code-contrib.html to get started.

Please include some tests to verify your fix and prevent regressions in the future, you can find existing tests here: https://github.com/strongloop/loopback-connector-postgresql/blob/master/test/postgresql.autoupdate.test.js This part of our codebase is not as clean as we would like it to be, so feel free to write your new tests in a cleaner way, no need to follow all (anti)patterns in the existing code. For starters, there is no need to use callbacks anymore, you can write your tests as async function and use await inside, e.g. ds.autoupdate(['ATable']).

ong-james commented 3 years ago

@bajtos Absolutely :+1: I'll update the code and submit a PR.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] commented 3 years ago

This issue has been closed due to continued inactivity. Thank you for your understanding. If you believe this to be in error, please contact one of the code owners, listed in the CODEOWNERS file at the top-level of this repository.