Aiven-Open / pghoard

PostgreSQL® backup and restore service
http://aiven-open.github.io/pghoard/
Apache License 2.0
1.32k stars 97 forks source link

Ensure receivexlog respects new configuration values after SIGHUP #530

Closed nicois closed 1 year ago

nicois commented 2 years ago

This modifies the receivexlog to use the updated configuration settings following a SIGHUP

Why this way

When SIGHUP is issued, self.config is refreshed in all threads, but without this change, the new values are disregarded. This change ensures the updated config object is always used.

nicois commented 2 years ago

@alanfranz-at-work the tests are failing even when run against a "null" change: that is, my original intended commit, along with a revert commit. It appears there is a problem with version pinning in the tests, leading to these failures. I'll see if there is a simple fix to resolve this problem.

Eyeballing the code, the use of these value in this python script is thread-safe: these values are each used for stateless operations, primarily determining whether to stop or continue operation. I can add a couple of simple unit tests, one the existing tests are green again.

ghost commented 2 years ago

@alanfranz-at-work the tests are failing even when run against a "null" change: that is, my original intended commit, along with a revert commit. It appears there is a problem with version pinning in the tests, leading to these failures. I'll see if there is a simple fix to resolve this problem.

Checking this.

Eyeballing the code, the use of these value in this python script is thread-safe: these values are each used for stateless operations, primarily determining whether to stop or continue operation. I can add a couple of simple unit tests, one the existing tests are green again.

We cannot risk that a future change will fail and leave us with an inconsistent state. We should make sure that all config values are updated at the same time, and that they're not updated while being used.

Example:

        if now - self.last_disk_space_check < self.disk_space_check_interval:
            return

        bytes_free = self.get_disk_bytes_free()
        if not self.receiver_paused:
            if bytes_free < self.min_disk_space:

what happens if one section of the code runs with old version of disk_space_check_interval, and the other with an updated version of min_disk_space?

Possibly nothing, but the code is prone to failure.

codecov[bot] commented 2 years ago

Codecov Report

Merging #530 (bc42f6b) into main (2588f4b) will decrease coverage by 0.37%. The diff coverage is 75.00%.

:exclamation: Current head bc42f6b differs from pull request most recent head ccb6055. Consider uploading reports for the commit ccb6055 to get more accurate results

Additional details and impacted files [![Impacted file tree graph](https://codecov.io/gh/aiven/pghoard/pull/530/graphs/tree.svg?width=650&height=150&src=pr&token=nLr7M7hvCx&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiven)](https://codecov.io/gh/aiven/pghoard/pull/530?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiven) ```diff @@ Coverage Diff @@ ## main #530 +/- ## ========================================== - Coverage 91.25% 90.87% -0.38% ========================================== Files 30 30 Lines 4607 4614 +7 ========================================== - Hits 4204 4193 -11 - Misses 403 421 +18 ``` | [Impacted Files](https://codecov.io/gh/aiven/pghoard/pull/530?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiven) | Coverage Δ | | |---|---|---| | [pghoard/pghoard.py](https://codecov.io/gh/aiven/pghoard/pull/530?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiven#diff-cGdob2FyZC9wZ2hvYXJkLnB5) | `84.14% <0.00%> (-0.60%)` | :arrow_down: | | [pghoard/receivexlog.py](https://codecov.io/gh/aiven/pghoard/pull/530?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiven#diff-cGdob2FyZC9yZWNlaXZleGxvZy5weQ==) | `92.92% <85.71%> (-2.20%)` | :arrow_down: | | [pghoard/basebackup/base.py](https://codecov.io/gh/aiven/pghoard/pull/530?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiven#diff-cGdob2FyZC9iYXNlYmFja3VwL2Jhc2UucHk=) | `85.58% <0.00%> (-6.62%)` | :arrow_down: | | [pghoard/walreceiver.py](https://codecov.io/gh/aiven/pghoard/pull/530?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiven#diff-cGdob2FyZC93YWxyZWNlaXZlci5weQ==) | `97.85% <0.00%> (-0.28%)` | :arrow_down: | | [pghoard/compressor.py](https://codecov.io/gh/aiven/pghoard/pull/530?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiven#diff-cGdob2FyZC9jb21wcmVzc29yLnB5) | `93.07% <0.00%> (-0.06%)` | :arrow_down: | | [pghoard/wal.py](https://codecov.io/gh/aiven/pghoard/pull/530?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiven#diff-cGdob2FyZC93YWwucHk=) | `100.00% <0.00%> (ø)` | | | [pghoard/basebackup/delta.py](https://codecov.io/gh/aiven/pghoard/pull/530?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiven#diff-cGdob2FyZC9iYXNlYmFja3VwL2RlbHRhLnB5) | `95.86% <0.00%> (ø)` | | | [pghoard/restore.py](https://codecov.io/gh/aiven/pghoard/pull/530?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiven#diff-cGdob2FyZC9yZXN0b3JlLnB5) | `90.85% <0.00%> (+0.24%)` | :arrow_up: | | [pghoard/webserver.py](https://codecov.io/gh/aiven/pghoard/pull/530?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiven#diff-cGdob2FyZC93ZWJzZXJ2ZXIucHk=) | `89.36% <0.00%> (+0.39%)` | :arrow_up: | | ... and [1 more](https://codecov.io/gh/aiven/pghoard/pull/530?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=aiven) | |
nicois commented 2 years ago

what happens if one section of the code runs with old version of disk_space_check_interval, and the other with an updated version of min_disk_space?

Possibly nothing, but the code is prone to failure.

I've updated the code so new configuration changes are only assimilated at a specific point, meaning during each cycle the configuration values will remain consistent.

ghost commented 2 years ago

@nicois please remember to re-request review after addressing comments, otherwise this can slip out of our view.

Please fix conflicts, then we'll make sure to have this reviewed.