neondatabase / neon

Neon: Serverless Postgres. We separated storage and compute to offer autoscaling, code-like database branching, and scale to zero.
https://neon.tech
Apache License 2.0
15.15k stars, 442 forks

refactor(pageserver): check layer map valid in one place #9051

Closed: skyzh closed this 1 month ago

skyzh commented 2 months ago

Problem

We have 3 places where we implement layer map checks.

Summary of changes

A single check function is now called from all of those places.
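As a rough illustration of the refactor described above, here is a minimal sketch of one shared validity check with thin call sites. Everything in it is an assumption made for illustration: the `LayerDesc` type, the `u64` key and LSN ranges, the function names, and the invariant itself (layers whose key ranges overlap must have LSN ranges that are either disjoint or identical). It is not the code from this PR.

```rust
use std::ops::Range;

/// Hypothetical, simplified layer descriptor (not the pageserver's real type).
#[derive(Debug, Clone)]
pub struct LayerDesc {
    pub name: String,
    pub key_range: Range<u64>,
    pub lsn_range: Range<u64>,
}

/// True if two half-open ranges share at least one element.
fn overlaps(a: &Range<u64>, b: &Range<u64>) -> bool {
    a.start < b.end && b.start < a.end
}

/// The one place where the (illustrative) invariant is written down.
/// Returns a description of the first violation found.
pub fn check_valid_layer_map(layers: &[LayerDesc]) -> Result<(), String> {
    // Pass 1: every layer must cover a non-empty key and LSN range.
    for layer in layers {
        if layer.key_range.is_empty() || layer.lsn_range.is_empty() {
            return Err(format!("layer {} has an empty range", layer.name));
        }
    }
    // Pass 2: layers overlapping in key space must not partially overlap in LSN.
    for (i, a) in layers.iter().enumerate() {
        for b in &layers[i + 1..] {
            let partial_lsn_overlap =
                overlaps(&a.lsn_range, &b.lsn_range) && a.lsn_range != b.lsn_range;
            if overlaps(&a.key_range, &b.key_range) && partial_lsn_overlap {
                return Err(format!(
                    "layers {} and {} overlap in keys with partially overlapping LSN ranges",
                    a.name, b.name
                ));
            }
        }
    }
    Ok(())
}

/// Hypothetical call site 1: after compaction produces a new set of layers.
pub fn finish_compaction(new_layers: &[LayerDesc]) -> Result<(), String> {
    check_valid_layer_map(new_layers)
}

/// Hypothetical call site 2: an offline scrubbing tool validating a timeline.
pub fn scrub_timeline(layers: &[LayerDesc]) -> Result<(), String> {
    check_valid_layer_map(layers)
}
```

With this shape, a change to the invariant touches only `check_valid_layer_map`, and every caller picks it up automatically.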

github-actions[bot] commented 2 months ago

5020 tests run: 4856 passed, 0 failed, 164 skipped (full report)


Flaky tests (8)

#### Postgres 17

- `test_pageserver_compaction_smoke`: [release-arm64](https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9051/10947226909/index.html#suites/f08716c3eb261a35d11a79ed9535ded6/18240e6e58c497ff/retries)
- `test_ondemand_wal_download_in_replication_slot_funcs`: [release-x86-64](https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9051/10947226909/index.html#suites/180444c850d4a41d41eb0a410dc16d84/a15d7f525bcecdd1/retries)
- `test_scrubber_physical_gc[4]`: [debug-x86-64](https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9051/10947226909/index.html#suites/616e84f65c91fe4bc748db7447d35268/136c7c7b403cd6b7/retries)

#### Postgres 16

- `test_slots_and_branching`: [release-x86-64](https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9051/10947226909/index.html#suites/180444c850d4a41d41eb0a410dc16d84/9713a8e993a58242/retries)
- `test_obsolete_slot_drop[cross-validation]`: [release-arm64](https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9051/10947226909/index.html#suites/180444c850d4a41d41eb0a410dc16d84/5aa6775906f58a8e/retries)

#### Postgres 15

- `test_layer_download_cancelled_by_config_location`: [release-arm64](https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9051/10947226909/index.html#suites/b97efae3a617afb71cb8142f5afa5224/6a69b19d74d2fd63/retries)
- `test_scrubber_physical_gc[4]`: [release-arm64](https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9051/10947226909/index.html#suites/616e84f65c91fe4bc748db7447d35268/bbe982ae19534437/retries)
- `test_subscriber_restart`: [release-x86-64](https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9051/10947226909/index.html#suites/8be0c222d5601535470e7e5978bbfb03/9e07b2f8d0678a6f/retries)

Code coverage* (full report)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
f8d8f5b90fa487a9e82c42da223f012f5d4fece7 at 2024-09-19T20:39:10.603Z :recycle:
skyzh commented 1 month ago

> I was a bit worried you were thinking of doing this with the prod code, approved with the added `cfg(test)`.

The code is also used in storage-scrubber so it's not test only...

koivunej commented 1 month ago

> `test_layer_download_timeouted`: release-x86-64

This looks surprising; I'll take a look. You can just re-run the failed job or push, since it's unrelated, but I haven't seen this failure mode before.