neondatabase / neon

Neon: Serverless Postgres. We separated storage and compute to offer autoscaling, code-like database branching, and scale to zero.
https://neon.tech
Apache License 2.0
14.78k stars 430 forks source link

tests: tenant attach and lack of polling of `attachment_status` #5774

Closed koivunej closed 6 months ago

koivunej commented 11 months ago

Test failure after attach; it's a real_s3 with simulate_failures=1 but looking around that test[^1] and our test suite, none of the tenant attaches have polling of attachment_status, some of the tests (not all) are polling until tenant is active.

Related: #5698 (recently merged) Related: https://neondb.slack.com/archives/C0634NXQ6E7/p1698419362155059 (#investigation-2023-10-27-stuck-project...)

Instead of jumping into implementing this, is there are a reason why we are not dogfooding the attachment_status? I think a clear solution would be to do the things we list in openapi.yml, in this case, poll until attachment_status == "attached", otherwise we risk drift between "some other check" vs. what we expect the control plane to do, and clean up all "tenant is active" polling.

[^1]: The failing assert, logs show metrics reading before attaching completes

problame commented 11 months ago

This was discussed on Slack: https://neondb.slack.com/archives/C033RQ5SPDH/p1699026222461769?thread_ts=1698950964.766459&cid=C033RQ5SPDH

tl;dr:

I.e., attachment_status was never intended as as general readiness field, but only as a field to indicate progress of the /attach call.

Instead of jumping into implementing this, is there are a reason why we are not dogfooding the attachment_status? I think a clear solution would be to do the things we list in openapi.yml, in this case, poll until attachment_status == "attached", otherwise we risk drift between "some other check" vs. what we expect the control plane to do, and clean up all "tenant is active" polling.

Yeah, we should have been polling for attachment status in the tests. That'd be better, than, say, polling for state = Active. Then again, the /attach API will be gone soon, so, I don't think it's worth the time to fix up the tests.

koivunej commented 8 months ago

This is probably irrelevant now with the generation service.