alan-turing-institute / data-safe-haven

https://data-safe-haven.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Error when redeploying PostgreSQL Flexible Server #1816

Closed jemrobinson closed 3 months ago

jemrobinson commented 6 months ago

:white_check_mark: Checklist

:computer: System information

:no_entry_sign: Describe the problem

If a PostgreSQL Flexible Server is destroyed (perhaps during SRE teardown), it is impossible to ever create another PostgreSQL Flexible Server with the same name. This means that an SRE cannot be torn down and then redeployed. See this Terraform thread for further discussion.

A possible fix would be to generate a reproducibly-random alphanumeric string and append this to the database name.
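A minimal sketch of that idea, assuming the suffix is derived by hashing a seed string. The function names, the seed value, and the suffix length are hypothetical and not taken from the data-safe-haven codebase. Note that a seed built only from values that are identical across deployments would reproduce the same name, so the seed would need to include something that changes on redeploy.

```python
import hashlib
import string

# Lowercase letters and digits only (an assumption about what the name allows).
ALPHABET = string.ascii_lowercase + string.digits


def reproducible_suffix(seed: str, length: int = 8) -> str:
    """Derive a fixed alphanumeric suffix from a seed string.

    The same seed always yields the same suffix, so the name is stable
    between runs that use the same seed.
    """
    digest = hashlib.sha256(seed.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big")
    chars = []
    for _ in range(length):
        # Peel off one base-36 digit of the hash per character.
        value, index = divmod(value, len(ALPHABET))
        chars.append(ALPHABET[index])
    return "".join(chars)


def server_name(base: str, seed: str) -> str:
    """Append the reproducible suffix to a base server name."""
    return f"{base}-{reproducible_suffix(seed)}"


if __name__ == "__main__":
    # Hypothetical seed; the base name is the one from the log messages.
    print(server_name("shm-green-sre-apple-db-server-guacamole", "deployment-1"))
```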

:deciduous_tree: Log messages

Relevant log messages

```none
2024-04-16 13:19:00 [ ERROR] azure-native:dbforpostgresql:Server (sre_remote_desktop_db_guacamole_server): cli.py:99
2024-04-16 13:19:00 [ ERROR] error: resource partially created but read failed autorest/azure: Service returned an error. Status=404 Code="ResourceNotFound" Message="The requested resource of type 'Microsoft.DBforPostgreSQL/flexibleServers' with name 'shm-green-sre-apple-db-server-guacamole' was not found.": Code="ServerGroupDropping" Message="Operations on a server group in dropping state are not allowed." cli.py:99
2024-04-16 13:19:00 [ ERROR]
```

:recycle: To reproduce

Deploy an SRE, tear it down, then redeploy it.

JimMadge commented 5 months ago

Could use a Pulumi random string which persists in the state.
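As a rough sketch of what this could look like, the snippet below declares a `RandomString` resource from the `pulumi_random` provider and appends its result to the server name. The helper names, the suffix length, and the 63-character name limit are assumptions for illustration, not the project's actual code. Because the value lives in the Pulumi state, it should stay fixed across updates and be regenerated only when the resource itself is recreated, for example after the stack has been destroyed.

```python
def suffixed_name(base: str, suffix: str, max_length: int = 63) -> str:
    """Join base and suffix with a hyphen, trimming the base to fit.

    max_length=63 is an assumed limit for the server name.
    """
    room = max_length - len(suffix) - 1
    return f"{base[:room].rstrip('-')}-{suffix}"


def declare_server_name(base: str):
    """Return a Pulumi Output holding the suffixed server name.

    Must be called from inside a Pulumi program.
    """
    # Imported here so suffixed_name() can be used without Pulumi installed.
    import pulumi_random

    suffix = pulumi_random.RandomString(
        f"{base}-suffix",  # hypothetical logical resource name
        length=8,
        special=False,  # no punctuation
        upper=False,  # lowercase letters and digits only
    )
    # suffix.result is an Output[str]; apply() builds the final name from it.
    return suffix.result.apply(lambda s: suffixed_name(base, s))
```

The returned `Output` could then be passed as the server name when declaring the database server resource.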

JimMadge commented 5 months ago

@craddm found this problem 1-2 weeks ago.

jemrobinson commented 4 months ago

Is this still a problem? I haven't seen it recently.

craddm commented 4 months ago

It hasn't seemed to be a big problem lately - last couple of times I've seen it happen it has fixed itself very quickly. Probably okay to close this for now.

jemrobinson commented 4 months ago

I've moved it to the next milestone so we can decide again then!