GSA / data.gov

Main repository for the data.gov service
https://data.gov

Error 1049: Unknown database 'servicebroker' #3777

Closed nickumia-reisys closed 2 years ago

nickumia-reisys commented 2 years ago

cloud.gov datagov-ssb @ b12dd98da5f659e3ee275ecd288b738cc61d6663

How it was discovered

  1. cf t -s management
  2. ssb-solrcloud was running (we had changed ssb-solrcloud-k8s successfully several times over the previous few days)
  3. Attempted to swap the EKS backend:
    • cf ds ssb-solrcloud-k8s
    • cf create-user-provided-service ssb-solrcloud-k8s -p k8s-xxx.json
    • cf bs ssb-solrcloud ssb-solrcloud-k8s
    • cf rs ssb-solrcloud
  4. Checked the logs (cf logs ssb-solrcloud --recent)
  5. Noticed this error repeatedly in the logs:
    2022-04-13T10:38:01.96-0400 [APP/PROC/WEB/0] ERR {"timestamp":"1649860681.965931177","source":"cloud-service-broker","message":"cloud-service-broker.Database Setup","log_level":2,"data":{"error":"Error 1049: Unknown database 'servicebroker'"}}
    2022-04-13T10:38:01.96-0400 [APP/PROC/WEB/0] OUT [error] failed to initialize database, got error Error 1049: Unknown database 'servicebroker'
    2022-04-13T10:38:01.96-0400 [APP/PROC/WEB/0] OUT {"timestamp":"1649860681.965931177","source":"cloud-service-broker","message":"cloud-service-broker.Database Setup","log_level":2,"data":{"error":"Error 1049: Unknown database 'servicebroker'"}}
    2022-04-13T10:38:01.97-0400 [APP/PROC/WEB/0] OUT Exit status 1

Expected behavior

Upon restart, ssb-solrcloud should be working again.

Actual behavior

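After the restart, the broker could not initialize its database (Error 1049: Unknown database 'servicebroker') and exited with status 1.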

Sketch

Not sure.

nickumia-reisys commented 2 years ago

I tried to force the SSB to switch to a different DB by renaming the DB before our recent deployment, but it didn't work. Because the resource had the same ID, the terraform plan just renamed the old DB to the correct name.

If the issue persists after the deployment, we will have to diagnose the DB, or delete and recreate it to see whether the error comes back.

Edit: Deleting the DB comes with the caveat that the broker will no longer be able to manage any of the Solr clusters that were recorded in the DB.

nickumia-reisys commented 2 years ago

The Problem

After investigating the cloud-service-broker configuration for ssb-solrcloud, it appears that DB_HOST, DB_USERNAME and DB_PASSWORD were set, but DB_NAME was not. As a result, DB_NAME defaulted to servicebroker, which is not the name of the database in question.
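
To make the failure mode concrete, here is a minimal shell sketch of the fallback described above. It is not the broker's actual code: the function names effective_db_name and check_db_env are hypothetical, and the default value servicebroker is assumed from the error message in the logs.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of cloud-service-broker): prints the database
# name that would be used when DB_NAME is unset or empty.
effective_db_name() {
  # Assumed default, inferred from "Unknown database 'servicebroker'".
  printf '%s\n' "${DB_NAME:-servicebroker}"
}

# Hypothetical check: returns 1 when connection settings are present but
# DB_NAME is missing, which is the partial configuration seen in this issue.
check_db_env() {
  if [ -n "${DB_HOST:-}" ] && [ -z "${DB_NAME:-}" ]; then
    echo "DB_HOST is set but DB_NAME is not; would fall back to '$(effective_db_name)'" >&2
    return 1
  fi
  return 0
}
```

Running check_db_env in an environment with DB_HOST set and DB_NAME unset reports the mismatch instead of letting the connection fail later with Error 1049.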

The Solution

Those environment variables should not need to be set (they are not set in the ssb-solr, ssb-smtp and ssb-eks brokers), so we removed them from the user-provided environment variables and restaged the ssb-solrcloud broker. This fixed the issue. However, we don't know how the variables came to be set, so we can't be sure it won't happen again.
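
For reference, the fix can be sketched with the cf CLI as follows. This is an approximation of what we did, not a transcript: the wrapper functions run and clear_db_overrides and the DRY_RUN switch are hypothetical conveniences, while cf unset-env and cf restage are the standard commands for removing a user-provided variable and restaging an app.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper: with DRY_RUN=1, print the command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Remove the user-provided DB_* overrides from a broker app, then restage it
# so the change takes effect.
clear_db_overrides() {
  local app="$1"
  local var
  for var in DB_HOST DB_USERNAME DB_PASSWORD; do
    run cf unset-env "$app" "$var"
  done
  run cf restage "$app"
}

# Usage (after targeting the right space):
#   DRY_RUN=1 clear_db_overrides ssb-solrcloud   # preview the commands
#   clear_db_overrides ssb-solrcloud             # apply them
```

Checking cf env ssb-solrcloud before and after shows whether the variables are still listed under the user-provided section.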