Haufe-Lexware / wicked.haufe.io

An API Management system based on Mashape Kong
http://wicked.haufe.io

Two instances of wicked using same database are killing each other #190

Closed (fromanu closed this issue 4 years ago)

fromanu commented 5 years ago

When two k8s clusters each run an instance of wicked with the same configuration and share the same Postgres database (Azure Database for PostgreSQL), the two wicked instances kill each other. The following messages appear in the logs of the api pods in both clusters, and the pods continuously fail and restart:

info: [+1176ms] portal-api:dao:pg:verifications Running verification record reconciliation.
warn: [+ 7ms] portal-api:principal Detected an updated config hash in the database, exiting: Voz5zf7eQY1Tm4I1pTvBefiCbL4= !== RxSssgtjsX6ipW77o6kR2W3fhxk=
error: [+ 0ms] portal-api:principal Force Quit API

DonMartin76 commented 5 years ago

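This happens for the following reason: wicked calculates a configuration "hash" over the static configuration. This hashing mechanism also included the .git folder, which in turn contains two files, index and HEAD, that are always written with dynamic content when a git clone is done. Two instances that each clone the same repository therefore end up with different hashes, and each one sees the other's hash in the database as an updated configuration.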

You can call this a regression: it was introduced when the hash calculation was changed for wicked 1.0.0-rc.1 to use Node-internal hashing instead of relying on shasum inside the API container.
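To illustrate the idea, here is a minimal Python sketch of a directory hash that skips the .git folder. It is not wicked's actual implementation (wicked runs on Node.js). The function names, the file name globals.json, and the choice of a base64-encoded SHA-1 digest are assumptions made for the example.

```python
import base64
import hashlib
import os
import tempfile


def config_hash(root, excluded_dirs=(".git",)):
    """Return a base64 SHA-1 digest over relative paths and file contents under root."""
    digest = hashlib.sha1()
    for current, dirs, files in os.walk(root):
        # Prune excluded directories in place; sort for a stable traversal order.
        dirs[:] = sorted(d for d in dirs if d not in excluded_dirs)
        for name in sorted(files):
            path = os.path.join(current, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            digest.update(rel.encode("utf-8"))
            digest.update(b"\0")
            with open(path, "rb") as handle:
                digest.update(handle.read())
            digest.update(b"\0")
    return base64.b64encode(digest.digest()).decode("ascii")


def make_clone(index_bytes):
    """Create a throwaway directory that imitates a fresh clone (hypothetical layout)."""
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, ".git"))
    with open(os.path.join(root, "globals.json"), "w", encoding="utf-8") as handle:
        handle.write('{"title": "demo"}')
    # .git/index differs between clones even when the configuration is identical.
    with open(os.path.join(root, ".git", "index"), "wb") as handle:
        handle.write(index_bytes)
    return root


clone_a = make_clone(b"index written by clone A")
clone_b = make_clone(b"index written by clone B")

# With .git excluded, both clones agree on the hash.
print(config_hash(clone_a) == config_hash(clone_b))
# With .git included, the hashes diverge and the instances would restart each other.
print(config_hash(clone_a, excluded_dirs=()) == config_hash(clone_b, excluded_dirs=()))
```

Sorting the directory and file names keeps the result independent of the order in which the file system lists entries, which matters when the same configuration is hashed on different machines.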

DonMartin76 commented 5 years ago

Some edge cases are unfortunately still not working.

DonMartin76 commented 5 years ago

It works.

DonMartin76 commented 4 years ago

More edge cases not working: if there is a configuration upgrade step between versions, competing wicked versions will continuously restart each other, because the configuration hash is calculated on the upgraded configuration, not on the original content from e.g. the git repository. The configuration hash has to be calculated before the configuration is upgraded.
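The ordering problem can be sketched in a few lines of Python. This is an illustration under assumptions, not wicked's code: the configuration is modelled as a dict, the two upgrade functions stand in for two hypothetical wicked versions, and SHA-256 over canonical JSON is an arbitrary choice of hash.

```python
import hashlib
import json


def hash_config(config):
    """Hash a configuration dict via a canonical JSON encoding."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def upgrade_old_version(config):
    """Hypothetical older version: no upgrade step, configuration is used as-is."""
    return dict(config)


def upgrade_new_version(config):
    """Hypothetical newer version: its upgrade step adds a field."""
    upgraded = dict(config)
    upgraded["configVersion"] = 2
    return upgraded


def startup_hash(raw_config, upgrade, hash_before_upgrade):
    """Return the hash an instance would write to the shared database."""
    if hash_before_upgrade:
        # Hash the original content first, then upgrade.
        result = hash_config(raw_config)
        upgrade(raw_config)
        return result
    # Problematic order: upgrade first, then hash the upgraded content.
    return hash_config(upgrade(raw_config))


raw = {"title": "demo portal"}

# Hashing after the upgrade: the two versions disagree and keep restarting each other.
print(startup_hash(raw, upgrade_old_version, False) == startup_hash(raw, upgrade_new_version, False))
# Hashing before the upgrade: both versions agree on the hash of the original content.
print(startup_hash(raw, upgrade_old_version, True) == startup_hash(raw, upgrade_new_version, True))
```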