hashicorp / hcat

Hashicorp Configuration and Templating library (hcat, pronounced hashicat)
Mozilla Public License 2.0

Hcat does not reset index for blocking queries with Consul when Consul restarts #103

Closed. wilkermichael closed this issue 2 years ago

wilkermichael commented 2 years ago

Summary

When a Consul instance is restarted while Hcat is connected to it, Hcat can reuse its previous index when it reconnects, causing the next query to block.

Investigation

In view.go, when Hcat suspects Consul was restarted, it is supposed to reset the index: https://github.com/hashicorp/hcat/blob/4bf0597dd979d16decd303ab52bce5abeebec63e/view.go#L218-L227

This case is only covered when the error contains the expected string "connection refused". While Consul is shutting down, it instead returns an error about gRPC closing. That error does not match the expected string, so the index is not reset to 0. If Consul starts up again immediately, Hcat can then reconnect using the previous index, and the query blocks on that stale index instead of returning new changes.
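To illustrate the gap, here is a minimal sketch of a substring-based check. It is not the code in view.go: `matchesRestart` and `stubErr` are hypothetical names, and both error messages are placeholders, since the exact text Consul returns during shutdown is not quoted in this issue.

```go
package main

import (
	"fmt"
	"strings"
)

// stubErr is a hypothetical error type used only to build example messages.
type stubErr string

func (e stubErr) Error() string { return string(e) }

// matchesRestart mimics a string-based restart check: only errors whose
// message contains "connection refused" are treated as a Consul restart.
func matchesRestart(err error) bool {
	return err != nil && strings.Contains(err.Error(), "connection refused")
}

func main() {
	refused := stubErr("dial tcp 127.0.0.1:8500: connect: connection refused")
	// Placeholder for the shutdown-time error; the real message may differ.
	closing := stubErr("rpc error: grpc connection is closing")

	fmt.Println(matchesRestart(refused)) // true
	fmt.Println(matchesRestart(closing)) // false
}
```

Under these assumptions, the second error passes through without being treated as a restart, so the index would be kept.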

Solution

Update https://github.com/hashicorp/hcat/blob/4bf0597dd979d16decd303ab52bce5abeebec63e/view.go#L218 to use the Consul API's StatusError type:

type StatusError struct {
    Code int
    Body string
}

https://github.com/hashicorp/consul/blob/fed112e51ee38eee5eb7d7d46bf9b3dc308b70cf/api/api.go#L85-L88

Reset the index on status code 500 rather than on a particular error string.
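A rough sketch of what a status-code check could look like. To stay self-contained it declares a local `StatusError` that mirrors the two fields quoted above instead of importing consul/api. The `Error` method text, the `nextIndex` helper, and the example values are assumptions for illustration, not the actual change to view.go.

```go
package main

import (
	"errors"
	"fmt"
)

// StatusError is a local stand-in with the same fields as the type quoted
// above; real code would use the type from the consul/api package.
type StatusError struct {
	Code int
	Body string
}

// Error makes the stand-in usable as an error; the message format is assumed.
func (e StatusError) Error() string {
	return fmt.Sprintf("unexpected response code: %d (%s)", e.Code, e.Body)
}

// nextIndex is a hypothetical helper: it returns 0 when err is, or wraps,
// a StatusError with code 500, and otherwise keeps the current index.
func nextIndex(current uint64, err error) uint64 {
	var se StatusError
	if errors.As(err, &se) && se.Code == 500 {
		return 0
	}
	return current
}

func main() {
	wrapped := fmt.Errorf("query failed: %w", StatusError{Code: 500, Body: "placeholder"})
	fmt.Println(nextIndex(42, wrapped))               // 0
	fmt.Println(nextIndex(42, errors.New("timeout"))) // 42
}
```

Using `errors.As` rather than a type assertion means the check still works if the error is wrapped somewhere between the API client and the caller.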

wilkermichael commented 2 years ago

Using StatusError requires upgrading the consul/api package.