Open vivek0079 opened 1 year ago
I'm facing this issue:
{"level":"error","ts":"2024-04-19T07:05:05.519Z","msg":"Error refreshing domain cache","service":"cadence-matching","error":"ListDomains timed out. Failed to get domain rows. Error: context deadline exceeded","logging-call-at":"domainCache.go:425","stacktrace":"github.com/uber/cadence/common/log/loggerimpl.(*loggerImpl).Error\n\t/cadence/common/log/loggerimpl/logger.go:131\ngithub.com/uber/cadence/common/cache.(*domainCache).refreshLoop\n\t/cadence/common/cache/domainCache.go:425"}
Did someone find anything related to this?
Looks like the query is failing on the storage layer. How are the metrics for your Cassandra (or whatever storage you are using) looking? You might need to scale your storage up or out.
Apart from this, just to check whether your storage is running at all: are you able to run workflows?
Version of Cadence server and client (which language): This is very important for root-causing bugs.
Describe the bug: Cadence server is not able to refresh the domain cache when the Cassandra domain changes.
To Reproduce: Is the issue reproducible?
Steps to reproduce the behaviour:
{"level":"error","msg":"Error refreshing domain cache","service":"cadence-frontend","error":"gocql: no hosts available in the pool","logging-call-at":"domainCache.go:401","stacktrace":"github.com/uber/cadence/common/log/loggerimpl.(*loggerImpl).Error\n\t/cadence/common/log/loggerimpl/logger.go:134\ngithub.com/uber/cadence/common/cache.(*domainCache).refreshLoop\n\t/cadence/common/cache/domainCache.go:401"}
Expected behaviour
Screenshots / Logs:
{"level":"error","ts":"2022-08-09T10:06:26.332Z","msg":"Operation failed with internal error.","service":"cadence-frontend","error":"gocql: no hosts available in the pool","metric-scope":42,"logging-call-at":"persistenceMetricClients.go:812","stacktrace":"github.com/uber/cadence/common/log/loggerimpl.(*loggerImpl).Error\n\t/cadence/common/log/loggerimpl/logger.go:134\ngithub.com/uber/cadence/common/persistence.(*metadataPersistenceClient).updateErrorMetric\n\t/cadence/common/persistence/persistenceMetricClients.go:812\ngithub.com/uber/cadence/common/persistence.(*metadataPersistenceClient).GetMetadata\n\t/cadence/common/persistence/persistenceMetricClients.go:790\ngithub.com/uber/cadence/common/cache.(*domainCache).refreshDomainsLocked\n\t/cadence/common/cache/domainCache.go:425\ngithub.com/uber/cadence/common/cache.(*domainCache).refreshDomains\n\t/cadence/common/cache/domainCache.go:412\ngithub.com/uber/cadence/common/cache.(*domainCache).refreshLoop\n\t/cadence/common/cache/domainCache.go:396"}
{"level":"error","ts":"2022-08-09T10:06:26.332Z","msg":"Error refreshing domain cache","service":"cadence-frontend","error":"gocql: no hosts available in the pool","logging-call-at":"domainCache.go:401","stacktrace":"github.com/uber/cadence/common/log/loggerimpl.(*loggerImpl).Error\n\t/cadence/common/log/loggerimpl/logger.go:134\ngithub.com/uber/cadence/common/cache.(*domainCache).refreshLoop\n\t/cadence/common/cache/domainCache.go:401"}
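The log lines in this thread show two different error strings: "context deadline exceeded" (the ListDomains query timed out) and "gocql: no hosts available in the pool" (the driver reports no usable host). As a rough triage aid, the sketch below decodes such JSON log lines and buckets them by error text. The `logEntry` struct, the `classify` function, and the bucket names are made up for this example and are not part of Cadence; the sample lines are shortened copies of the logs above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// logEntry holds the fields of a JSON log line that this sketch
// looks at; all other fields are ignored when decoding.
type logEntry struct {
	Level   string `json:"level"`
	Service string `json:"service"`
	Msg     string `json:"msg"`
	Error   string `json:"error"`
}

// classify buckets one JSON log line by its "error" text.
// The bucket names are arbitrary labels chosen for this example.
func classify(line string) (string, error) {
	var e logEntry
	if err := json.Unmarshal([]byte(line), &e); err != nil {
		return "", err
	}
	switch {
	case strings.Contains(e.Error, "no hosts available in the pool"):
		return "no-hosts", nil
	case strings.Contains(e.Error, "context deadline exceeded"),
		strings.Contains(e.Error, "timed out"):
		return "timeout", nil
	default:
		return "other", nil
	}
}

func main() {
	// Shortened copies of the log lines posted in this issue.
	lines := []string{
		`{"level":"error","msg":"Error refreshing domain cache","service":"cadence-matching","error":"ListDomains timed out. Failed to get domain rows. Error: context deadline exceeded"}`,
		`{"level":"error","msg":"Error refreshing domain cache","service":"cadence-frontend","error":"gocql: no hosts available in the pool"}`,
	}
	for _, l := range lines {
		kind, err := classify(l)
		if err != nil {
			fmt.Println("skipping unparseable line:", err)
			continue
		}
		fmt.Println(kind)
	}
}
```

Counting these buckets per service over time may help tell a slow but reachable cluster apart from one the driver cannot connect to at all.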
Additional context: Add any other context about the problem here, e.g. stack trace, workflow history.