fimanishi closed this pull request 1 week ago
Attention: Patch coverage is 80.00000% with 4 lines in your changes missing coverage. Please review.

Project coverage is 72.65%. Comparing base (91c09ef) to head (9040f45). Report is 2 commits behind head on master.

:exclamation: Current head 9040f45 differs from pull request most recent head 87ee3e4. Please upload reports for the commit 87ee3e4 to get more accurate results.
| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| common/domain/handler.go | 26 | 30 | 86.67% |
| common/constants.go | 0 | 8 | 0.0% |
| **Total:** | 27 | 39 | 69.23% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| common/task/weighted_round_robin_task_scheduler.go | 2 | 89.05% |
| service/matching/tasklist/task_list_manager.go | 2 | 77.05% |
| common/persistence/historyManager.go | 2 | 66.67% |
| service/history/task/task.go | 3 | 84.81% |
| service/history/task/timer_standby_task_executor.go | 3 | 85.63% |
| service/history/task/transfer_active_task_executor.go | 4 | 72.77% |
| service/history/execution/cache.go | 6 | 74.61% |
| service/history/task/task_util.go | 24 | 69.43% |
| **Total:** | 46 | |

| Totals | |
|---|---|
| Change from base Build 01902d95-5925-47f2-80d9-47cee745b657: | -0.001% |
| Covered Lines: | 106672 |
| Relevant Lines: | 149198 |
| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| common/domain/handler.go | 26 | 30 | 86.67% |
| common/constants.go | 0 | 8 | 0.0% |
| **Total:** | 27 | 39 | 69.23% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| service/history/queue/timer_queue_processor_base.go | 1 | 78.28% |
| common/task/weighted_round_robin_task_scheduler.go | 2 | 89.05% |
| common/task/parallel_task_processor.go | 2 | 93.06% |
| service/matching/tasklist/task_list_manager.go | 2 | 76.65% |
| service/matching/tasklist/matcher.go | 2 | 90.91% |
| common/persistence/historyManager.go | 2 | 66.67% |
| common/persistence/statsComputer.go | 3 | 98.21% |
| service/history/task/fetcher.go | 3 | 86.6% |
| common/types/history.go | 4 | 45.35% |
| common/task/fifo_task_scheduler.go | 4 | 83.51% |
| **Total:** | 30 | |

| Totals | |
|---|---|
| Change from base Build 0190377e-f43d-435a-8a67-a08ecb9832b7: | 0.02% |
| Covered Lines: | 106707 |
| Relevant Lines: | 149198 |
It looks ok, but do we have a project on audit logging all domain changes? This is definitely a good enough solution to cover some simple questions about when the domain failed over, but I'm not sure if it is the only question that we have about domain changes.
I don't know if I responded to this question elsewhere, so adding here as well: I 100% agree, this is not at all a serious attempt to track and audit changes - configuration changes, security-relevant changes, and similar. I think there's a very good argument to be made for creating such a stream - for things like this, for access logs, for long-term tracking, and similar.
Such an architecture should probably be different, though. I can't imagine it'd be something we really want to leave in CAAS; it's much better suited to be just another stream in Kafka, shoved into Hive for a while, or something like that.
However, such an event log is not currently the main focus of what we're working on. I think it's a good idea and that there's probably value in us doing it, particularly as we focus more on the authorization-by-default components.
This tiny user-facing log of events is just enough to get us over the line for the failover project, and it shouldn't preclude us from doing a more serious event log in the future.
Persist failover history in DomainInfo data
What changed? Added functionality to persist recent failover event data in the DomainInfo whenever a valid failover is executed.

A `FailoverEvent` contains the failover `timestamp`, `fromCluster`, `toCluster`, and `FailoverType` ("Force"/"Grace") information. `FailoverHistory` is the key in the `DomainInfo` `data`. It's a slice stored as a string containing the `FailoverEvents`, with max size defined by `dynamicconfig.FrontendFailoverHistoryMaxSize`, which has a default value of 5 and supports a domain filter. `FailoverHistory` always keeps the n most recent `FailoverEvents`, sorted by descending timestamp.

Why? Persist failover information, improving failover visibility for clients and the Cadence team.
How did you test it? Unit tests and integration tests. Tested locally by triggering failovers in a multi-cluster Cadence environment with replication.
Potential risks The change does not affect the logic of UpdateDomain; it only adds the failover info to the DomainInfo data. The main risk is that we introduce something that can cause the code to panic. Errors while adding the FailoverHistory are logged as warnings and do not interrupt the UpdateDomain action.
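The non-fatal error handling described in the risks section follows a common Go pattern: log the failure as a warning and continue, rather than propagating the error. A minimal sketch of that pattern, assuming hypothetical helpers `updateFailoverHistory` and `UpdateDomain` (not the real Cadence signatures):

```go
package main

import (
	"errors"
	"log"
)

// updateFailoverHistory is a stand-in for the real persistence helper;
// it always fails here just to exercise the non-fatal path.
func updateFailoverHistory() error {
	return errors.New("could not append failover event")
}

// UpdateDomain proceeds even when recording history fails: the error
// is logged as a warning and deliberately not returned, so the main
// domain update is never interrupted by the bookkeeping step.
func UpdateDomain() error {
	if err := updateFailoverHistory(); err != nil {
		log.Printf("warn: failed to persist failover history: %v", err)
		// intentionally not returning err
	}
	// ... the rest of the domain update would run here ...
	return nil
}

func main() {
	if err := UpdateDomain(); err != nil {
		log.Fatal(err)
	}
	log.Println("UpdateDomain completed")
}
```

The trade-off is that a persistent bug in the history path could fail silently apart from warning logs, which is why keeping the write path panic-free matters.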
Release notes
Documentation Changes