fleetdm / fleet

Open-source platform for IT, security, and infrastructure teams. (Linux, macOS, Chrome, Windows, cloud, data center)
https://fleetdm.com
Other
3k stars 416 forks source link

Fix flaky `TestUpdateStatsOnReplica` #20046

Closed getvictor closed 2 months ago

getvictor commented 3 months ago

The replica DB test frequently fails

Last_SQL_Errno:1146 Last_SQL_Error:Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'ANONYMOUS' at master log bin.000003, end_log_pos 53420339. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.

It appears the replica DB may be not set up correctly.

Ideas:

sharon-fdm commented 3 months ago

Hey team! Please add your planning poker estimate with Zenhub @getvictor @jacobshandling @lucasmrod @mostlikelee @RachelElysia

roperzh commented 3 months ago

@getvictor I think I found a fix for this in https://github.com/fleetdm/fleet/pull/20121. Running integration tests in parallel (https://github.com/fleetdm/fleet/pull/20085) makes this consistently fail on MySQL 8, that's why I took a look.

I think the fix works because I pushed a branch with both the parallel runs + the fix and the test passes there.

Could you take a look at the PR?

fleet-release commented 2 months ago

Set up the replica, Peaceful as morning's first light. Tests run, flawless flight.