microsoft / FluidFramework

Library for building distributed, real-time collaborative web applications
https://fluidframework.com
MIT License

[GC] Fixed flaky GC stats end-to-end test #21919

Closed agarwal-navin closed 1 month ago

agarwal-navin commented 1 month ago

Bug and root-cause

A GC test that validates deleted stats is flaky. The test uses a sweep timeout of 200 ms. It does the following in sequence:

  1. Runs GC with everything referenced and validates that the unreferenced stats are 0.
  2. Unreferences one data store and one blob, runs GC and validates they show as unreferenced.
  3. Unreferences another data store and another blob, runs GC and validates they show as unreferenced.
  4. Waits for the sweep timeout, then runs GC so that the GC delete op is sent.
  5. Runs GC again and validates that the unreferenced data stores and blobs show up as deleted.

If 200 ms have passed before step 3 is run, the data store and blob unreferenced in step 2 will have become sweep ready and will show up as deleted, not unreferenced, in the GC stats, so the validation in step 3 fails. A change a few months back made the logic add sweep-ready objects as deleted even when sweep is enabled (https://github.com/microsoft/FluidFramework/pull/19524), and the test probably became flaky then.
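The timing dependence described above can be illustrated with a toy model. This is only a sketch: `NodeState`, `GcStats`, `computeStats`, and `statsAtStep3` are hypothetical names invented for illustration and are not Fluid Framework APIs. It assumes that a node counts as deleted once it has been unreferenced for at least the sweep timeout.

```typescript
// Toy model only: these types and functions are hypothetical,
// not part of the Fluid Framework GC implementation.
type NodeState = { id: string; unreferencedAtMs?: number };

interface GcStats {
    unreferenced: number;
    deleted: number;
}

// Classify nodes at time `nowMs`. A node that has been unreferenced for at
// least `sweepTimeoutMs` is sweep ready and is counted as deleted instead
// of unreferenced (the behavior the PR description attributes to #19524).
function computeStats(nodes: NodeState[], nowMs: number, sweepTimeoutMs: number): GcStats {
    const stats: GcStats = { unreferenced: 0, deleted: 0 };
    for (const node of nodes) {
        if (node.unreferencedAtMs === undefined) {
            continue; // still referenced
        }
        if (nowMs - node.unreferencedAtMs >= sweepTimeoutMs) {
            stats.deleted++;
        } else {
            stats.unreferenced++;
        }
    }
    return stats;
}

const sweepTimeoutMs = 200;

// Step 2 unreferences one node at t = 0. Step 3 unreferences a second node
// and runs GC at `step3AtMs`.
function statsAtStep3(step3AtMs: number): GcStats {
    const nodes: NodeState[] = [
        { id: "dataStore1", unreferencedAtMs: 0 },
        { id: "dataStore2", unreferencedAtMs: step3AtMs },
    ];
    return computeStats(nodes, step3AtMs, sweepTimeoutMs);
}

const fastRun = statsAtStep3(50); // both nodes still within the timeout
const slowRun = statsAtStep3(250); // the first node has passed the timeout
console.log(fastRun, slowRun);
```

In this model a fast run reports two unreferenced nodes at step 3, while a slow run reports one unreferenced and one deleted, which is the kind of mismatch that would make an assertion on unreferenced counts fail intermittently.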

Fix

Broke the test above into 2:

  1. The first one only validates unreferenced stats (steps 1 to 3 above).
  2. The second one only validates sweep ready and deleted stats. It unreferences objects, waits for the sweep timeout so that the nodes are sweep ready, and validates that they show as deleted. It then runs GC again so that the GC sweep op has round-tripped and these nodes are deleted. Since the test goes from unreferenced -> wait for sweep timeout -> validation, there is no intermediate GC run like before where a node can unexpectedly become sweep ready.
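The shape of the two resulting tests can be sketched with a virtual clock. This is an illustrative toy, not the actual test code: `ToyGc`, `unreferencedStatsTest`, and `deletedStatsTest` are hypothetical names, and the model only assumes that sweep-ready nodes are reported as deleted and are actually removed by the time of the next GC run.

```typescript
// Hypothetical toy model; none of these names come from the Fluid Framework test suite.
interface Stats {
    unreferenced: number;
    deleted: number;
}

class ToyGc {
    private nowMs = 0;
    private readonly sweepTimeoutMs: number;
    private readonly unreferencedAt = new Map<string, number>();
    private readonly swept = new Set<string>();

    constructor(sweepTimeoutMs: number) {
        this.sweepTimeoutMs = sweepTimeoutMs;
    }

    // Advance the virtual clock; time only moves when the test says so.
    advance(ms: number): void {
        this.nowMs += ms;
    }

    unreference(id: string): void {
        this.unreferencedAt.set(id, this.nowMs);
    }

    // One GC run. Sweep-ready nodes are reported as deleted and recorded as
    // swept, standing in for the sweep op round trip before the next run.
    run(): Stats {
        const stats: Stats = { unreferenced: 0, deleted: 0 };
        for (const [id, since] of this.unreferencedAt) {
            if (this.swept.has(id) || this.nowMs - since >= this.sweepTimeoutMs) {
                this.swept.add(id);
                stats.deleted++;
            } else {
                stats.unreferenced++;
            }
        }
        return stats;
    }
}

// Test 1: only unreferenced stats are validated.
function unreferencedStatsTest(): Stats {
    const gc = new ToyGc(200);
    gc.unreference("dataStore1");
    gc.unreference("blob1");
    return gc.run();
}

// Test 2: unreference -> wait for sweep timeout -> validate, then run GC again.
function deletedStatsTest(): [Stats, Stats] {
    const gc = new ToyGc(200);
    gc.unreference("dataStore1");
    gc.unreference("blob1");
    gc.advance(200);
    const sweepReady = gc.run();
    const afterSweep = gc.run();
    return [sweepReady, afterSweep];
}

const unreferencedOnly = unreferencedStatsTest();
const [sweepReady, afterSweep] = deletedStatsTest();
console.log(unreferencedOnly, sweepReady, afterSweep);
```

Because the second test never runs GC between unreferencing and the wait, its outcome does not depend on how long the earlier steps took.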

AB#8350

msfluid-bot commented 1 month ago
@fluid-example/bundle-size-tests: +245 Bytes
| Metric Name | Baseline Size | Compare Size | Size Diff |
| --- | --- | --- | --- |
| aqueduct.js | 457.6 KB | 457.63 KB | +35 Bytes |
| azureClient.js | 555.31 KB | 555.35 KB | +49 Bytes |
| connectionState.js | 680 Bytes | 680 Bytes | No change |
| containerRuntime.js | 258.68 KB | 258.7 KB | +14 Bytes |
| fluidFramework.js | 391.58 KB | 391.59 KB | +14 Bytes |
| loader.js | 134.02 KB | 134.03 KB | +14 Bytes |
| map.js | 42.17 KB | 42.18 KB | +7 Bytes |
| matrix.js | 145.68 KB | 145.69 KB | +7 Bytes |
| odspClient.js | 523.45 KB | 523.5 KB | +49 Bytes |
| odspDriver.js | 97.55 KB | 97.57 KB | +21 Bytes |
| odspPrefetchSnapshot.js | 42.61 KB | 42.62 KB | +14 Bytes |
| sharedString.js | 162.69 KB | 162.7 KB | +7 Bytes |
| sharedTree.js | 382.09 KB | 382.1 KB | +7 Bytes |
| Total Size | 3.27 MB | 3.27 MB | +245 Bytes |

Baseline commit: 3f00c6dbecaa614bae059dabd09f8e197b14dfdf

Generated by dangerJS against 9f46ec7bcba366c8cf0bdf9d18fb3129a70320ec