anupambhatnagar opened this issue 2 years ago (status: Open)
The layer memory tracker test also seems to be flaky:
tests.experimental.tooling.test_layer_memory_tracker
> assert summary.total_forward_allocations >= summary.total_activation_allocations
E assert 77056000 >= 77070864
E + where 77056000 = LayerwiseMemoryTrackerSummary(max_memory_allocated=104022528, max_memory_cached=134217728, total_activation_allocation... is_forward=True, all_gathered=0, cumul_all_gathered=0, event=TraceForwardEvent(memory_diff=0, memory_activations=0))]).total_forward_allocations
E + and 77070864 = LayerwiseMemoryTrackerSummary(max_memory_allocated=104022528, max_memory_cached=134217728, total_activation_allocation... is_forward=True, all_gathered=0, cumul_all_gathered=0, event=TraceForwardEvent(memory_diff=0, memory_activations=0))]).total_activation_allocations
tests/experimental/tooling/test_layer_memory_tracker.py:65: AssertionError
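For context on the failure above: the two totals differ by only 14864 bytes (roughly 0.02%), so the assertion fails by a very small margin. One possible stopgap, not a root-cause fix, would be to compare with a small tolerance. Below is a minimal sketch using only the numbers from the log; `approx_ge` and the `rel_tol` default are hypothetical and are not part of fairscale.

```python
def approx_ge(lhs: int, rhs: int, rel_tol: float = 1e-3) -> bool:
    """Return True if lhs >= rhs, allowing lhs to fall short by rel_tol * |rhs|.

    Hypothetical helper for illustration; rel_tol is an arbitrary choice.
    """
    return lhs >= rhs - rel_tol * abs(rhs)


# Values copied from the assertion failure in the log above.
total_forward_allocations = 77056000
total_activation_allocations = 77070864

# How far the forward total falls short of the activation total, in bytes.
shortfall = total_activation_allocations - total_forward_allocations
relative_shortfall = shortfall / total_activation_allocations

# The strict comparison from the test fails for these values...
strict_ok = total_forward_allocations >= total_activation_allocations
# ...while a tolerant comparison would accept them.
tolerant_ok = approx_ge(total_forward_allocations, total_activation_allocations)

print(f"shortfall: {shortfall} bytes ({relative_shortfall:.4%})")
print(f"strict: {strict_ok}, tolerant: {tolerant_ok}")
```

A tolerance would only hide the symptom. Whether the mismatch comes from allocator rounding or from something nondeterministic in the tracker still needs to be checked before loosening the assertion.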
Another flaky test for the list: tests/nn/data_parallel/test_fsdp.py: TestSerialization
Remember to address the TODO in https://github.com/facebookresearch/fairscale/pull/933/files/86d45de0e14c8273916c3c68db8c297bc3bb59a8#diff-bc534f971a86e11cc30be248c89989c8024c941a855d4716025da461fcd29047
Thanks to Anjali for suggesting that we record this.
Here is a list of flaky tests that we should fix in our next fix-a-thon.