ROCm / MIOpen

AMD's Machine Intelligence Library
https://rocm.docs.amd.com/projects/MIOpen/en/latest/

GTest improvements #3140

Open CAHEK7 opened 1 month ago

CAHEK7 commented 1 month ago

This is an umbrella ticket for tracking GTest improvement work, ordered by priority.

Tickets under different checkboxes are independent and can be done in parallel, while tickets under the same checkbox must be done sequentially, in the predefined order.

junliume commented 1 month ago

I recommend adding this to the list of things to investigate:

Why does skipping a test take so long? It does not make sense to spend 3 seconds just to skip a test, IMHO.

379/8695 Test  #381: Full/GPU_Adam_FP16.AdamFloat16TestFw/adam_w input:1 lr:0.001 beta1:0.9 beta2:0.999 weight_decay:0 eps:1e-06 amsgrad:0 maximize:1 .....***Skipped   3.05 sec

CAHEK7 commented 1 month ago

@junliume because these tests initialize all their buffers first and only then skip. We have A LOT of similar issues.

It is already described in https://github.com/ROCm/MIOpen/wiki/GTest-development#early-skip

Exit from the test as early as possible. Any test-skipping functionality must reside in void SetUp() override. For example, right now 198 AddLayerNorm tests take around 20 s to skip, but if we move the skip routine to the beginning of SetUp, it takes 1 ms.

This happened because we initially added this evil pattern to all GTests, even our "smoke" tests: we initialize ALL the buffers for most of the tests that are disabled by MIOPEN_TEST_ALL or restricted to a particular data type, because most tests check MIOPEN_TEST_ALL and MIOPEN_FLOAT inside the test body, right after all the initialization.

junliume commented 1 month ago

Thank you for the detailed explanation @CAHEK7

we've added this evil pattern to all GTESTS and even for our "smoke" tests

Let's discuss ways to resolve these issues further :)