awslabs / amazon-kinesis-video-streams-pic

Apache License 2.0
49 stars 47 forks source link

Multiple fixes and adaption for VS test explorer #255

Closed MushMal closed 3 months ago

MushMal commented 4 months ago

Issue #, if available: None

What was changed?

Why was it changed?

How was it changed?

What testing was done for the changes?

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

MushMal commented 4 months ago

I see the PR Description Check failed but I am at a loss as to what exactly needs to be provided there and why this was flagged as "short". Help?

codecov-commenter commented 4 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 80.35%. Comparing base (e96735d) to head (97fb863).

:exclamation: Current head 97fb863 differs from pull request most recent head b69d1b9. Consider uploading reports for the commit b69d1b9 to get more accurate results

Additional details and impacted files ```diff @@ Coverage Diff @@ ## develop #255 +/- ## =========================================== - Coverage 80.36% 80.35% -0.01% =========================================== Files 53 53 Lines 10826 10823 -3 =========================================== - Hits 8700 8697 -3 Misses 2126 2126 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

hassanctech commented 4 months ago

I see the PR Description Check failed but I am at a loss as to what exactly needs to be provided there and why this was flagged as "short". Help?

I fixed it :). The issue was you were missing some italics :), unfortunately the template does an exact match of the text including the format -- easiest way to currently satisfy it is copy/paste the template from a previously passing PR and then fill in the details.

MushMal commented 4 months ago

Nice!

I looked into the TSAN that's failing it's happening across two different test runs.

The testThreadRoutine thread from the test ThreadCreateAndCancel is still running when the test for ThreadCreateAndReleaseSimpleCheckWithStack is kicked off. Due to the nature of the test itself we're not joining the thread. Since pthread_cancel returns right away we may avoid this by adding a sleep at the end of the test. Alternatively we could join the thread at the end of the function and make sure retval has PTHREAD_CANCELED .

I just looked at the test and there are few modifications I would suggest. First, if we are running 500 threads, having a mutex just to up the value might not be the best solution. An atomic counter would be the best way. But, that's not the main issue.

If I remember it correctly, GTest would create a new process to run each test so two separate tests would not conflict with each other even if they have threads with the same name. Mutex, however is another story. On Windows, mutexes are kernel objects and they will clash. That said, the process from the first test would terminate first before running the second test and it will close handle all kernel objects. This is again on the presumption that GTest is running tests sequentially. I am not sure how the test runs are configured in the jobs.

The biggest issue with the test is that it's very platform and time sensitive. TSAN will inevitably break or flag this test. I would recommend a separate PR to just fix this (and if there are other tests like this) test. The fix would be to just block indefinitely in the routine and perhaps move to use atomics. The main test body would then spin wait until the test count is reached (or use a semaphore) and would cancel the threads.

MushMal commented 4 months ago

Sorry, I missed the point about the TSAN failing. It's hard to reproduce this locally but let me try to modify these tests. Will provide another commit

MushMal commented 4 months ago

Hello, is there anything else that needs to be done for this PR? I am not sure how to restart the checks on this - I was under the assumption that it will be kicked off automagically with a new commit.

MushMal commented 4 months ago

@hassanctech, @disa6302 can you please look into this?

disa6302 commented 4 months ago

I started the run. Github has an approve and run workflow for forked PRs, that is why the CI did not start on its own.

MushMal commented 4 months ago

Looking at the TSAN failure (relevant part is attached). The TSAN output makes no sense to me. The times array is declared on LN 115

struct sleep_times st[TEST_THREAD_COUNT] = {0};

and is a local stack variable. This can not and should not interfere with the previous test run in LN 80. I must be missing something or not understanding how the tests are executed..???

2024-05-10T18:59:41.5607407Z ================== 2024-05-10T18:59:41.5607863Z WARNING: ThreadSanitizer: data race (pid=3692) 2024-05-10T18:59:41.5608534Z Write of size 8 at 0x7ffc92aa8260 by main thread: 2024-05-10T18:59:41.5609501Z #0 memset (kvspic_test+0xdd31d) (BuildId: 50a8c112fff50e2b536ef37ad34f116c7f4b341b) 2024-05-10T18:59:41.5629370Z #1 ThreadFunctionalityTest_ThreadCreateAndReleaseSimpleCheckWithStack_Test::TestBody() /home/runner/work/amazon-kinesis-video-streams-pic/amazon-kinesis-video-streams-pic/tst/utils/Thread.cpp:115:24 (kvspic_test+0x32ca20) (BuildId: 50a8c112fff50e2b536ef37ad34f116c7f4b341b) 2024-05-10T18:59:41.5636229Z #2 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) (kvspic_test+0x44442a) (BuildId: 50a8c112fff50e2b536ef37ad34f116c7f4b341b) 2024-05-10T18:59:41.5638410Z #3 (libc.so.6+0x29d8f) (BuildId: 962015aa9d133c6cbcfb31ec300596d7f44d3348) 2024-05-10T18:59:41.5639112Z 2024-05-10T18:59:41.5639433Z Previous read of size 8 at 0x7ffc92aa8260 by thread T2752: 2024-05-10T18:59:41.5641789Z #0 testThreadRoutine(void) /home/runner/work/amazon-kinesis-video-streams-pic/amazon-kinesis-video-streams-pic/tst/utils/Thread.cpp:24:28 (kvspic_test+0x32b926) (BuildId: 50a8c112fff50e2b536ef37ad34f116c7f4b341b) 2024-05-10T18:59:41.5643540Z 2024-05-10T18:59:41.5643723Z Location is stack of main thread. 2024-05-10T18:59:41.5644087Z 2024-05-10T18:59:41.5644489Z Location is global '??' at 0x7ffc92a8b000 ([stack]+0x1d260) 2024-05-10T18:59:41.5644998Z 2024-05-10T18:59:41.5645301Z Thread T2752 (tid=6474, running) created by main thread at: 2024-05-10T18:59:41.5646393Z #0 pthread_create (kvspic_test+0xd33ad) (BuildId: 50a8c112fff50e2b536ef37ad34f116c7f4b341b) 2024-05-10T18:59:41.5649074Z #1 defaultCreateThreadWithParams /home/runner/work/amazon-kinesis-video-streams-pic/amazon-kinesis-video-streams-pic/src/utils/src/Thread.c:186:14 (kvspic_test+0x3e7571) (BuildId: 50a8c112fff50e2b536ef37ad34f116c7f4b341b) 2024-05-10T18:59:41.5652264Z #2 defaultCreateThread /home/runner/work/amazon-kinesis-video-streams-pic/amazon-kinesis-video-streams-pic/src/utils/src/Thread.c:225:5 (kvspic_test+0x3e7571) 2024-05-10T18:59:41.5655919Z #3 ThreadFunctionalityTest_ThreadCreateAndCancel_Test::TestBody() /home/runner/work/amazon-kinesis-video-streams-pic/amazon-kinesis-video-streams-pic/tst/utils/Thread.cpp:80:9 (kvspic_test+0x32c1be) (BuildId: 50a8c112fff50e2b536ef37ad34f116c7f4b341b) 2024-05-10T18:59:41.5659424Z #4 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const*) (kvspic_test+0x44442a) (BuildId: 50a8c112fff50e2b536ef37ad34f116c7f4b341b) 2024-05-10T18:59:41.5661501Z #5 (libc.so.6+0x29d8f) (BuildId: 962015aa9d133c6cbcfb31ec300596d7f44d3348) 2024-05-10T18:59:41.5662200Z 2024-05-10T18:59:41.5663967Z SUMMARY: ThreadSanitizer: data race (/home/runner/work/amazon-kinesis-video-streams-pic/amazon-kinesis-video-streams-pic/build/tst/kvspic_test+0xdd31d) (BuildId: 50a8c112fff50e2b536ef37ad34f116c7f4b341b) in memset 2024-05-10T18:59:41.5665788Z ==================

MushMal commented 3 months ago

Can I request a new run with the latest?

MushMal commented 3 months ago

The run is failing for some odd reason. The format validator is failing on the following file src/mkvgen/include/com/amazonaws/kinesis/video/mkvgen/Include.h which is not part of the changes.

The mac gcc is still running and I don't see the logs to understand what's wrong. It has been running for 40 mins and I think something is wrong with the run itself.

hassanctech commented 3 months ago

Apologies for the delay here, looks great @MushMal!

Since we're making fixes for Windows I see this in the MSVC CI run logs:

[  0%] Building C object CMakeFiles/kvspic.dir/src/client/src/AckParser.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'
AckParser.c
[  1%] Building C object CMakeFiles/kvspic.dir/src/client/src/AuthIntegration.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'
AuthIntegration.c
[  1%] Building C object CMakeFiles/kvspic.dir/src/client/src/Callbacks.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'
Callbacks.c
[  2%] Building C object CMakeFiles/kvspic.dir/src/client/src/Client.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'
Client.c
D:\a\amazon-kinesis-video-streams-pic\amazon-kinesis-video-streams-pic\src\client\src\Client.c(115): warning C4102: 'CleanUp': unreferenced label
[  2%] Building C object CMakeFiles/kvspic.dir/src/client/src/ClientEvent.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'

Seems like Windows doesn't like -fPIC? Anything we can do to get rid of that for Windows, like maybe don't add -fPIC for Windows? I'll approve this as-is since this issue doesn't appear to be newly introduced.

MushMal commented 3 months ago

Apologies for the delay here, looks great @MushMal!

Since we're making fixes for Windows I see this in the MSVC CI run logs:

[  0%] Building C object CMakeFiles/kvspic.dir/src/client/src/AckParser.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'
AckParser.c
[  1%] Building C object CMakeFiles/kvspic.dir/src/client/src/AuthIntegration.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'
AuthIntegration.c
[  1%] Building C object CMakeFiles/kvspic.dir/src/client/src/Callbacks.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'
Callbacks.c
[  2%] Building C object CMakeFiles/kvspic.dir/src/client/src/Client.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'
Client.c
D:\a\amazon-kinesis-video-streams-pic\amazon-kinesis-video-streams-pic\src\client\src\Client.c(115): warning C4102: 'CleanUp': unreferenced label
[  2%] Building C object CMakeFiles/kvspic.dir/src/client/src/ClientEvent.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'

Seems like Windows doesn't like -fPIC? Anything we can do to get rid of that for Windows, like maybe don't add -fPIC for Windows? I'll approve this as-is since this issue doesn't appear to be newly introduced.

There are couple of other warnings I looked at but didn't know what to do with. As these are not actual hinderances to Windows and specially VS development, I would rather prefer getting the fixes in first and perhaps iterate on those later. I plan to enhance the code in the near future so might look into those too.