nats-io / nats.c

A C client for NATS
Apache License 2.0
382 stars 132 forks source link

[FIXED] cleaning up sanitize=thread found several races #771

Closed levb closed 1 month ago

levb commented 1 month ago

note: I've been working on a bigger PR for async fetch (aka Consume), it is not ready yet but since it will be a big PR I'll try to send in parts that are ready to simplify the review. This is part 1.

Before this PR warnings generated by-fsanitize=thread in CI were ignored, since the detailed test logs were dropped and not scanned. This PR adds the log scanning.

It also uncovered several issues with the existing tests, all of them but one were relatively trivial. In Test_ForceReconnect detected a race caused by us forecefully close-ing an fd which is still being read on by the readloop. I could not think of a simple solution that would not have performance implications, so I disabled the test for thread sanitizer. The code seems to be doing the right thing anyway, and the readloop thread would error out, as designed.

The following tests were fixed:

This PR also adds a few more CI jobs, including (experimentally for now) DEV_MODE. I will remove the DEV_MODE jobs if they prove flaky and not easily fixable. Same for the (added) debug/sanitize jobs, will monitor and remove/fix if too slow/flaky.

codecov[bot] commented 1 month ago

Codecov Report

Attention: Patch coverage is 28.57143% with 15 lines in your changes missing coverage. Please review.

Project coverage is 68.79%. Comparing base (1553d4a) to head (bf65830). Report is 2 commits behind head on main.

Files Patch % Lines
src/js.c 20.00% 3 Missing and 5 partials :warning:
src/conn.c 36.36% 3 Missing and 4 partials :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #771 +/- ## ========================================== + Coverage 68.71% 68.79% +0.08% ========================================== Files 39 39 Lines 15207 15227 +20 Branches 3143 3153 +10 ========================================== + Hits 10449 10476 +27 + Misses 1700 1684 -16 - Partials 3058 3067 +9 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

levb commented 1 month ago

@kozlovic The travis build still does sanitize=thread for clang. I tried to enable gcc, and got the same cmake configuration-time failure, so disabled it back. I added a fuller matrix of sanitize to GH Actions, will see what it produces.

levb commented 1 month ago

@kozlovic The fuller matrix found a couple more, in the code, missing libDlvWorker lock; please check.