Closed caguero closed 1 year ago
changes look good. CI failures seem unrelated. Are the flakes or do they need to be addressed?
Probably unrelated with this PR but I'm compiling Citadel on Docker as I don't fully understand why that test is failing.
Merging #415 (c277564) into ign-transport8 (cb1c95b) will decrease coverage by
0.05%
. The diff coverage is88.10%
.:exclamation: Current head c277564 differs from pull request most recent head 5b671de. Consider uploading reports for the commit 5b671de to get more accurate results
@@ Coverage Diff @@
## ign-transport8 #415 +/- ##
==================================================
- Coverage 83.61% 83.56% -0.05%
==================================================
Files 51 51
Lines 5035 5039 +4
==================================================
+ Hits 4210 4211 +1
- Misses 825 828 +3
Impacted Files | Coverage Δ | |
---|---|---|
include/gz/transport/Packet.hh | 0.00% <0.00%> (ø) |
|
...og/include/gz/transport/log/detail/QueryOptions.hh | 100.00% <ø> (ø) |
|
log/src/Batch.cc | 89.28% <ø> (ø) |
|
log/src/Descriptor.cc | 89.65% <ø> (ø) |
|
log/src/Descriptor.hh | 100.00% <ø> (ø) |
|
log/src/Log.cc | 78.72% <ø> (ø) |
|
log/src/Message.cc | 100.00% <ø> (ø) |
|
log/src/MsgIter.cc | 80.43% <ø> (ø) |
|
log/src/QualifiedTime.cc | 97.81% <ø> (ø) |
|
log/src/QueryOptions.cc | 100.00% <ø> (ø) |
|
... and 40 more |
I couldn't reproduce the issue with that test locally, so I've disabled it in Citadel.
🦟 Bug fix
Fixes #400
Summary
Discovery periodically sends topic/service heartbeats to keep all the topics/services advertised by a given node alive in the network. These messages are received by other discovery processes that keep the topics/services up to date or remove them in the absence of heartbeat updates after some time.
In order to get the list of topics/services,
gz
needs to initialize the discovery. This means that discovery should wait some time to receive all the topics/services updates to create the topic/service list. The original intent was to wait two cycles (2 * heartbeatInterval) before we consider the discovery initialized. However the current code only waits one cycle, as we were using the amount of heartbeats sent as the condition. That's essentially one cycle, which wasn't the original intent. The simple fix wait two real heartbeat cycles now.How to test it?
You can reproduce the issue without this patch. First, run the simulation:
Then, check that the
gz service -l
is consistent.This command will exit when the output is different than the last command. It should fail after a few tries. Then, repeat the testing procedure with this patch. The
watch
call should never finish, meaning the output ofgz service -l
is consistent.Note: My plan is to merge this patch from Citadel, and then, forward port it.
Checklist
codecheck
passed (See contributing)Note to maintainers: Remember to use Squash-Merge and edit the commit message to match the pull request summary while retaining
Signed-off-by
messages.