eclipse-cyclonedds / cyclonedds

Eclipse Cyclone DDS project
https://projects.eclipse.org/projects/iot.cyclonedds
Other
799 stars 349 forks source link

Event thread is freezed when time is changed to past #1995

Open silvara opened 1 month ago

silvara commented 1 month ago

When the time is changed to past (we changed time to 2 years ago), the event thread is freezed on xevent_thread - ddsrt_cond_waitfor. Because of this, the discovery is disconnected due to the participant lease duration after changing time to past.

Below is my temporary solution to fix it.

The ddsrt_cond_waitfor function is not implemented as monotonic on src/ddsrt/src/sync/posix/sync.c, it just calculates dds_time() + reltime and calls ddsrt_cond_waituntil.

The monotonic ddsrt_cond_waitfor can be implemented like below. (However, this is only implemented using pthread, and if you want to be compatible with other code, you should use other code. Like adding ddsrt_cond_init with is_monotonic parameter or ddsrt_cond_init_w_monotonic and so on.)

diff --git a/src/core/ddsi/src/q_xevent.c b/src/core/ddsi/src/q_xevent.c
index 9f8a79a2..fc3d6646 100644
--- a/src/core/ddsi/src/q_xevent.c
+++ b/src/core/ddsi/src/q_xevent.c
@@ -11,6 +11,7 @@
  */
 #include <math.h>
 #include <stdlib.h>
+#include <pthread.h>

 #include "dds/ddsrt/atomics.h"
 #include "dds/ddsrt/heap.h"
@@ -522,7 +523,13 @@ struct xeventq * xeventq_new (struct ddsi_domaingv *gv, size_t max_queued_rexmit
   evq->queued_rexmit_msgs = 0;
   evq->gv = gv;
   ddsrt_mutex_init (&evq->lock);
-  ddsrt_cond_init (&evq->cond);
+  // ddsrt_cond_init (&evq->cond);
+
+  pthread_condattr_t attr;
+  pthread_condattr_init(&attr);
+  pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
+  pthread_cond_init(&evq->cond.cond, &attr);
+  evq->cond.is_monotonic = true;

   evq->cum_rexmit_bytes = 0;
   return evq;
diff --git a/src/ddsrt/include/dds/ddsrt/sync/posix.h b/src/ddsrt/include/dds/ddsrt/sync/posix.h
index c13d8e7d..033b4b78 100644
--- a/src/ddsrt/include/dds/ddsrt/sync/posix.h
+++ b/src/ddsrt/include/dds/ddsrt/sync/posix.h
@@ -24,6 +24,7 @@ extern "C" {

 typedef struct {
     pthread_cond_t cond;
+    bool is_monotonic;
 } ddsrt_cond_t;

 typedef struct {
diff --git a/src/ddsrt/src/sync/posix/sync.c b/src/ddsrt/src/sync/posix/sync.c
index 89dce5d9..3ffbebe8 100644
--- a/src/ddsrt/src/sync/posix/sync.c
+++ b/src/ddsrt/src/sync/posix/sync.c
@@ -131,8 +131,16 @@ ddsrt_cond_waitfor(
   assert(cond != NULL);
   assert(mutex != NULL);

+  int64_t base_time;
+  if (cond->is_monotonic) {
+      ddsrt_mtime_t time =  ddsrt_time_monotonic();
+      base_time = time.v;
+  } else {
+      base_time = dds_time();
+  }
+
   return ddsrt_cond_waituntil(
-    cond, mutex, ddsrt_time_add_duration(dds_time(), reltime));
+    cond, mutex, ddsrt_time_add_duration(base_time, reltime));
 }

 void
Gummum commented 1 month ago

I may also find this problem, when I change the time back three minutes, I will lose the three minutes of data, thank you for fixing.