intel-staging / libptpmgmt_iaclocklib

The Clock Manager is a library for monitoring network time synchronization on the local platform. Disclaimer: This project is under development. All source code and features on the main branch are for testing or evaluation purposes and are not production ready. We will upstream the code and archive this GitHub repo thereafter.

jclklib: Fix client terminate under high load condition #94

Closed. JunAnnLaiIntel closed this 5 days ago

JunAnnLaiIntel commented 2 weeks ago

Steps to generate a high load condition:

  1. Add summary_interval -7 and logSyncInterval -7 to the config file so that ptp4l reports more offset values every second.

  2. Remove sleep(idle_time) in sample/jclk_test.cpp to eliminate the sample application's wait time.
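To put a number on the load in step 1: ptp4l interprets both logSyncInterval and summary_interval as base-2 logarithms of an interval in seconds. The helper names below are hypothetical and exist only for this illustration; they are not part of jclklib or linuxptp. A minimal sketch of the arithmetic:

```cpp
#include <cmath>

// Hypothetical helper: convert a ptp4l-style log2 interval setting
// (e.g. logSyncInterval or summary_interval) into seconds.
double interval_seconds(int log2_interval) {
    // ldexp(1.0, n) computes 1.0 * 2^n exactly for these small exponents.
    return std::ldexp(1.0, log2_interval);
}

// Hypothetical helper: how many messages or summaries per second
// a given log2 interval setting corresponds to.
double events_per_second(int log2_interval) {
    return 1.0 / interval_seconds(log2_interval);
}
```

With a setting of -7 the interval is 2^-7 s, so events_per_second(-7) evaluates to 128. A setting of 0 corresponds to one event per second, and a setting of 2 to one event every four seconds.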

Tested with this patch: no client termination was observed under a high load condition over a long run.

rupran commented 2 weeks ago

While this fixes the problem with logSyncInterval settings <= 0 and without the sleep() calls in the sample application, a client using the Clock Management API might also want to sleep for longer than one second between the jcl_status_wait calls and would then immediately receive timeouts even though the proxy is still alive.

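As an example, you could use the following settings:

  • logSyncInterval: 2 in both ptp4l configurations (GM and slave)
  • start the sample application with -i 2 (2 seconds of idle time)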

This leads to an immediate timeout ([jclklib][430847.137] Terminating: lost connection to jclklib Proxy) after waking up from the sleep() call.

Is this the expected behaviour or are there plans to allow arbitrary intervals between jcl_status_wait API calls in a generic way?
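The concern can be illustrated with a small simulation. This thread does not show how the patch actually detects a lost proxy, so the sketch assumes one plausible rule: the client treats the proxy as lost when the newest notification is more than one second old at the moment it checks. All names are hypothetical and none of them belong to the jclklib API. Times are in milliseconds on a simulated clock, so nothing actually sleeps.

```cpp
#include <cstdint>

// Assumed fixed timeout. The one-second figure comes from the comment
// above; the real value and mechanism in jclklib are not shown here.
constexpr std::int64_t kTimeoutMs = 1000;

// Assumed liveness rule: the proxy counts as lost when the newest
// notification is older than kTimeoutMs at the time of the check.
bool timed_out(std::int64_t now_ms, std::int64_t last_notification_ms) {
    return now_ms - last_notification_ms > kTimeoutMs;
}

// Timestamp of the newest notification at now_ms, for a healthy proxy
// that sends one notification every period_ms starting at t = 0.
std::int64_t last_notification_at(std::int64_t now_ms, std::int64_t period_ms) {
    return (now_ms / period_ms) * period_ms;
}

// The client idles for idle_ms after t = 0, then performs the check.
bool client_sees_timeout(std::int64_t idle_ms, std::int64_t period_ms) {
    return timed_out(idle_ms, last_notification_at(idle_ms, period_ms));
}
```

Under this assumed rule, a logSyncInterval of 2 (one sync every 4000 ms) combined with 2000 ms of idle time makes client_sees_timeout(2000, 4000) return true even though the simulated proxy never stopped sending. Whether a long sleep alone is enough to trigger the timeout depends on the real implementation.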

yoongsiang2 commented 1 week ago

> While this fixes the problem with logSyncInterval settings <= 0 and without the sleep() calls in the sample application, a client using the Clock Management API might also want to sleep for longer than one second between the jcl_status_wait calls and would then immediately receive timeouts even though the proxy is still alive.
>
> As an example, you could use the following settings:
>
>   • logSyncInterval: 2 in both ptp4l configurations (GM and slave)
>   • start the sample application with -i 2 (2 seconds of idle time)
>
> This leads to an immediate timeout ([jclklib][430847.137] Terminating: lost connection to jclklib Proxy) after waking up from the sleep() call.
>
> Is this the expected behaviour or are there plans to allow arbitrary intervals between jcl_status_wait API calls in a generic way?

Thank you for your feedback. Your concern is valid. We overlooked the possibility that users might set the clock synchronization interval to a very low value. We will discuss this and come up with a better design.