TritonDataCenter / smartos-live

For more information, please see http://smartos.org/. For any questions that aren't answered there, please join the SmartOS discussion list: https://smartos.topicbox.com/groups/smartos-discuss

lx: add support for timers to target specific process/thread cpu clocks #989

Closed basvdlei closed 3 years ago

basvdlei commented 3 years ago

When using CPU-time clocks on Linux, you can target the CLOCK_{PROCESS,THREAD}_CPUTIME_ID clock of a specific process or thread. In glibc this is done by calling clock_getcpuclockid or pthread_getcpuclockid, respectively. The pid or thread ID is encoded (bitwise-inverted) in the upper 29 bits of the clockid, which the Linux timer syscalls understand.

Currently the lx timer syscall implementations (e.g. clock_gettime and timer_create) return EINVAL for these clockids, because they compare all the bits of the clockid, which then matches none of the backends. They should use only the low 3 bits to determine the clock backend type and the remaining bits to find the corresponding process or thread.
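The split described above can be sketched in C as follows. The constants mirror the CPUCLOCK_* macros in the Linux kernel headers; the function and struct names are hypothetical, and this is an illustration of the bit layout rather than the lx brand code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low 3 bits: clock type in bits 0-1, per-thread flag in bit 2. */
#define CPUCLOCK_CLOCK_MASK     3
#define CPUCLOCK_PERTHREAD_MASK 4
#define CPUCLOCK_PROF           0
#define CPUCLOCK_VIRT           1
#define CPUCLOCK_SCHED          2

/* Hypothetical holder for the decoded fields. */
struct cpuclock {
    bool    per_thread; /* true: id is a thread ID, false: a pid */
    int32_t type;       /* CPUCLOCK_PROF, CPUCLOCK_VIRT or CPUCLOCK_SCHED */
    int32_t id;         /* target pid or thread ID (0 means the caller) */
};

/* Build a per-process clockid: inverted pid in the upper 29 bits. */
static int32_t make_process_cpuclock(int32_t pid, int32_t type)
{
    /* Shift as unsigned to avoid shifting a negative signed value. */
    return (int32_t)(((uint32_t)~pid << 3) | (uint32_t)type);
}

/* Build a per-thread clockid: same layout with the per-thread bit set. */
static int32_t make_thread_cpuclock(int32_t tid, int32_t type)
{
    return make_process_cpuclock(tid, type | CPUCLOCK_PERTHREAD_MASK);
}

/*
 * Split a clockid into backend type and target. Returns false for
 * non-negative ids, which are the fixed clocks such as CLOCK_REALTIME.
 */
static bool decode_cpuclock(int32_t clk, struct cpuclock *out)
{
    if (clk >= 0)
        return false;
    out->type = clk & CPUCLOCK_CLOCK_MASK;
    out->per_thread = (clk & CPUCLOCK_PERTHREAD_MASK) != 0;
    out->id = (~clk) >> 3; /* ~clk is non-negative here */
    return true;
}
```

A dispatcher built along these lines would first test for a negative clockid, pick the backend from the low 3 bits, and only then look up the process or thread. Note that on Linux a negative clockid with low bits equal to 3 denotes a file-descriptor-based clock, which this sketch does not distinguish.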

There is support for CLOCK_{THREAD,PROCESS}_CPUTIME_ID for the calling process/thread, so hopefully it should be doable to get them for another process/thread as well.

A notable real-world example is HAProxy, which uses this for its watchdog functionality. HAProxy < 2.2 fails to start, while later versions disable the watchdog.

A simple test case is the example program from the pthread_getcpuclockid man page. On lx it gives the following output:

Subthread starting infinite loop
Main thread sleeping
Main thread consuming some CPU time...
Process total CPU time:    3.025
clock_gettime: Invalid argument
Main thread CPU time:   

Expected output:

Main thread sleeping
Subthread starting infinite loop
Main thread consuming some CPU time...
Process total CPU time:    2.406
Main thread CPU time:      0.706
Subthread CPU time: 1       1.699
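
A smaller probe than the man page program can help narrow down which call fails. The following is a minimal sketch assuming glibc; probe_process_cpuclock is a hypothetical helper, not part of any library. It obtains a CPU-time clockid for a given pid via clock_getcpuclockid and reads it once with clock_gettime.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <sys/types.h>
#include <time.h>

/*
 * Hypothetical helper: returns 0 and fills *out on success, otherwise
 * the error number from whichever step failed.
 */
static int probe_process_cpuclock(pid_t pid, struct timespec *out)
{
    clockid_t cid;

    /* clock_getcpuclockid returns an error number; it does not set errno. */
    int err = clock_getcpuclockid(pid, &cid);
    if (err != 0)
        return err;

    /* clock_gettime returns -1 and sets errno on failure. */
    if (clock_gettime(cid, out) != 0)
        return errno;

    return 0;
}
```

Passing a pid of 0 targets the calling process. A nonzero return value on a platform that lacks support for these clockids would be consistent with the behaviour described in this issue, although the exact error number may depend on the glibc version.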

dtrace shows the EINVAL error returned for the thread specific clockid:

# dtrace -n '*::clock_gettime:entry{printf("%d", arg0)}' \
              -n '*::clock_gettime:return{printf("%d", errno)}'
CPU     ID                    FUNCTION:NAME
  3   6903              clock_gettime:entry 2
  3   6904             clock_gettime:return 0
  3   6903              clock_gettime:entry 4294871934
  3   6904             clock_gettime:return 22
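
The second clockid in the trace can be decoded by hand. The value 4294871934 is larger than a signed 32-bit integer can hold, so it appears to be the 32-bit clockid printed zero-extended. The sketch below reinterprets it as a signed 32-bit value and extracts the fields, assuming the Linux CPUCLOCK_* bit layout; the function names are hypothetical.

```c
#include <stdint.h>

/* Reinterpret the raw syscall argument as a signed 32-bit clockid. */
static int32_t clockid_from_arg(uint64_t arg0)
{
    return (int32_t)(uint32_t)arg0;
}

/* Low 3 bits: clock type (bits 0-1) plus the per-thread flag (bit 2). */
static int32_t clockid_low_bits(int32_t clk)
{
    return clk & 7;
}

/* Target pid or thread ID; only meaningful for negative clockids. */
static int32_t clockid_target(int32_t clk)
{
    return (~clk) >> 3;
}
```

For 4294871934 this gives a clockid of -95362, low bits of 6 (the per-thread flag 4 plus clock type 2, the scheduler-based CPU clock), and a target thread ID of 11920. That is a per-thread CPU-time clockid, which none of the lx backends currently match.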

In addition, the man page for pthread_getcpuclockid notes:

When thread refers to the calling thread, this function returns an identifier that refers to the same clock manipulated by clock_gettime(2) and clock_settime(2) when given the clock ID CLOCK_THREAD_CPUTIME_ID.

This is clearly not the case at the moment, but it should be kept in mind for the implementation.

Tested on platform: SmartOS (build: 20210422T002312Z) with lx dataset 0bf06d4d-b62f-4b3b-b560-3cd258df2070 (ubuntu-20.04)

danmcd commented 3 years ago

https://smartos.org/bugview/OS-5804

danmcd commented 3 years ago

I'm closing this issue because:

1.) It actually belongs on illumos-joyent, not smartos-live.

2.) Interested parties should follow OS-5804 (see link above).