Open glommer opened 6 years ago
We could avoid penalizing by switching to thread CPU time for accounting; however, reading it (IIRC) involves a system call and there is no vdso implementation (it should be easy to add one though, at least technically).
Ok, maybe not easy. But possible.
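For reference, this is roughly what accounting with the thread CPU clock looks like from userspace today. It is a minimal sketch: clock_gettime and CLOCK_THREAD_CPUTIME_ID are standard POSIX, while the helper names (read_clock_ns, spin, measure_thread_cpu_ns) are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Read a POSIX clock and return its value in nanoseconds (0 on failure). */
static uint64_t read_clock_ns(clockid_t id)
{
    struct timespec ts;
    if (clock_gettime(id, &ts) != 0)
        return 0;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Burn some CPU so the thread clock visibly advances. */
static uint64_t spin(uint64_t iterations)
{
    volatile uint64_t acc = 0;
    for (uint64_t i = 0; i < iterations; i++)
        acc += i;
    return acc;
}

/* CPU time charged to the calling thread while it runs spin().
 * Time spent preempted or sleeping is not counted by this clock. */
static uint64_t measure_thread_cpu_ns(uint64_t iterations)
{
    uint64_t start = read_clock_ns(CLOCK_THREAD_CPUTIME_ID);
    spin(iterations);
    return read_clock_ns(CLOCK_THREAD_CPUTIME_ID) - start;
}
```

Each read_clock_ns call on this clock is what would have to become cheap for the approach to be usable in a hot path.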
I've been taking a look at what it would take to make that available through the vdso.
It seems a bit challenging to me:
When we call into the vdso we are guaranteed to be the current task. CLOCK_THREAD_CPUTIME_ID boils down, after a bunch of indirection, to task_sched_runtime, which, for the case of the current task, does:
if (task_current(rq, p) && task_on_rq_queued(p)) {
    prefetch_curr_exec_start(p);
    update_rq_clock(rq);
    p->sched_class->update_curr(rq);
}
We cannot skip the update entirely, because by doing that we could be forgoing the whole time this process has been on the CPU (which can be many milliseconds).
We can try to expose the last reading and the current stamp of rq->clock in the mapped page and calculate the current delta in userspace. Except the vdso struct is global: there is nowhere in it that I think we can add a per-process identifier of any kind. Per-CPU doesn't look like it would work either; otherwise we'd be exposing run time information about unrelated processes, which looks like a security breach.
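Setting aside where the data would live, the userspace side of that delta calculation could look something like the sketch below. Everything in it is hypothetical: no such kernel ABI exists, the struct task_time_page layout and the function names are invented, and the caller is assumed to have a current timestamp on the same clock as the exported stamp. The sequence counter is borrowed from how vdso data is usually read consistently.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout of the exported data; not a real kernel ABI. */
struct task_time_page {
    _Atomic uint32_t seq;           /* odd while the writer is mid-update */
    _Atomic uint64_t sum_exec_ns;   /* runtime accumulated at the last update */
    _Atomic uint64_t exec_start_ns; /* clock stamp taken at the last update */
};

/* Writer side, standing in for the kernel: publish a new snapshot. */
static void task_time_publish(struct task_time_page *p,
                              uint64_t sum_ns, uint64_t start_ns)
{
    atomic_fetch_add_explicit(&p->seq, 1, memory_order_relaxed); /* now odd */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&p->sum_exec_ns, sum_ns, memory_order_relaxed);
    atomic_store_explicit(&p->exec_start_ns, start_ns, memory_order_relaxed);
    atomic_fetch_add_explicit(&p->seq, 1, memory_order_release); /* even again */
}

/* Reader side: retry until a consistent snapshot is seen, then add the
 * time elapsed since the last update. now_ns is assumed to be on the same
 * clock as exec_start_ns and not earlier than it. */
static uint64_t task_runtime_ns(struct task_time_page *p, uint64_t now_ns)
{
    uint32_t seq;
    uint64_t sum, start;

    do {
        seq = atomic_load_explicit(&p->seq, memory_order_acquire);
        sum = atomic_load_explicit(&p->sum_exec_ns, memory_order_relaxed);
        start = atomic_load_explicit(&p->exec_start_ns, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
    } while ((seq & 1u) ||
             seq != atomic_load_explicit(&p->seq, memory_order_relaxed));

    return sum + (now_ns - start);
}
```

For example, with 5000 ns accumulated at stamp 100, a read at stamp 350 yields 5000 + (350 - 100) = 5250 ns. This only holds while the task is actually running; a real design would also need to handle the task being descheduled between updates.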
If task timing info were in its own page, then the user could mmap it and use a thread-local pointer to reference it.
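A rough sketch of the userspace half of that idea follows. Since no kernel interface for a per-task timing page exists, an anonymous mapping stands in for it here, and struct task_timing and task_timing_get are invented names. The only point is the shape: map once per thread, cache the pointer in thread-local storage, and read through it without a system call afterwards.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical per-task record; stands in for whatever the kernel would publish. */
struct task_timing {
    uint64_t sum_exec_ns;
};

/* One cached pointer per thread, so each thread only ever sees its own page. */
static _Thread_local struct task_timing *my_timing;

/* Map this thread's timing page on first use and cache the pointer.
 * Assumption: an anonymous page is used in place of a kernel-provided mapping. */
static struct task_timing *task_timing_get(void)
{
    if (my_timing == NULL) {
        long page = sysconf(_SC_PAGESIZE);
        if (page <= 0)
            return NULL;
        void *m = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (m == MAP_FAILED)
            return NULL;
        my_timing = m;
    }
    return my_timing;
}
```

After the first call, task_timing_get() is just a thread-local load, which is the property that would make this cheaper than a clock_gettime system call.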
Scylla doesn't like being preempted because of our event-loop-like architecture. We can detect whether or not something has run in our place and: