lsds / sgx-lkl

SGX-LKL Library OS for running Linux applications inside of Intel SGX enclaves
MIT License
254 stars 88 forks source link

Clock resolutions #576

Open wintersteiger opened 4 years ago

wintersteiger commented 4 years ago

The launcher currently gets an array of 8 clock resolution settings from the host and passes it to the enclave (here). If the enclave execution depends on those settings, they need to be attested, or be declared "shared memory" with an attested option to disable that feature.

@SeanTAllen: Do we still need all 8 of those settings or can we make do with one of them and compute the others in the enclave? Is there a way to get rid of all of them?

CC @prp @letmaik

SeanTAllen commented 4 years ago

@wintersteiger I don't know. I didn't touch anything related to how those clocks are provided. Only the source of nanos. Someone (could be be) would need to investigate how those resolutions are used in lkl.

wintersteiger commented 4 years ago

I see, no worries. I think they are essentially just reported to callers of clock_getres, but the clocks should of course provide the resolution that's claimed.

@prp I think we definitely don't want to claim something we can't provide. What should be the strategy for this - turn it into shared memory with no guarantees, remove it completely, or maybe just support CLOCK_MONOTONIC?

SeanTAllen commented 4 years ago

CLOCK_REALTIME is driven by that underlying source of nanos, same as CLOCK_MONOTONIC. So we definitely support both of those. @davidchisnall might know more about the others. I only looked at REALTIME and MONOTONIC as part of the work I did (writing tests to make sure that each worked as expected).

wintersteiger commented 4 years ago

Great, that's good to know! Is that source of nanos configurable in some way or does it depend on ocalls or similar? Since they are nanos, I guess the resolution is by definition also nanos (that's what my host claims too), so we wouldn't need an attested config setting for this.

From where I look at this, only the coarse clocks report a different resolution (1 ms), but it feels like we should be able to provide those clocks based on the nano-clocks too. (I have no idea how much work that would be though.)

There's a to-do asking for a config setting for the timer task (here). Is that indeed needed and could that config setting replace the current clock_res settings?

davidchisnall commented 4 years ago

All of the clocks are now maintained by LKL. The resolution is a bit fuzzy, it's somewhere between 1ns and 5ms (or worse if we're under attack). We probably should advertise it as 5ms. The old code pulling the clock resolutions from the host was used for the non-LKL clock implementation and is dead code now.

prp commented 4 years ago

Isn't the clock resolution of the host clock that drives the new timer a lower bound for the resolutions of the derived clocks inside of LKL?

wintersteiger commented 4 years ago

@davidchisnall that's great, that basically means we can remove that setting from the config and replace it with 5ms for all of them. The theoretical worst case would be higher I suppose, so the max for struct timespec would be another option (basically telling applications not to rely on it). Since the actual resolution is ultimately controlled by the host, do we need an attested setting that lets us disable the clock?

SeanTAllen commented 4 years ago

the clock inside the enclave is "driven" mostly by the external clock, updating every 5 ms. any call to get the nanos in between those updates will result in 1 nano more than the last call being returned so that monotonic time is maintained.

prp commented 4 years ago

As a defensive measure, we should allow the clock resolution for the enclave to be set to a value that is larger than 5ms. Also, on the host side, the 5ms should be configurable. All of this is probably a low priority issue, so for now, we could use a single hard-coded clock resolution setting.

What is not clear to me is how the Linux kernel calculates the clock resolution. Currently we return the clockres by intercepting the syscall, which we need to change.

wintersteiger commented 4 years ago

That's a good question @prp, I'll try and see if I can find out how the kernel does that.

SeanTAllen commented 4 years ago

Re the 5ms, there's an issue open to making that configurable:

https://github.com/lsds/sgx-lkl/issues/185

And it isn't currently 5ms its 500 micros as anything more and the LTP tests for time start to fail.

prp commented 4 years ago

@wintersteiger I found this:

Unfortunately, clock_getres() POSIX function does not report timers granularity in Linux. It can return only two predefined results: either 1/HZ for low resolution clocks (where HZ is value of CONFIG_HZ macro used when configuring linux kernel, typical values are 100 300 1000) or 1 ns for high resolution clocks (hrtimers).

I don't know if the above statement is correct but, if so, we can just remove the SYS_clock_getres bypass and have the syscall be handled by LKL without further set-up from the host.

wintersteiger commented 4 years ago

Yeah, I found those macros in the kernel sources too, it looks to be completely static.

davidchisnall commented 4 years ago

I doubt very much code depends on this (I have never seen a call to this function for anything other than debugging) but we should make sure that we document it as a potential source of incompatibility.

prp commented 4 years ago

@davidchisnall if we have LKL handle the clock_getres syscall, we shouldn't have a compatibility issue. The only problem is that, in an adversarial scenario, an attacker can make the clock precision arbitrarily low, but I don't see an obvious way how we could encode this in the value returned by clock_getres().

wintersteiger commented 4 years ago

We could set it to the maximum value (seconds = UINT_MAX?). I can image that the host being in control of the ticker-threads is exploitable, but since I've never seen a use of clock_getres either, I can't tell whether it could be exploitable in a meaningful way.

prp commented 4 years ago

This may actually break applications. I don't think that applications are generally designed to handle this in a meaningful way.

I suggest to remove the sys_clock_res bypass, remove any of the host/enclave clock resolution code and just document this as a potential security issue.