wolfcw / libfaketime

libfaketime modifies the system time for a single application
https://github.com/wolfcw/libfaketime
GNU General Public License v2.0

Draft: Add support for tapered, precise, absolute faketime start time #456

Open aalekseyev opened 5 months ago

aalekseyev commented 5 months ago

FAKETIME_KEEP_BEFORE_NSEC_SINCE_EPOCH behaves similarly to FAKETIME_START_AFTER_SECONDS, with two main differences:

The reason we want this feature is the following use case.

We run a large test suite under faketime. That test suite has access to filesystem artifacts that were created prior to test start up. Among those artifacts are some caches which are considered up to date iff the timestamps of the files match what's recorded in a data structure.

This means that for those caches to be considered valid, we need their timestamps not to be rewritten.

The reason we can't use FAKETIME_START_AFTER_SECONDS directly is that the test suite consists of multiple processes. For those processes to interact correctly, they need a consistent timestamp mapping that is shared between them. In fact, even the simplest bash script already behaves incorrectly, because the commands use different process start times.

touch old
FAKETIME=+100d FAKETIME_START_AFTER_SECONDS=0 bash -c 'touch new; stat old new'

The expected behavior is that the timestamp of old is not rewritten, while the timestamp of new is rewritten.

With this change, that is achievable:

touch old
now_ns=$(date +%s.%N | sed -r 's_\.__')
FAKETIME=+100d FAKETIME_KEEP_BEFORE_NSEC_SINCE_EPOCH="$now_ns" bash -c 'touch new; stat old new'
aalekseyev commented 5 months ago

Sorry if this isn't exactly idiomatic or clean (in particular long long nanoseconds may be considered a code smell or non-portable?), but I thought I'd start somewhere. Basically, we have a use case that I expected to be a very common use case, where FAKETIME_START_AFTER_SECONDS almost works well, but not quite.

I'd like to have some way of "starting" rewrites at a given time for a whole process tree, not an individual process, which seems to be a limitation in FAKETIME_START_AFTER_SECONDS.

aalekseyev commented 5 months ago

I keep finding more problems that the approach in this PR does not solve. So far I have fixed the way the utimes family works, but I've run into a fundamental limitation caused by the step transition of time. I'm now looking into implementing a "tapered" transition instead of a step transition. I'll close the PR for now, but I'll keep you posted.

wolfcw commented 5 months ago

Yes, FAKETIME_START_AFTER_SECONDS works per process. You might consider turning an absolute timestamp into a relative one during initialisation, basically resulting in a smaller value for each consecutively spawned child process.
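
The conversion suggested here could look roughly like the following sketch. It is not libfaketime code: the function name start_after_from_absolute and both parameters are hypothetical, and it assumes that the absolute start point and the process start time are both available as seconds since the epoch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of libfaketime: turn an absolute start
 * point (seconds since the epoch) into a per-process delay of the kind
 * that FAKETIME_START_AFTER_SECONDS expresses. */
int64_t start_after_from_absolute(int64_t absolute_start_sec,
                                  int64_t process_start_sec)
{
    assert(absolute_start_sec >= 0 && process_start_sec >= 0);
    int64_t remaining = absolute_start_sec - process_start_sec;
    /* A process spawned after the start point should begin faking at once. */
    return remaining > 0 ? remaining : 0;
}
```

With an absolute start point of 1000, a process started at 990 would get a delay of 10 seconds, while a child spawned at 1005 would get 0, which matches the "smaller value for each consecutively spawned child process" described above.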

The other aspect is getting everything to nanosecond resolution. Certainly doable, but probably needs changes in several places.
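
As a rough illustration of what nanosecond resolution involves, the sketch below splits a 64-bit nanoseconds-since-epoch value into a struct timespec and compares two timestamps. The helper names are hypothetical, the value is assumed to be non-negative, and treating the cutoff as strict ("earlier than") is an assumption rather than something this thread specifies.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC INT64_C(1000000000)

/* Hypothetical helper: split nanoseconds since the epoch into a timespec. */
struct timespec timespec_from_ns(int64_t ns)
{
    assert(ns >= 0);
    struct timespec ts;
    ts.tv_sec = (time_t)(ns / NSEC_PER_SEC);
    ts.tv_nsec = (long)(ns % NSEC_PER_SEC);
    return ts;
}

/* Hypothetical helper: returns 1 if ts is strictly earlier than cutoff. */
int timespec_before(struct timespec ts, struct timespec cutoff)
{
    if (ts.tv_sec != cutoff.tv_sec) {
        return ts.tv_sec < cutoff.tv_sec;
    }
    return ts.tv_nsec < cutoff.tv_nsec;
}
```

A signed 64-bit nanosecond count covers roughly 292 years after the epoch, so int64_t leaves ample headroom for current timestamps.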

aalekseyev commented 5 months ago

@wolfcw, thanks for the quick response and for your suggestions, and sorry for a storm of increasingly ad-hoc patches.

I think tapering will be highly desirable for us after all, since without it programs are more likely to notice the weird time jump, even if it happens before process startup (they notice it if they run touch -d "7 days ago", for example).

Should I open a separate issue to discuss the best way to support a tapered start?

As a quick introduction to what tapering is, in case that's not already clear: the time mapping uses a gradual transition instead of a jump transition; see the code below. The point is that such a mapping is reversible (up to some loss of precision), so you need really pedantic tests to run into issues.

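// Map a real timestamp to a faked one, using a linear ramp between
// taper_begin and taper_end instead of a step at a single instant.
int fake(int offset, int taper_begin, int taper_end, int time) {
  if (time <= taper_begin) {
    return time;  // before the taper: unchanged
  }
  if (time >= taper_end) {
    return time + offset;  // after the taper: full offset applied
  }
  // interpolate between (taper_begin, taper_begin) and (taper_end, taper_end+offset)
  int t = time - taper_begin;
  int w = taper_end - taper_begin;
  int h = w + offset;
  return taper_begin + t * h / w;  // integer division rounds down
}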
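
The reversibility claim can be checked with a small sketch. The names taper_fake and taper_unfake are hypothetical and this is not the code from the PR branch. It assumes whole seconds stored in int64_t, taper_end > taper_begin, and a non-negative offset; with nanosecond values the product in the interpolation could overflow and would need wider arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Forward mapping, same shape as the function above, in 64-bit seconds. */
int64_t taper_fake(int64_t offset, int64_t begin, int64_t end, int64_t t)
{
    assert(end > begin && offset >= 0);
    if (t <= begin) return t;
    if (t >= end) return t + offset;
    int64_t w = end - begin;
    int64_t h = w + offset;
    return begin + (t - begin) * h / w;   /* rounds down */
}

/* Inverse mapping: recover the real time from a faked one. */
int64_t taper_unfake(int64_t offset, int64_t begin, int64_t end, int64_t faked)
{
    assert(end > begin && offset >= 0);
    if (faked <= begin) return faked;
    if (faked >= end + offset) return faked - offset;
    int64_t w = end - begin;
    int64_t h = w + offset;
    /* Round up to undo the round-down in taper_fake. */
    return begin + ((faked - begin) * w + h - 1) / h;
}
```

Under these assumptions the ramp stretches time (h >= w), so distinct real seconds map to distinct faked seconds and rounding up in the inverse recovers the original value. With a negative offset the ramp would compress time, several real timestamps could share one faked value, and the round trip would lose precision.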

In the branch of this PR I seem to have a working prototype, but there are probably many reasons you don't want to take that code as-is.