longsleep / linux-pine64

Pine64 Linux Kernel
Other
111 stars 104 forks source link

clock_gettime CLOCK_MONOTONIC is not as monotonic increasing as Rust expects it to be #44

Closed marcust closed 7 years ago

marcust commented 7 years ago

So, I'm not totally sure where to put this, but I think it might be a Pine64/Kernel related issue.

I have a problem with random crashes of the rust compiler (1.13) on my Pine64:

error: internal compiler error: unexpected panic

note: the compiler unexpectedly panicked. this is a bug.

note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports

thread 'rustc' panicked at 'other was less than the current instant', /buildslave/rust-buildbot/slave/beta-dist-rustc-cross-host-linux/build/src/libstd/sys/unix/time.rs:276
note: Run with `RUST_BACKTRACE=1` for a backtrace.

As the message "other was less than the current instant" struck me as quite odd I had a look at the source code there and apparently they make sure that two subsequent calls to clock_gettime(CLOCK_MONOTONIC... always have a positive delta (https://github.com/rust-lang/rust/blame/master/src/libstd/sys/unix/time.rs#L276, commit is https://github.com/rust-lang/rust/commit/c6eb8527e09bed702f30ffdc8d6e54acf9b867ca)

According to the Linux docs that should be the case:

CLOCK_MONOTONIC
Clock that cannot be set and represents monotonic time since some unspecified starting point.

So this made me take out my rusty C knowledge to test that out:

#include <stddef.h>
#include <stdio.h>
#include <time.h>
#include <sys/sysinfo.h>

struct timespec diff(struct timespec start, struct timespec end)
{
  struct timespec temp;
  if ((end.tv_nsec-start.tv_nsec)<0) {
    temp.tv_sec = end.tv_sec-start.tv_sec-1;
    temp.tv_nsec = 1000000000+end.tv_nsec-start.tv_nsec;
  } else {
    temp.tv_sec = end.tv_sec-start.tv_sec;
    temp.tv_nsec = end.tv_nsec-start.tv_nsec;
  }
  return temp;
}

int main(void){
  int i = 0;
  struct timespec monotime;
  struct timespec oldtime;
  struct timespec delta; 

  for (i = 0; i <= 1000000; i++) {
    clock_gettime(CLOCK_MONOTONIC, &monotime);

    if (oldtime.tv_sec > 0) {
      delta = diff(oldtime, monotime);
      printf("\rRounds: %i", i);
      if (delta.tv_sec < 0 || delta.tv_nsec < 0) {
        printf("\n");
        printf("Now (%lld.%.9ld) is before old (%lld.%.9ld)\n", (long long)monotime.tv_sec, monotime.tv_nsec, (long long)oldtime.tv_sec, oldtime.tv_nsec);
        return 1;
      }

    }

    oldtime = monotime;

  }

  printf("\n");
  return 0;
}

Having some load I can reproduce that:

Rounds: 211246
Now (86066.040429675) is before old (86066.040514135)
 18:03:05 up 23:54,  3 users,  load average: 2.04, 2.07, 2.35
Rounds: 16864
Now (86071.079399025) is before old (86071.079483486)
 18:03:10 up 23:54,  3 users,  load average: 2.20, 2.10, 2.36
Rounds: 12103
Now (86072.960532639) is before old (86072.960617099)
 18:03:12 up 23:54,  3 users,  load average: 2.34, 2.14, 2.37
Rounds: 10335
Now (86075.120457845) is before old (86075.120542306)
 18:03:14 up 23:54,  3 users,  load average: 2.34, 2.14, 2.37
Rounds: 72514
Now (86091.604185538) is before old (86091.604269999)
 18:03:31 up 23:54,  3 users,  load average: 2.85, 2.26, 2.41
Rounds: 41445
Now (86103.709183057) is before old (86103.709267517)
 18:03:43 up 23:55,  3 users,  load average: 3.41, 2.41, 2.46
Rounds: 1000000
 18:10:41 up 1 day, 2 min,  3 users,  load average: 0.17, 0.87, 1.73
Rounds: 1000000
 18:15:03 up 1 day, 6 min,  2 users,  load average: 0.09, 0.42, 1.33

So my question is: is this a bug? Should that always be monotonic increasing? I have OpenNTPd running, but as this clock should not be settable this should not be an issue (plus NTP log messages in syslog do not correlate with my tests)

I'm running

Linux campine 3.10.104-1-pine64-longsleep #103 SMP PREEMPT Sun Oct 23 13:09:48 CEST 2016 aarch64 GNU/Linux
longsleep commented 7 years ago

I am not sure what exactly is going wrong here. But even if the Kernel is to blame i am not planning to do anything about it in the BSP Kernel. Please try to reproduce with mainline Kernel 4.9 or higher. If its an issue still, it should be fixed there.

zador-blood-stained commented 7 years ago

@longsleep I would guess it has something to do with A64 arch timer bug https://irclog.whitequark.org/linux-sunxi/2016-11-04#18097090 http://lists.denx.de/pipermail/u-boot/2016-November/271836.html

longsleep commented 7 years ago

Oh my, reopening this - i am open for patches or links to patches :)