crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0
19.33k stars 1.61k forks source link

[RFC] Improve File.utime precision by using utimensat() on Unix #9345

Open kubo opened 4 years ago

kubo commented 4 years ago

On Unix, File.utime is implemented by utimes, whose time precision is microsecond. The least three digits of nanosecond part of atime and mtime are zeroed. utimensat, whose time precision is nanosecond, is usable instead of utimes when the special value AT_FDCWD is passed in the fd parameter.

utimensat is supported on the following platforms:

On the other hand, crystal's supported platforms are:

I came up with four options:

  1. Use utimensat on all Unix platforms and drop support for Linux kernel < 2.6.22 and macOS < 10.13.
  2. Use utimensat when LibC.dlsym(LibC::RTLD_DEFAULT, "utimensat") returns non-null. Otherwise use utimes.
  3. Use utimensat on BSD platforms. Use utimes on Linux and macOS while Linux kernel < 2.6.22 and macOS < 10.13 are supported.
  4. Keep using utimes on all Unix paltforms while Linux kernel < 2.6.22 and macOS < 10.13 are supported.

Which is better?

kubo commented 4 years ago
  1. Use utimensat when LibC.dlsym(LibC::RTLD_DEFAULT, "utimensat") returns non-null. Otherwise use utimes.

As far as I checked it works on x86_64-linux-gnu by this patch. I tested it with the following code:

filename = "utime-test.txt"
system("touch #{filename}")

mtime = File.info(filename).modification_time
puts("before File.utime: #{mtime.to_s("%G-%m-%d %H:%M:%S.%9N")}")

File.utime(mtime, mtime + Time::Span.new(nanoseconds: 1), filename)

mtime = File.info(filename).modification_time
puts("after File.utime:  #{mtime.to_s("%G-%m-%d %H:%M:%S.%9N")}")

The timestamp of utime-test.txt was incremented by 1 nanosecond on ext4 file system.

However I'm not sure whether the usage of Proc (here and here) is legal.

RX14 commented 4 years ago

I'm happy with using utimens or utimensat unconditionally. I'd be happy to drop all 2.x kernel support personally, and the other OSes don't seem an issue.

straight-shoota commented 4 years ago

Maybe we should keep the current implementation as default and only use utimensat on platforms that have been tested to work. As far as we know, the current implementation works on all platforms Crystal currently supports. And it probably works on most POSIX platforms that will be supported in the future. When changing that, we should make sure it doesn't break. So it seems like a good idea do keep this as the default implementation and only use the improved one when it's sure that it works.

Having no nanosecond support in utime is not a huge issue for most use cases. But broken utime is an issue for all use cases.

straight-shoota commented 4 years ago

For clarity: I don't mean conditionally switching depending on the kernel version at runtime. It should only be based on the target triple. Since all relevant kernel versions for e.g. x86_64-linux-gnu supports it, I'm fine with dropping support for older kernels. But platforms that have not been tested, should keep the current implementation until it's confirmed that the improved one works.

RX14 commented 4 years ago

I checked, utimensat exists on all platforms we're going to support. It's fine to keep it simple and universally change over. I'd rather not complicate the code for little reasons.

straight-shoota commented 4 years ago

Awesome! Then #9392 should be the way to go.

kubo commented 4 years ago

utimensat is available on POSIX platforms as long as they conform to POSIX.1-2008. After reading https://github.com/crystal-lang/crystal/issues/9345#issuecomment-636505222 I thought that some old POSIX platforms might not conform to it, so I considered that it was reasonable to use utimensat if it was in LibC, otherwise utimes. (So I opened #9394.)

However if Crystal doesn't support such old platforms now and in the future, #9392 is simple as @RX14 commented. It was not tested on some tier-2 platforms though I had checked C header files of each platform and had confirmed that utimensat is not renamed to another symbol. (For example, opendir is renamed to __opendir30 on NetBSD.)