Feh / nocache

minimize caching effects
BSD 2-Clause "Simplified" License
554 stars 53 forks source link

Off-by-one issue with NOCACHE_MAX_FDS #42

Closed shadjiiski closed 4 years ago

shadjiiski commented 4 years ago

As discussed here, NOCACHE_MAX_FDS is treated as an artificial cap for RLIMIT_NOFILE. Therefore it should follow the same convention for its value - it should be one greater than the maximum file descriptor we are interested in. Currently, this is not the case, the value of NOCACHE_MAX_FDS is now 2 greater than the maximum file descriptor that will be handled by nocache.

This can easily be proven by adding this simple test to maxfd.t:

t "env NOCACHE_MAX_FDS=4 LD_PRELOAD=../nocache.so cat testfile.$$ >/dev/null && ! ../cachestats -q testfile.$$" "file is not in cache because it has an FD < 4"

Since FDs 0, 1 and 2 are stdin, stdout and stderr, the testfile should have FD 3. Therefore, setting NOCACHE_MAX_FDS to 4 should make sure the test passes. It does not. Setting it to 5 grants a passing test. Reverting this line also resolves the problem.

pavlinux commented 4 years ago

Are you sure that setlimit works?

$ env NOCACHE_MAX_FDS=1 nocache ./a.out; soft limit= 16384 hard limit= 16384 soft limit= 100 hard limit= 1024

getrlim.txt

shadjiiski commented 4 years ago

@pavlinux as far as I understand you were left with the impression that the resource limits for the child process would be amended by setting NOCACHE_MAX_FDS.My understanding is that NOCACHE_MAX_FDS only controls the number of file descriptors that nocache would posix_fadvise on (the lesser between that env var and the current RLIMIT_NOFILE hard limit will be used). It would not change resource limits neither for itself nor for the child process. (see here)

Feh commented 4 years ago

Stanislav, thanks for digging into this. I think in the default settings it doesn’t matter if the value is 2^20 or 2^20-1, but after re-reading man getrlimit I agree with you that it would be better if the max fd number follows the same semantics as the RLIMIT_NOFILE limit (i.e. not count FDs, but instead denote an exclusive upper bound on the FD number).

However, I belive your pull request introduces the possiblity of an out-of-bounds array access: fds is initialized with max_fds elements, which means it can store a FD with number 0..max_fds-1. Your pull request changes the semantics bound, but doesn’t grow the memory allocation by 1 (here). Can you amend the PR if you agree?

shadjiiski commented 4 years ago

Julius, I am not entirely sure about this. The man page reads that the RLIMIT_NOFILE is a value one greater than the maximum file descriptor number that can be opened by the process. Since I believe file descriptors are zero-based, this would mean that it matches the number of supported file descriptors. With this in mind, the current referenced allocation

fds_lock = malloc(max_fds * sizeof(*fds_lock));

makes sense to me - it would support a number of locks matching the number of file descriptors. Furthermore, I believe the fds_lock is always accessed in a fashion similar to this:

for(i = 0; i < max_fds; i++) 

so I am not sure I see where the out of bounds access may happen. However, I am not that used to C so I may be missing a key point, please follow up.

Feh commented 4 years ago

Sorry for the delay. I’ve taken another look. Recapping, we have the array

    fds_lock = malloc(max_fds * sizeof(*fds_lock));

and the companion

    fds = malloc(max_fds * sizeof(*fds));

For both these arrays, we must maintain that the array index is between 0 and max_fds - 1, inclusive. In the case of a loop like for(i=0; i<max_fds; i++) we’re good as you mention, so what’s left is inspect all other call sites where we don’t iterate.

  1. Here onwards in store_pageinfo().
  2. Here onwards in free_unclaimed_pages().

In both cases, the functions have an if(fd >= ...) block at the top. Because we must have fd < max_fds, we should return when fd >= max_fds. The check in free_unclaimed_pages() here is already correct, and your patch is simply changing the check to be correct in the store_pageinfo() case.

Again, sorry about the delay, and thanks for contributing! Merging your change.

Feh commented 4 years ago

Merged #43