Closed cyphar closed 1 week ago
@tonistiigi Based on my testing, this fixes the issue you described and also includes a few other fixes that should address other possible performance regressions (all mainly dealing with non-existent files).
I can cook up a few tests that call `needsScan` and `scanPath` directly if those are the kind of tests you'd like to add? (It's not really possible to test this issue with `Checksum` directly.)
`pathSet` is the simplest way of implementing the structure needed for the prefix checks. If you feel a more complex structure is needed, let me know -- but because we do path lookups from left-to-right now, in practice the very early ancestors will be at the head of the array of prefixes, so I don't think it needs anything more complicated than a simple array.
I guess another radix tree would work well for this, but assuming the `prefixes` array is always expected to be small, the potentially inefficient lookup in `includes` shouldn't matter for practical cases.
Ah yeah, `go-immutable-radix` has `LongestPrefix` which would work for this. But yeah, I think in practice it won't matter (and I'm not sure that it would be better for most cases anyway, because this use is not write-few-read-many, it's write-many-read-few, so the copies in each `Insert` are probably not worth it in practice).
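For reference, the flat-array approach discussed above could look roughly like the following. This is only a minimal sketch and not the code in this PR: the `insert` method, the field layout, and the assumption that all paths are cleaned absolute paths are illustrative guesses; only the names `pathSet`, `prefixes`, and `includes` come from the discussion.

```go
package main

import (
	"fmt"
	"strings"
)

// pathSet is a hypothetical sketch of a prefix set backed by a plain slice.
// Paths are assumed to be cleaned, absolute, and "/"-separated.
type pathSet struct {
	prefixes []string
}

// insert records p as a prefix unless an existing prefix already covers it.
func (s *pathSet) insert(p string) {
	if s.includes(p) {
		return
	}
	s.prefixes = append(s.prefixes, p)
}

// includes reports whether p equals a stored prefix or lies beneath one.
// Matching is done on whole path components, so "/usr" does not cover
// "/usrlocal". The scan is linear, which is fine while the slice is small.
func (s *pathSet) includes(p string) bool {
	for _, prefix := range s.prefixes {
		if p == prefix || prefix == "/" || strings.HasPrefix(p, prefix+"/") {
			return true
		}
	}
	return false
}

func main() {
	var s pathSet
	s.insert("/usr")
	s.insert("/usr/lib") // already covered by "/usr", so not stored again

	fmt.Println(s.includes("/usr/lib/libc.so")) // true
	fmt.Println(s.includes("/usrlocal"))        // false
	fmt.Println(len(s.prefixes))                // 1
}
```

Because ancestors are inserted first when walking left-to-right, a match is usually found near the head of the slice, which is the argument for not needing a tree here.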
That sgtm, but we could also add some private counters that the tests can turn on, so we can see how many times the scanning/walking happens under certain conditions and detect if some future change causes the more expensive scanning part to happen more often than expected.
Already working on the `needsScan` tests (hence the draft). I'll add some counters as well while I'm at it, though I suspect that testing `needsScan` should be sufficient.
Commit f724d6fb0504 ("contenthash: implement proper Linux symlink semantics for needsScan") fixed issues with needsScan's handling of symlinks, but the logic used to figure out if a parent path is in the cache was incorrect in a couple of ways:
Fixes: f724d6fb0504 ("contenthash: implement proper Linux symlink semantics for needsScan")
Fixes #5042
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>