Closed jarkkojs closed 4 months ago
@jarkkojs CI fails with the following:
########## clang-tidy-15 ##########
72 warnings generated.
101 warnings generated.
101 warnings generated.
/root/project/src/nnn.c:1761:2: error: Assigned value is garbage or undefined [clang-analyzer-core.uninitialized.Assign,-warnings-as-errors]
size_t startpos = (size_t)marked[i].startpos;
^
/root/project/src/nnn.c:1788:20: note: Uninitialized value stored to field 'startpos'
selmark *marked = malloc(2 * nselected * sizeof(selmark));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/root/project/src/nnn.c:1790:6: note: Assuming 'marked' is non-null
if (!marked) {
^~~~~~~
/root/project/src/nnn.c:1790:2: note: Taking false branch
if (!marked) {
^
/root/project/src/nnn.c:1796:14: note: Assuming 'i' is >= 'ndents'
for (i = 0; i < ndents; ++i) {
^~~~~~~~~~
/root/project/src/nnn.c:1796:2: note: Loop condition is false. Execution continues on line 1844
for (i = 0; i < ndents; ++i) {
^
/root/project/src/nnn.c:1844:2: note: Calling 'sort_array'
sort_array(marked, &marked[nselected], nselected, sizeof(selmark),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/root/project/src/nnn.c:1736:2: note: Loop condition is true. Entering loop body
for (b = 0; b < 8; b++) {
^
/root/project/src/nnn.c:1740:15: note: Assuming 'i' is < 'nelem'
for (i = 0; i < nelem; i++)
^~~~~~~~~
/root/project/src/nnn.c:1740:3: note: Loop condition is true. Entering loop body
for (i = 0; i < nelem; i++)
^
/root/project/src/nnn.c:1741:8: note: Calling 'selmark_byte'
cnt[byte(data, i, b)]++;
^~~~~~~~~~~~~~~~
/root/project/src/nnn.c:1761:2: note: Assigned value is garbage or undefined
size_t startpos = (size_t)marked[i].startpos;
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~
/root/project/src/nnn.c:1788:27: error: performing an implicit widening conversion to type 'unsigned long' of a multiplication performed in type 'int' [bugprone-implicit-widening-of-multiplication-result,-warnings-as-errors]
selmark *marked = malloc(2 * nselected * sizeof(selmark));
^
/root/project/src/nnn.c:1788:27: note: make conversion explicit to silence this warning
selmark *marked = malloc(2 * nselected * sizeof(selmark));
^~~~~~~~~~~~~
(unsigned long)( )
/root/project/src/nnn.c:1788:27: note: perform multiplication in a wider type
selmark *marked = malloc(2 * nselected * sizeof(selmark));
^
(long)
Suppressed 102 warnings (99 in non-user code, 3 NOLINT).
Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
2 warnings treated as errors
Exited with code exit status 1
@jarkkojs CI fails with the following:
########## clang-tidy-15 ########## 72 warnings generated. 101 warnings generated. 101 warnings generated. /root/project/src/nnn.c:1761:2: error: Assigned value is garbage or undefined [clang-analyzer-core.uninitialized.Assign,-warnings-as-errors] size_t startpos = (size_t)marked[i].startpos; ^ /root/project/src/nnn.c:1788:20: note: Uninitialized value stored to field 'startpos' selmark *marked = malloc(2 * nselected * sizeof(selmark)); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /root/project/src/nnn.c:1790:6: note: Assuming 'marked' is non-null if (!marked) { ^~~~~~~ /root/project/src/nnn.c:1790:2: note: Taking false branch if (!marked) { ^ /root/project/src/nnn.c:1796:14: note: Assuming 'i' is >= 'ndents' for (i = 0; i < ndents; ++i) { ^~~~~~~~~~ /root/project/src/nnn.c:1796:2: note: Loop condition is false. Execution continues on line 1844 for (i = 0; i < ndents; ++i) { ^ /root/project/src/nnn.c:1844:2: note: Calling 'sort_array' sort_array(marked, &marked[nselected], nselected, sizeof(selmark), ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /root/project/src/nnn.c:1736:2: note: Loop condition is true. Entering loop body for (b = 0; b < 8; b++) { ^ /root/project/src/nnn.c:1740:15: note: Assuming 'i' is < 'nelem' for (i = 0; i < nelem; i++) ^~~~~~~~~ /root/project/src/nnn.c:1740:3: note: Loop condition is true. Entering loop body for (i = 0; i < nelem; i++) ^ /root/project/src/nnn.c:1741:8: note: Calling 'selmark_byte' cnt[byte(data, i, b)]++; ^~~~~~~~~~~~~~~~ /root/project/src/nnn.c:1761:2: note: Assigned value is garbage or undefined size_t startpos = (size_t)marked[i].startpos; ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~ /root/project/src/nnn.c:1788:27: error: performing an implicit widening conversion to type 'unsigned long' of a multiplication performed in type 'int' [bugprone-implicit-widening-of-multiplication-result,-warnings-as-errors] selmark *marked = malloc(2 * nselected * sizeof(selmark)); ^ /root/project/src/nnn.c:1788:27: note: make conversion explicit to silence this warning selmark *marked = malloc(2 * nselected * sizeof(selmark)); ^~~~~~~~~~~~~ (unsigned long)( ) /root/project/src/nnn.c:1788:27: note: perform multiplication in a wider type selmark *marked = malloc(2 * nselected * sizeof(selmark)); ^ (long) Suppressed 102 warnings (99 in non-user code, 3 NOLINT). Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well. 2 warnings treated as errors Exited with code exit status 1
I'll check what is going on, will return soon'ish.
~/work/nnn sort-marked
❯ clang-tidy src/nnn.c
Error while trying to load a compilation database:
Could not auto-detect compilation database for file "src/nnn.c"
No compilation database found in /home/jarkko/work/nnn/src or any parent directory
fixed-compilation-database: Error while opening fixed database: No such file or directory
json-compilation-database: Error while opening JSON database: No such file or directory
Running without flags.
96 warnings generated.
Suppressed 99 warnings (96 in non-user code, 3 NOLINT).
Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
I also wrote a better commit message:
O(n) sorting for selection markers: sort_selection()
QuickSort produces O(nlog n) at best and O(n^2) at worst. Sort selection
markers with a trivial radix sort algorithm, which guarantees O(n) time.
As a consequence both time and space consumption will be linear.
Do not generalize the algorithm, as other sites most likely require a
comparison based sorting algorithm. For those sites the goal should be
to eliminate the worst case O(n^2) and have a steady O(nlog n)
performance.
E.g. heap sort is much better option for comparison cases... These set actual guarantees for order of the function (in both time and space dimensions). And further, they are somewhat de-facto choices for e.g. embedded system with low specs, which I guess fit tho this project quite well.
QuickSort produces O(nlog n) at best and O(n^2) at worst.
We use the libc qsort
function. Which, despite the confusing name, isn't required to be actually quicksort. In fact most libcs don't use quicksort for qsort
, glibc uses mergesort for example. (See: https://nullprogram.com/blog/2016/09/05/)
I'll look into the patch. Though I'm skeptical of whether the performance improvement (if any) is actually worthwhile or not. So far I've never run into a situation where sorting was the bottleneck.
@KlzXS do you really see any merit in this change? I would prefer to stick to the library function.
There is some merit to it, yes. How much exactly is debatable.
On a modern system? Probably little to none. Unless you get nselected
up to above 100k or a million I doubt you'd be able to perceive the difference as an actual delay in sorting (I personally consider 20ms or 50ms as the lower limit). Even more so considering the majority of the inversion process is actually spent searching for things in the selection buffer. Optimizing calls to findinsel()
would yield more performance, but I don't see a clear way to do that without changing the way the selection buffer works.
On something like a router or another embedded low power device? It would be better due to the raw difference in processing power. A 100MHz CPU would be happier with radix sort. Though the searching overhead is still there.
@jarkkojs can you please share some performance numbers (along with the processor details, if possible on a regular smartphone)?
If you want to reduce the searching overhead, we could put all the filenames in a hash table and then in one pass through pselbuf
check the hash table each time we encounter the current path. That would also remove the need for sorting altogether since we'd be adding entries to marked
in the order they appear in pselbuf
.
Again, I'm not sure how impactful that would be or if we should even bother improving this function at all other than for the sake of bragging rights for it being "optimal". I'll leave that for you to consider.
OK, I'll close this.
@KlzXS can you please review this?