Closed yakra closed 9 months ago
Looked at the effects of checking visible points only when doing some benchmarks for generalizing strdcmp, which got a mention at TravelMapping#600. Posting here to keep the spam out of the official more public repo.
At first I thought the branch misprediction penalty may outweigh the penalty of the extraneous datacheck, but now it's looking like that's probably not the case.
Existing strdcmp
, tailored for use with LABEL_SELFREF specifically:
int strdcmp(const char* a, const char* b, const char d)
{ do if (!*b) return *a == d ? 0 : *a-*b;
while (*a++ == *b++);
return *--a - *--b;
}
Generalized strdcmp
, for wider applicability:
int strdcmp(const char* a, const char* b, const char d)
{ do if (!*b || *b==d) return (!*a || *a==d) ? 0 : *a;
else if (!*a || *a==d) return -*b;
while (*a++ == *b++);
return *--a - *--b;
}
The benchmark goes between Sorting waypoints in Quadtree
and Finding unprocessed wpt files
:
size_t accum = 0;
cout << et.et() << "AccumBegin" << endl;
for (HighwaySystem& h : HighwaySystem::syslist)
for (Route& r : h.routes)
for (Waypoint& w : r.points)
if (!w.is_hidden) // comment/uncomment this line as appropriate for benchmarking
{ const char* l = w.label.data() + (w.label[0] == '*');
string rte = w.route->banner[0] == '-' ? w.route->route : w.route->name_no_abbrev();
// first check for number match after a slash, if there is one
const char* slash = strchr(l, '/');
if (slash++ && strncmp(l, "Old", 3))
{ const char* digits = rte.data();
while (!isdigit(*digits) && *digits) ++digits;
if ( digits != rte.data()+rte.size()
&& !strdcmp(slash, digits, '_') && !w.coloc_same_number(digits)
|| !strdcmp(slash, rte.data(), '_')
) ++accum;
}
}
cout << et.et() << "AccumEnd @ " << accum << endl;
return 0;//*/
(For use with older commits, &
-> *
, points
-> point_list
, and w.
-> w->
as needed.)
The flavors of siteupdate_{P,R}{A,V}{S,G}:
strdcmp
strdcmp
4c8a0c9
, most TMArray conversion done but not Regions or Waypoints.c7aa426
, TMArray conversion essentially done, just not squashed.The results: | spec | gen | |
---|---|---|---|
ptr all | .1387 | .1397 | |
ptr vis | .1410 | .1413 | |
ref all | .0876 | .0873 | |
ref vis | .0869 | .0875 |
strdcmp
more times, for all 182 hidden points with a slash. :P
Yes. Let's go ahead and generalize strdcmp
.A vs V:
When first benchmarking before TMArray-ifying Waypoints, this looked like a loss (branch misprediction penalty?), taking 1.6 - 2.3 ms longer.
Now? Within margin of error (+0.2 ms) or even faster (-0.7 ms).
Also worth noting is that the benchmark code above doesn't include the rest of the datacheck after the slash test, which includes a strncmp fort every point without a post-slash match. Add this in and limiting to visible points is a more clean win: |
spec | gen | |
---|---|---|---|
ref all | .0964 | .0962 | |
ref vis | .0926 | .0935 |
Besides... It's The Right Thing To Do!
Check LABEL_SELFREF for visible points only. It's currently checked for all points.