yakra / DataProcessing

Data Processing Scripts and Programs for Travel Mapping Project
0 stars 0 forks source link

Check LABEL_SELFREF for visible points only #246

Closed yakra closed 9 months ago

yakra commented 1 year ago

Check LABEL_SELFREF for visible points only. It's currently checked for all points.

yakra commented 11 months ago

Looked at the effects of checking visible points only when doing some benchmarks for generalizing strdcmp, which got a mention at TravelMapping#600. Posting here to keep the spam out of the official more public repo.

At first I thought the branch misprediction penalty may outweigh the penalty of the extraneous datacheck, but now it's looking like that's probably not the case.

Existing strdcmp, tailored for use with LABEL_SELFREF specifically:

int strdcmp(const char* a, const char* b, const char d)
{   do  if (!*b) return *a == d ? 0 : *a-*b;
    while   (*a++ == *b++);
    return  *--a - *--b;
}

Generalized strdcmp, for wider applicability:

int strdcmp(const char* a, const char* b, const char d)
{   do  if  (!*b || *b==d) return (!*a || *a==d) ? 0 : *a;
        else if (!*a || *a==d) return -*b;
    while   (*a++ == *b++);
    return  *--a - *--b;
}

The benchmark goes between Sorting waypoints in Quadtree and Finding unprocessed wpt files:

size_t accum = 0;
cout << et.et() << "AccumBegin" << endl;
for (HighwaySystem& h : HighwaySystem::syslist)
  for (Route& r : h.routes)
    for (Waypoint& w : r.points)
      if (!w.is_hidden) // comment/uncomment this line as appropriate for benchmarking
      { const char* l = w.label.data() + (w.label[0] == '*');
    string rte = w.route->banner[0] == '-' ? w.route->route : w.route->name_no_abbrev();
    // first check for number match after a slash, if there is one
    const char* slash = strchr(l, '/');
    if (slash++ && strncmp(l, "Old", 3))
    {   const char* digits = rte.data();
        while (!isdigit(*digits) && *digits) ++digits;
        if (    digits != rte.data()+rte.size()
             && !strdcmp(slash, digits, '_') && !w.coloc_same_number(digits)
             || !strdcmp(slash, rte.data(), '_')
           )    ++accum;
    }
      }
cout << et.et() << "AccumEnd @ " << accum << endl;
return 0;//*/

(For use with older commits, & -> *, points -> point_list, and w. -> w-> as needed.)

The flavors of siteupdate_{P,R}{A,V}{S,G}:

The results: spec gen
ptr all .1387 .1397
ptr vis .1410 .1413
ref all .0876 .0873
ref vis .0869 .0875