Closed yakra closed 6 months ago
/*static*/ void HighwaySystem::VertexThread(unsigned int id, std::mutex* mtx)
{ //printf("Starting HighwaySystem::VertexThread %02i\n", id); fflush(stdout);
while (it != syslist.end())
{ for (mtx->lock(); it != syslist.end(); it++)
if (it->is_subgraph_system) break;
if (it == syslist.end()) return mtx->unlock();
HighwaySystem& h = *it++;
mtx->unlock();
//std::cout << '.' << std::flush;
for (Route& r : h.routes) // Yes, a system can get the same vertex multiple times
for (Waypoint& w : r.points) // from different routes, or loop endpoints.
h.vertices.push_back(w.hashpoint()->vertex);// But HighwayGraph::write_subgraphs_tmg gets rid
} // of any redundancy when making the final set.
}
@jteresco, I came across something interesting when working on this.
Here's the relevant part of a work-in-progress Region::VertexThread
:
for (Route* r : rg.routes) // Yes, a region can get the same
for (Waypoint& w : r->points) // vertex multiple times from
if (w.is_or_colocated_with_active_or_preview()) // different routes, or loop endpoints.
{ Waypoint* p = w.hashpoint(); // But HighwayGraph::write_subgraphs_tmg gets rid
rg.vertices.emplace_back(p->vertex, p); // of any redundancy when making the final set.
}
It produces the same* graphs as existing siteupdate. (*meaning, same vertices & edges, but in a different order. Run them thru canonicaltmg and there's no diff.)
Now compare what I did originally:
for (Route* r : rg.routes) // Yes, a region can get the same
if (r->system->active_or_preview()) // vertex multiple times from
for (Waypoint& w : r->points) // different routes, or loop endpoints.
{ Waypoint* p = w.hashpoint(); // But HighwayGraph::write_subgraphs_tmg gets rid
rg.vertices.emplace_back(p->vertex, p); // of any redundancy when making the final set.
}
This does the active/preview check at the Route level, without having to check every single Waypoint.
Diffs (HwyData @ 1f85330
):
Worldwide, the bottom version omitted 5 points in 4 graphs in 3 regions. Check`em out in the HDX.
M3@COD/ZMB
andM3Muf@ZMB/COD
in the narrow southeastern part of the country that protrudes into Zambia.RN4@BDI/COD
about a mile N of Lake Tanganyika. Gotta zoom in a bit to see it clearly.H10@NPL/IND
about halfway along the India/Nepal borderAH21@VNM/KHM
is easy to spot at the default zoom level in my browser.These vertices were excluded because there's no active/preview Waypoint in the relevant region itself.
They "do and do not" belong in these regions. I can see arguments for or against including them.
Looking at this another way...
Here's part of the HGVertex
constructor in the current production code:
https://github.com/yakra/DataProcessing/blob/fd96ac64ba2eff23600e86df8dcc84e94cb0a916/siteupdate/cplusplus/classes/GraphGeneration/HGVertex.cpp#L30-L35
Looks like siteupdate.py originally behaved the same way -- the a/p check wasn't done, and region graphs included vertices with only devel points there.
If this were instead
for (Waypoint *w : *(wpt->colocated))
{ // will consider hidden iff all colocated waypoints are hidden
if (!w->is_hidden) visibility = 2;
if (w->route->system->active_or_preview())
{ w->route->region->add_vertex(this, wpt);// Yes, a region/system can get the same vertex multiple times
w->route->system->add_vertex(this); // from different routes. But HighwayGraph::write_subgraphs_tmg
} // gets rid of any redundancy when making the final set.
}
...or if the original siteupdate.py had had an equivalent check, we'd be getting the same results observed above.
Any opinion on whether these points should be included or excluded?
The more I think about it, the more I lean toward exclusion. We should be consistent in the "active/preview only" approach, and treat devel systems like they don't exist at all. Oh, and simpler more efficient code is a plus too! ;)
Canonical Waypoint Name simplification is something to consider too -- "M3@COD/ZMB" indicates just one route (as there was only 1 route in the a/p coloc list). And M3 only exists in ZMB, not COD.
If the codn
devel system didn't exist, M3@COD/ZMB
would exist, with the same name, in ZMB only.
For curiosity's sake, here's an alternate "check every point" solution that inlines the functionality of is_or_colocated_with_active_or_preview()
and hashpoint()
, using one colocated
check that'd otherwise be done for each. Also saves the conditional on the is_or_coloc...
call itself.
for (Route* r : rg.routes)
for (Waypoint& w : r->points)
if (w.colocated)
{ for (Waypoint *c : *w.colocated)
if (c->route->system->active_or_preview())
{ Waypoint* p = w.colocated->front();
rg.vertices.emplace_back(p->vertex, p);
break;
}
}
else if (w.active_or_preview())
rg.vertices.emplace_back(w.vertex, &w);
In reality, there's little to no point in using this. Better to just go whole hog & either...
else // bring in devel points coloc with a/p at regional boundaries
for (Waypoint& w : r->points)
if (w.colocated)
for (Waypoint *c : *w.colocated)
if (c->route->system->active_or_preview())
{ Waypoint* p = w.colocated->front();
rg.vertices.emplace_back(p->vertex, p);
break;
}
In addition to not having to check every point, for this 2nd loop we know these Waypoints are devel, thus we know we have to look for a/p in the coloc list. Nothing to do for singleton points.
Another reason I think omitting the vertices is justified: In graphs restricted by system, edge names only include the route names from the relevant systems. This is similar in that info not relevant to the graph's restrictions is omitted.
Let's just not think too hard about the counterexample, vertex unique_name
s. ;) These are the same across the board, in tm-master & all subgraphs, regardless of any restrictions by region or system. Doing otherwise would require a lot of extra computation for little questionable gain. :)
Using region/system TMBitsets for subgraph membership will require a separate pass through those objects to allocate space after the number of vertices is known. It could happen as early as here, then proceeding to populate as we do now, with
HGVertex
construction callingadd_vertex
(which would just need to do a TMBitset insertion instead of a vector emplacement). ...But that's dumb. If weshrink_to_fit
, there'd still need to be another pass after population. Better yet...add_vertex
functions.unique_name
sets & mutexes?compute_stats
and HighwaySystem'sUbuntu_16_04_7_LTS
padding?~Ubuntu_16_04_7_LTS
padding.TMBitset::shrink_to_fit()
A benefit here of one pass after construction is, we shrink one region/system's bitset before allocating the next. We don't allocate a boatload of extra RAM in an initial pass that takes up space while vertices are created. What's faster?shrink_to_fit
with min & max hints after 1 passsize_t align = min_index%ubits; min -= align; len += align;
shrink_to_fit
work on the TMBitset itself with no extra data passed in. Just find the first & last set bit and off we go. Hey, at least the branching while we're finding them will be 100% predictable!