Closed yakra closed 1 year ago
Solving the region.php rankings bug while maintaining good performance means reunifying `CompStatsRThread` + `CompStatsTThread` & iterating by region.
Nothing else iterates by region, so the new combined `CompStatsThread` stays; there's nothing else to refactor it into.
I'll close this issue and replace it with another one that reflects that reality soon.
OTOH, the LABEL_SELFREF datacheck requires that all .wpts are read and that all colocated points are detected. It has to happen after ReadWptThread. The only other options are NMPSearchThread & NMPMergedThread, which just don't make sense from an organizational standpoint. I think I'll leave well enough alone.
`compute_stats_r` is where it is simply because it was split off from `compute_stats_t`, and related items were left grouped together. Plus, what eventually became `RteIntThread` didn't exist yet. The datachecks were added in because it was a handy place to put them, in a multi-threaded job iterating thru systems & routes, after .list files were processed. The original plan was to have what became ABBREV_NO_CITY do stuff based on whether a .list name was in use. So, ABBREV_AS_BANNER was added, with the idea that when it eventually came time to implement ABBREV_NO_CITY, I'd just slap an `else` after the whole shebang and go to town. By the time it was implemented, I'd decided against checking .list names in use because there's no way to automatically know whether the error is due to transposed data as opposed to a missing city or extraneous abbrev. TL;DR, this doesn't have to be where it is, after .list processing. It can be refactored into `HighwaySystem::route_integrity`. With each thread now doing other stuff the majority of the time, that could mean less competition for the regional mutexes, and a win for parallelism. (OTOH, doing more stuff at once could mean increased cache misses and thus decreased performance.)
`compute_stats_r` into `route_integrity`, where it already iterates: https://github.com/yakra/DataProcessing/blob/16721240fb5c3e117b1701b62bfcda6a10fb0c61/siteupdate/cplusplus/classes/HighwaySystem/route_integrity.cpp#L10 We could then get rid of `CompStatsRThread` and the code in `main` that processes it.
`route_integrity` is where it is because the original switch to efficient-but-destructive one-time case-smashing on AltLabels required it to happen after NMP processing, in order to preserve case in nearmisspoints.log & nmpfps.log and allow FP entries to match without big changes to nmpfps.log. AltLabels have since been removed from these logs entirely, meaning we can now smash case anywhere. The errorcheck for Routes without a ConnectedRoute was lumped in to take advantage of the existing iteration through systems & routes.
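To illustrate why the one-time case-smashing is "efficient-but-destructive", here is a minimal sketch. The `Waypoint` struct, its `alt_labels` member and the `smash_case` function are hypothetical stand-ins for illustration, not the actual siteupdate classes. The labels are upper-cased in place exactly once, so later comparisons need no per-lookup conversion, but the original case is gone afterward. That is why the step originally had to wait until nothing downstream needed the original case.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for a waypoint; NOT the real siteupdate class.
struct Waypoint {
    std::string label;                    // primary label, left untouched here
    std::vector<std::string> alt_labels;  // alternate labels to be case-smashed
};

// Upper-case every alt label in place, one time only.
// Destructive: the original mixed-case text cannot be recovered afterward.
void smash_case(Waypoint& w) {
    for (std::string& a : w.alt_labels)
        std::transform(a.begin(), a.end(), a.begin(),
                       [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
}
```

Since the function touches only the one waypoint it is handed, it could in principle be called from whichever stage ends up owning this step, single- or multi-threaded, as long as no two threads process the same waypoint.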
`Route::read_wpt` (where the UAL set was originally populated) would help out during initial "dry runs" where `hwy_data/` isn't cached, by performing this task while waiting for disk access. This could also ameliorate the potential cache miss effects mentioned above.
`Route` ctor and into `route_integrity` (effectively putting it back in its old location after concurrency detection) would move it from a single to multi-threaded part of the program. (Cache misses back, potentially?)
`shared_mutex`, maybe?
`HighwaySystem` construction, exact mechanics TBD.
`HighwaySystem` objects, followed by a threaded `read_csv` function, similar to what's done with .wpt files now.
`systems.csv` lines & have a threaded function iterate thru that & call the ctor.
`HighwaySystem` ctor & get rid of `route_integrity` entirely.
`read_wpt` into the `HighwaySystem` ctor. We'd then be up 1 thread (system construction) & down 3 (ReadWptThread, RteIntThread, CompStatsRThread), and doing a lot more stuff while waiting for disk access.
Low priority
`Route` branch, DUPLICATE_LABEL, pointsinuse & unusedaltlabels bugfixes ought to happen first.
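For reference, one of the options floated above (a threaded function iterating thru systems.csv lines & calling the ctor) might look roughly like the sketch below. Everything in it is an assumption for illustration: the `HighwaySystem` struct is a toy stand-in whose ctor only pulls the first semicolon-delimited field, and `build_systems` is a hypothetical helper, not existing siteupdate code. The exact mechanics are still TBD.

```cpp
#include <atomic>
#include <cstddef>
#include <memory>
#include <string>
#include <thread>
#include <vector>

// Toy stand-in; the real HighwaySystem ctor does far more work.
// Assumes the system name is the first ';'-delimited field of the line.
struct HighwaySystem {
    std::string systemname;
    explicit HighwaySystem(const std::string& csv_line)
        : systemname(csv_line.substr(0, csv_line.find(';'))) {}
};

// Hypothetical helper: construct one HighwaySystem per stored line.
// Each worker claims the next unprocessed index via an atomic counter and
// writes only to its own slot, so the results need no mutex and the
// output order matches the order of the input lines.
std::vector<std::unique_ptr<HighwaySystem>>
build_systems(const std::vector<std::string>& lines, unsigned num_threads) {
    std::vector<std::unique_ptr<HighwaySystem>> systems(lines.size());
    std::atomic<std::size_t> next{0};
    auto worker = [&] {
        for (std::size_t i = next.fetch_add(1); i < lines.size(); i = next.fetch_add(1))
            systems[i] = std::make_unique<HighwaySystem>(lines[i]);
    };
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < num_threads; ++t) pool.emplace_back(worker);
    for (std::thread& t : pool) t.join();  // all slots are filled once every worker has joined
    return systems;
}
```

Keeping the slots in input order means anything that depends on the order of systems.csv would see the same sequence as with single-threaded construction. Whether real ctors doing disk reads would actually benefit is something only timing runs can answer.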