yakra / DataProcessing

Data Processing Scripts and Programs for Travel Mapping Project
0 stars 0 forks source link

HighwaySystem::total_mileage() #233

Open yakra opened 1 year ago

yakra commented 1 year ago

TravelerList::userlog calls HighwaySystem::total_mileage() twice for every user with mileage in that system. Lots of redundant summation. Additionally, it's possible that iterating thru an unordered_map may be expensive. Especially in FreeBSD? Instead, store a variable in the HighwaySystem object. Calculate during Route::compute_stats_r()?

Previous changes stashed @ 8ccb8b7c27e35351a1d6845a16dfa69ba1e523e6 2022-10-19

yakra commented 1 year ago

Calculate during Route::compute_stats_r()?

Can't exactly do that if Route::compute_stats_r() is going the way of the dodo. ;) Whether calculating in Route::compute_stats_r() or Region::compute_stats(), a mutex will be required in some fashion:

A more advanced variation: Routes in Region::routes are grouped by HighwaySystem. Take advantage of this. Instead of locking/unlocking for every segment, first HighwaySystem* prev_system = routes.size() ? routes[0]->system : nullptr;, then while iterating through routes,

if (r->system != prev_system)
{   prev_system->mtx.lock();
    prev_system->total_mileage += prev_mileage;
    prev_system->mtx.unlock();
}   // this could even be its own member function
prev_system = r->system;
prev_mileage = &system_mileage;

...and finally once more when done iterating (sans the last 2 lines of course).

Another simpler option: Calculate & store the data when writing highwaydatastats.log. Single-threaded, but less mucking about to get there. No mutexes. Shouldn't be any tangible amount slower than no-build. How's speed compare?

All that said, this is low priority. Previous benchmarking indicates this doesn't save a tangible amount of time anyway.

yakra commented 1 year ago

^ A framework for the "more advanced variation" to fit into. Tried this out in order to reduce unnecessary r->system->mileage_by_region[this] lookups, but it ran 0.0154 s slower on BiggaTomato @ 4 threads.

diff --git a/siteupdate/cplusplus/classes/Region/compute_stats.cpp b/siteupdate/cplusplus/classes/Region/compute_stats.cpp
index d438b16..c9137ae 100644
--- a/siteupdate/cplusplus/classes/Region/compute_stats.cpp
+++ b/siteupdate/cplusplus/classes/Region/compute_stats.cpp
@@ -9,8 +9,10 @@

 void Region::compute_stats()
 {   std::cout << '.' << std::flush;
+    HighwaySystem* prev_system = nullptr;
+    double* prev_mileage = nullptr;
     for (Route* const r : routes)
-    {  double& system_mileage = r->system->mileage_by_region[this];
+    {  double& system_mileage = r->system == prev_system ? *prev_mileage : r->system->mileage_by_region[this];
        for (HighwaySegment *s : r->segment_list)
        {       // always add the segment mileage to the route
                r->mileage += s->length;
@@ -56,6 +58,8 @@ void Region::compute_stats()
                        t->system_region_mileages[r->system][this] += s->length/system_concurrency_count;
                }
        }
+       prev_system = r->system;
+       prev_mileage = &system_mileage;

        // datachecks
        #define CSV_LINE r->system->systemname + ".csv#L" + std::to_string(r->system->route_index(r)+2)