TravelMapping / DataProcessing

Data Processing Scripts and Programs for Travel Mapping Project
4 stars 6 forks source link

concurrency detection memory leak #449

Closed yakra closed 3 years ago

yakra commented 3 years ago

Valgrind reports:

yakra commented 3 years ago

colocation lists

N S
pol.a001@26 pol.a001@27
pol.dk074@A1_N pol.dk074@A1_S
pol.e67@26(A1) pol.e67@27(A1)
pol.e67@29(S8) pol.e75@27(A1)
pol.e75@26(A1) pol.s008@27(A1)

pol.s008@26(A1) pol.s008@29

yakra commented 3 years ago
for (HighwaySystem* h : HighwaySystem::syslist)                 // h = eure
{cout << '.' << flush;                              // 
 for (Route* r : h->route_list)                         // r = pol.e67
  for (HighwaySegment* s : r->segment_list)                 // s = [POL E67 29(S8) 27(A1)]
   if (s->waypoint1->colocated && s->waypoint2->colocated)          // both colocated; proceed
    for ( Waypoint* w1 : *(s->waypoint1->colocated) )               // w1 = pol.a001@26
     if (w1->route != r)                            // diff route; proceed
      for ( Waypoint* w2 : *(s->waypoint2->colocated) )             // w2 = pol.a001@27
       if (w1->route == w2->route)                      // same route; proceed
       {HighwaySegment* other = w1->route->find_segment_by_waypoints(w1,w2);    // other = [POL A1 26 27]
        if (other)                              // proceed
         if (!s->concurrent)                            // First pass; s->concurrent still 0; proceed
         {s->concurrent = new list<HighwaySegment*>;                // new list
          other->concurrent = s->concurrent;                    // 2 pointers to the same list
          s->concurrent->push_back(s);                      // E67 S & A1 get E67 S
          s->concurrent->push_back(other);                  // E67 S & A1 get A1
          //concurrencyfile << "New concurrency [" etc.             // 
         }                                  // 
         else //...
       }
E67 S E67 N E75 A1 S8 S S8 N DK74
E67 S 0 0 E67 S 0 0 0
A1 A1
yakra commented 3 years ago
                                                                // w1 = pol.a001@26
      for ( Waypoint* w2 : *(s->waypoint2->colocated) )             // w2 = pol.dk074@A1_S, etc.
       if (w1->route == w2->route)                      // False for all remaining
E67 S E67 N E75 A1 S8 S S8 N DK74
E67 S 0 0 E67 S 0 0 0
A1 A1
yakra commented 3 years ago
                                                            // s = [POL E67 29(S8) 27(A1)]
   if (s->waypoint1->colocated && s->waypoint2->colocated)          // both colocated; proceed
    for ( Waypoint* w1 : *(s->waypoint1->colocated) )               // w1 = pol.dk074@A1_N
     if (w1->route != r)                            // diff route; proceed
      for ( Waypoint* w2 : *(s->waypoint2->colocated) )             // w2 = pol.dk074@A1_S
       if (w1->route == w2->route)                      // same route; proceed
       {HighwaySegment* other = w1->route->find_segment_by_waypoints(w1,w2);    // other = [POL DK74 A1_S A1_N]
        if (other)                              // proceed
         if (!s->concurrent)                            // assigned a value in prev pass; skip
      //...                                 //
         else if (!contains(*s->concurrent, other))             // not in list; proceed
         {other->concurrent = s->concurrent;                    // add list to DK74
          s->concurrent->push_back(other);                  // add DK74 to list
          //concurrencyfile << "Extended concurrency " etc.         // 
         }
       }
E67 S E67 N E75 A1 S8 S S8 N DK74
E67 S 0 0 E67 S 0 0 E67 S
A1 A1 A1
DK74 DK74 DK74
yakra commented 3 years ago
                                        // s = [POL E67 29(S8) 27(A1)]
    for ( Waypoint* w1 : *(s->waypoint1->colocated) )               // w1 = pol.e67@26(A1), pol.e67@29(S8)
     if (w1->route != r)                            // same route; skip
E67 S E67 N E75 A1 S8 S S8 N DK74
E67 S 0 0 E67 S 0 0 E67 S
A1 A1 A1
DK74 DK74 DK74

OK, I think I see where this is going... We probably want to proceed if (w1 != s->waypoint1), and detect POL E67 27(A1) 26(A1) when testing POL E67 29(S8) 27(A1). I'll continue down this rabbit hole in the morning & make sure I'm on the right track.

yakra commented 3 years ago

We continue on with POL E67 29(S8) 27(A1)... w1 = pol.e75@26(A1); w2 = pol.e75@27(A1); add list to E75; add E75 to list. w1 = pol.s008@26(A1); w2 = pol.s008@27(A1); add list to S8 S; add S8 S to list. w1 = pol.s008@29; w2 = pol.s008@27(A1); add list to S8 N; add S8 N to list.

E67 S E67 N E75 A1 S8 S S8 N DK74
E67 S 0 E67 S E67 S E67 S E67 S E67 S
A1 A1 A1 A1 A1 A1
DK74 DK74 DK74 DK74 DK74 DK74
E75 E75 E75 E75 E75 E75
S8 S S8 S S8 S S8 S S8 S S8 S
S8 N S8 N S8 N S8 N S8 N S8 N

We're done checking POL E67 29(S8) 27(A1). Time to check POL E67 27(A1) 26(A1)...


Why is PA different?

The first route processed is I-376, not the self-concurrent one. All other routes pass the if (w1->route != r) test. Once we're done processing I-376, the list is fully built out, with all segments pointing to it. When we move on past I-376 to process the rest of the routes, there's nothing left to do.

yakra commented 3 years ago
  for (HighwaySegment* s : r->segment_list)                 // s = [POL E67 27(A1) 26(A1)]
   if (s->waypoint1->colocated && s->waypoint2->colocated)          // both colocated; proceed
    for ( Waypoint* w1 : *(s->waypoint1->colocated) )               // w1 = pol.a001@27
     if (w1->route != r)                            // diff route; proceed
      for ( Waypoint* w2 : *(s->waypoint2->colocated) )             // w2 = pol.a001@26
       if (w1->route == w2->route)                      // same route; proceed
       {HighwaySegment* other = w1->route->find_segment_by_waypoints(w1,w2);    // other = [POL A1 26 27]
        if (other)                              // proceed
         if (!s->concurrent)                            // s->concurrent still 0; proceed
         {s->concurrent = new list<HighwaySegment*>;                // a 2nd list is created
          other->concurrent = s->concurrent;                    // [POL A1 26 27]'s list overwritten
          s->concurrent->push_back(s);                      // E67 N & A1 get E67 N
          s->concurrent->push_back(other);                  // E67 N & A1 get A1
          //concurrencyfile << "New concurrency [" etc.             // 
         }                                  // 
         else //...                             // 
       }
E67 S E67 N E75 A1 S8 S S8 N DK74
E67 S E67 N E67 S E67 N E67 S E67 S E67 S
A1 A1 A1 A1 A1 A1 A1
DK74 DK74 DK74 DK74 DK74
E75 E75 E75 E75 E75
S8 S S8 S S8 S S8 S S8 S
S8 N S8 N S8 N S8 N S8 N
yakra commented 3 years ago
                                        // s = [POL E67 27(A1) 26(A1)]
    for ( Waypoint* w1 : *(s->waypoint1->colocated) )               // w1 = pol.dk074@A1_S
     if (w1->route != r)                            // diff route; proceed
      for ( Waypoint* w2 : *(s->waypoint2->colocated) )             // w2 = pol.dk074@A1_N
       if (w1->route == w2->route)                      // same route; proceed
       {HighwaySegment* other = w1->route->find_segment_by_waypoints(w1,w2);    // other = [POL DK74 A1_S A1_N]
        if (other)                              // proceed
         if (!s->concurrent)                            // list exists since we found A1; skip
          //...                                 // 
         else if (!contains(*s->concurrent, other))             // not in list; proceed
         {other->concurrent = s->concurrent;                    // [POL DK74 A1_S A1_N]'s list overwritten
          s->concurrent->push_back(other);                  // add DK74 to list
          //concurrencyfile << "Extended concurrency " etc.         // 
         }
       }
E67 S E67 N E75 A1 S8 S S8 N DK74
E67 S E67 N E67 S E67 N E67 S E67 S E67 N
A1 A1 A1 A1 A1 A1 A1
DK74 DK74 DK74 DK74 DK74 DK74 DK74
E75 E75 E75 E75
S8 S S8 S S8 S S8 S
S8 N S8 N S8 N S8 N
yakra commented 3 years ago
                                        // s = [POL E67 27(A1) 26(A1)]
    for ( Waypoint* w1 : *(s->waypoint1->colocated) )               // w1 = pol.e67@27(A1)
     if (w1->route != r)                            // same route; skip

We continue on with [POL E67 27(A1) 26(A1)] as above and as above... w1 = pol.e75@27(A1); w2 = pol.e75@26(A1); ~add list to~ overwrite list in E75; add E75 to list. w1 = pol.s008@27(A1); w2 = pol.s008@26(A1); ~add list to~ overwrite list in S8 S; add S8 S to list. w1 = pol.s008@27(A1); w2 = pol.s008@29; ~add list to~ overwrite list in S8 N; add S8 N to list.

E67 S E67 N E75 A1 S8 S S8 N DK74
E67 S E67 N E67 N E67 N E67 N E67 N E67 N
A1 A1 A1 A1 A1 A1 A1
DK74 DK74 DK74 DK74 DK74 DK74 DK74
E75 E75 E75 E75 E75 E75 E75
S8 S S8 S S8 S S8 S S8 S S8 S S8 S
S8 N S8 N S8 N S8 N S8 N S8 N S8 N

At this point the 7 concurrent segments share 2 concurrency lists, each listing 6 segments:

We're done checking POL E67. Time to check POL E75...

yakra commented 3 years ago

The A1 & DK74 segments are already contained in POL E75 26(A1) 27(A1)'s concurrency list; nothing happens there. Moving on...

 for (Route* r : h->route_list)                         // r = pol.e75
  for (HighwaySegment* s : r->segment_list)                 // s = [POL E75 26(A1) 27(A1)]
   if (s->waypoint1->colocated && s->waypoint2->colocated)          // both colocated; proceed
    for ( Waypoint* w1 : *(s->waypoint1->colocated) )               // w1 = pol.e67@26(A1)
     if (w1->route != r)                            // diff route; proceed
      for ( Waypoint* w2 : *(s->waypoint2->colocated) )             // w2 = pol.e67@27(A1)
       if (w1->route == w2->route)                      // same route; proceed
       {HighwaySegment* other = w1->route->find_segment_by_waypoints(w1,w2);    // other = [POL E67 27(A1) 26(A1)]
        if (other)                              // proceed
         if (!s->concurrent)                            // list exists; skip
          //...                                 // 
         else if (!contains(*s->concurrent, other))             // not in the new list yet; proceed
         {other->concurrent = s->concurrent;                    // [POL E67 27(A1) 26(A1)]'s list overwritten
          s->concurrent->push_back(other);                  // add E67 S to list
          //concurrencyfile << "Extended concurrency " etc.         // 
         }
       }
And there's the memory leak. The last pointer to the original concurrency list is overwritten, making it inaccessible. Python doesn't care about this, because garbage collection. In C++, this should clearly be fixed. E67 S E67 N E75 A1 S8 S S8 N DK74
E67 N E67 N E67 N E67 N E67 N E67 N E67 N
A1 A1 A1 A1 A1 A1 A1
DK74 DK74 DK74 DK74 DK74 DK74 DK74
E75 E75 E75 E75 E75 E75 E75
S8 S S8 S S8 S S8 S S8 S S8 S S8 S
S8 N S8 N S8 N S8 N S8 N S8 N S8 N
E67 S E67 S E67 S E67 S E67 S E67 S E67 S

All 7 segments point to the new list, which has all 7 segments on it. The 2nd segment of E67 is counterintuitively at the end of the list, but it makes sense -- with E67 first in line for processing, it doesn't get picked up as concurrent with itself. The 2nd E67 segment isn't added to the list until we process another route it can be concurrent with. This is borne out in the HighwayGraph structure: load up the graph in the HDX and you'll see E67,A1,DK74,E75,S8,S8,E67 as the edge name.


The proposal this time is a bit different from that in #137; code should be leaner & cleaner, smaller diff overall, and should probably run a tiny bit faster. The results are the same, though -- self-concurrencies will be detected even for cases like UT190 where there's no other designation in the mix. As it should be, IMO; this keeps it consistent with how concurrencies are done in all other scenarios. Might be time to dust off https://forum.travelmapping.net/index.php?topic=2805 Making this change would beg for conforming changes in Python.

yakra commented 3 years ago

Graph Generation

https://github.com/TravelMapping/DataProcessing/blob/4d988b326b2c2069a71b43177c4c1721859ac0dd/siteupdate/cplusplus/classes/GraphGeneration/HGEdge.cpp#L23-L34

https://github.com/TravelMapping/DataProcessing/blob/3baef90ee431f4f3e85a39841e04986729288d3e/siteupdate/python-teresco/siteupdate.py#L1812-L1819

The graph generation process has changed since this check was first implemented. It's not as necessary as it once was. It was originally needed for US19TrkPit & pals, and probably written in response to exactly that if the timing of the original commit and Jim's post here are any indication. Before #136, segments with n>2 concurrent routes would have n-1 concurrency lists between them. In cases with self-concurrent routes, some of these lists would be incomplete. I explored the mechanics behind this in #135. The final concurrency lists for US19TrkPit & pals are in the 2nd table in this post. Back then, the HighwayGraph constructor would visit each segment in turn, check whether it's marked as visited, and if not, mark it for graph edge construction then mark all concurrent segments visited. I-376 got visited first, with its incomplete concurrency list, and the southern segment of US19TrkPit was not marked visited. Later on, US19TrkPit itself was visited, and marked for graph construction. Hence the need for the graph constructor to check for "reversed" edges. Now, since #135, all segments share the same concurrency list, fully populated. An edge is constructed for I-376, and we don't go on & attempt to construct another one for PA US19TrkPit I-376(69B)_S I-376(69A).

This check is still necessary for cases like UT190 when a route is concurrent with itself but no other routes. Neither UT UT190 UT190_A UT190_B nor UT UT190 UT190_C UT190_D get marked concurrent at all. After making the fix for this issue, this code can go away.

yakra commented 3 years ago

People who've marked both segments will see:

People who've marked only 1 segment will see:

yakra commented 3 years ago

penultimate rebase on BiggaTomato: 0313d7ba4c645887b1c9ba48ad1a8f6199f64a42