yakra / DataProcessing

Data Processing Scripts and Programs for Travel Mapping Project
0 stars 0 forks source link

converting `HighwaySegment::clinched_by` to a bit field #221

Closed yakra closed 12 months ago

yakra commented 2 years ago

Once TravelerLists are in contiguous memory, try converting HighwaySegment::clinched_by to a bit field

Can the mutex be made more efficient?

Will this take up more or less RAM in practice? sizeof(std::unordered_set) is 40B in FreeBSD & 56B in Ubuntu. Plus whatever it allocates on the heap. 41 B for 328 users + 8 B for a pointer = 49 B. Sounds pretty efficient! :D

HighwaySegment destruction:

yakra commented 2 years ago

Things to do:

yakra commented 2 years ago

"Compressing collapsed edges" down ~30% on BT, ~40% on lab1 & lab2. Subgraphs slower. Total time way faster!

What's with the subgraphs? The only diff is when iterating thru e->segment->clinched_by -- without a set to iterate thru, we iterate thru the the bytes & bits for all TravelerLists, checking whether each bit is on. Add in a couplefew steps of pointer arithmetic to convert to a TravelerList*.

yakra commented 1 year ago

Things have progressed a bit since 5ffbfa7 in the "t3" branch. Rather than adding a bunch of new ad-hoc variables & functions in HighwaySegment, I've created a new iterable template <class item, class unit> class TMBitset that requires an array of item objects. Dereferencing an iterator returns an item*, meaning we can continue simple iteration via a range for loop. It may be possible to optimize TMBitset::iterator::operator++ for faster iteration.

yakra commented 1 year ago

Upon the reunification of compute_stats_r and compute_stats_t, the per-traveler part will iterate through an empty unordered_set. Hey, easy, no problem!

The TMBitset conversion will make it so that we need to iterate through a container full of zeros. Put another way, it'll be pretty expensive to find that foo.begin() == foo.end().


HighwaySegment construction:

yakra commented 12 months ago

Done in d9f0231. Merged in TravelMapping#592.

Some final comments & tying up loose ends:

Q: Can the mutex be made more efficient? A: This has its own issue now.

Q: Will this take up more or less RAM in practice? A: Less.

Q: HighwaySegment destruction: A: Needs to hang around long enough to do graph generation. KISS. When the stuff that owns the HighwaySegment and thus TMBitset is converted to TMArray, it all goes out of scope @ end of program & gets deleted.

Q: What's with the subgraphs? A: TMBitset Iteration is much more optimized. And cleaner, supporting range for loops.

Q: what would happen if I cast a pointer to a type with a different unit? A: Discussion here.

Q: Only allocate TMBitset space if active/preview? A: Nah. Only saves 901K. KISS and avoid the added complexity, however slight, of checking route->system->active_or_preview() and multiplying by the array size for every HighwaySegment constructed.