TravelMapping / DataProcessing

Data Processing Scripts and Programs for Travel Mapping Project

generate .nmp files along with graphs #211

Open yakra opened 5 years ago

yakra commented 5 years ago

June 16, 2017, @michih wrote:

I think about a concurrency check during the site update process to reduce the number of .nmp file entries.

Bad redundancy.

1 pair of unique point locations

me.us002@ME139 44.712893 -69.788732 FP
me.us002@ME8_S 44.7129 -69.7892 FP

balloons out to 9 point pairs, because each location in this example has 3 colocated points:

me.me008@US2_E 44.7129 -69.7892 FP
me.me139@US2 44.712893 -69.788732 FP
me.me008@US2_E 44.7129 -69.7892 FP
me.us002@ME139 44.712893 -69.788732 FP
me.me008@US2_E 44.7129 -69.7892 FP
me.us201altsko@ME139 44.712893 -69.788732 FP
me.me139@US2 44.712893 -69.788732 FP
me.us002@ME8_S 44.7129 -69.7892 FP
me.me139@US2 44.712893 -69.788732 FP
me.us201altsko@ME8_S 44.7129 -69.7892 FP
me.us002@ME139 44.712893 -69.788732 FP
me.us002@ME8_S 44.7129 -69.7892 FP
me.us002@ME139 44.712893 -69.788732 FP
me.us201altsko@ME8_S 44.7129 -69.7892 FP
me.us002@ME8_S 44.7129 -69.7892 FP
me.us201altsko@ME139 44.712893 -69.788732 FP
me.us201altsko@ME139 44.712893 -69.788732 FP
me.us201altsko@ME8_S 44.7129 -69.7892 FP

Just having 2 lines in the NMP file would be cleaner and more usable, but that would break the nmpbyregion filter as it's used now. A solution is to create .nmp files during graph generation.
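The blow-up above is simple arithmetic: 3 colocated points at one location times 3 at the other gives 3 × 3 = 9 pairs, all describing the same pair of coordinates. As a rough sketch of the idea only, the pairs can be collapsed by keying on the two locations rather than on the point labels. The `NmpPoint` struct and `unique_location_pairs` function are hypothetical names for illustration, not part of siteupdate, and exact comparison of coordinates assumes colocated points carry identical parsed values, as they do in the example.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record for one point line of an .nmp file (label, lat, lng).
struct NmpPoint {
    std::string label;  // e.g. "me.us002@ME139"
    double lat;
    double lng;
};

using Location = std::pair<double, double>;    // (lat, lng)
using NmpPair = std::pair<NmpPoint, NmpPoint>; // two near-miss points

// Keep only the first pair seen for each unordered pair of locations.
std::vector<NmpPair> unique_location_pairs(const std::vector<NmpPair>& pairs) {
    std::set<std::pair<Location, Location>> seen;
    std::vector<NmpPair> result;
    for (const NmpPair& p : pairs) {
        Location a{p.first.lat, p.first.lng};
        Location b{p.second.lat, p.second.lng};
        if (b < a) std::swap(a, b);  // make the key independent of listing order
        if (seen.emplace(a, b).second) result.push_back(p);
    }
    return result;
}
```

Fed the 9 pairs listed above, this would return a single pair, i.e. 2 lines in the file. It does not address the nmpbyregion concern, which is why the thread proposes doing the work during graph generation instead.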

Advantages:

A few of the variables & functions already in place will be helpful; this shouldn't be too terribly difficult.

This would address part of #150.

nearmisspoints.log, nmpfps.log, and nmp_merged .wpt files can otherwise stay as they are, though there are possibilities for improving those in the future too.

yakra commented 5 years ago

Progress:

For now, the existing tm-master.nmp is still written the old way, for comparison purposes, in logs/. The new one is written in graphs/, and is looking good. Only a few differences:

[LOOKS INTENTIONAL] pairs: fixed a bug where pairs were flagged LI inappropriately.

[MARKED FP] pairs are a little more robust. Originally,

nor.harvtv@+X381735 60.421426 7.210926
nor.harvtv@+X885397 60.421458 7.210937

was not flagged FP, because that would require an entry in nmpfps.log for nor.harvtv@+X381735, which we don't have. The new way checks for FP entries for any colocated point at the first listed vertex (which we have for nor.rv007 +X381735), yielding

nor.harvtv@+X381735 60.421426 7.210926 FP
nor.harvtv@+X885397 60.421458 7.210937 FP
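The difference between the two FP checks can be illustrated with a small sketch. Everything here is an assumption made for illustration: `ColocatedVertex`, both function names, and especially the representation of FP entries as a set of bare "root@label" strings. The actual nmpfps.log line format is not shown in this thread, and this is not the siteupdate implementation.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a graph vertex: every point sharing one location.
struct ColocatedVertex {
    double lat;
    double lng;
    std::vector<std::string> points;  // "root@label" of each colocated point
};

// Old check as described above: only the listed point itself is looked up.
bool is_fp_listed_point_only(const std::string& listed,
                             const std::unordered_set<std::string>& fp_points) {
    return fp_points.count(listed) > 0;
}

// New check as described above: an FP entry for any colocated point counts.
bool is_fp_any_colocated(const ColocatedVertex& v,
                         const std::unordered_set<std::string>& fp_points) {
    for (const std::string& p : v.points)
        if (fp_points.count(p) > 0) return true;
    return false;
}
```

With an FP entry present only for nor.rv007@+X381735, the first function rejects nor.harvtv@+X381735 while the second accepts the vertex that holds both points.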

Next up, replace the traditional "root@label" style point labels with vertex labels from the graph structure. More useful IMO.

ToDo: Small changes to Route::write_nmp_merged were required. Gotta make sure I didn't break anything.

yakra commented 5 years ago

The new graphs/tm-master.nmp has 3150 lines, compared to 8130 lines in the old logs/tm-master.nmp. Still a bit laggy, but considerably faster in the HDX.

yakra commented 5 years ago

Time difference:

C++ on BiggaTomato:

Edit: Total graph generation (including .nmp files) takes 4 to 7.3 s longer, 6 s on average. This is without a fix for missing border NMPs as described below.

yakra commented 5 years ago

Bug?

ME113Cha@NH113B_N&NH113B@ME113_N 44.201084 -71.00583 FP
ME113Fry/ME113Cha@ME/NH 44.20086 -71.00575 FP

is included in NH-region.nmp, but not ME-region.nmp. This is because the "primary" point, the one whose vertex name comes first asciibetically, is in NH rather than in ME or on the border (i.e., in both regions). A workaround should be fairly easy, but could slow things down a tiny bit depending on how it's done.
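One possible shape for such a workaround, sketched under assumptions rather than taken from the code: choose the output regions from both vertices of a pair instead of only the primary one. `NmpVertex` and the three functions below are hypothetical names, and the region sets are assumed to be already known for each vertex.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical vertex: its graph label plus the regions of its colocated points.
struct NmpVertex {
    std::string name;
    std::set<std::string> regions;
};

// "Primary" vertex: the one whose name sorts first by character value.
const NmpVertex& primary_vertex(const NmpVertex& a, const NmpVertex& b) {
    return (b.name < a.name) ? b : a;
}

// Behaviour described in the bug: only the primary vertex's regions are used.
std::set<std::string> regions_primary_only(const NmpVertex& a, const NmpVertex& b) {
    return primary_vertex(a, b).regions;
}

// Sketch of a workaround: use the union of both vertices' regions.
std::set<std::string> regions_either_vertex(const NmpVertex& a, const NmpVertex& b) {
    std::set<std::string> out = a.regions;
    out.insert(b.regions.begin(), b.regions.end());
    return out;
}
```

For the pair above, the primary-only version yields just NH, while the union yields both ME and NH. The extra set merge per pair is the kind of small slowdown mentioned.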

yakra commented 5 years ago

Oh boy. I've spent a whole day working on this, before realizing that NMP pairs with both points only in devel systems won't be detected... :(

Edit: Stashing these changes on a separate branch. Going to leave this alone for a while until I can figure out a sensible way to be sure devel-only points are included.

It seems a little kludgey, but the beginnings of one idea are to set aside an unordered_set of devel-only points here, for processing later on...

yakra commented 4 years ago

Despite the mention in #285, this is still not a big priority for me.