yakra / DataProcessing

Data Processing Scripts and Programs for Travel Mapping Project

user log diffs #45

Closed yakra closed 4 years ago

yakra commented 5 years ago

jerseyman4.list ASCII text, with CRLF line terminators

1c1
< Log file created at: 2019-01-11 02:56:19.683818
---
> Log file created at: Fri Jan 11 03:44:42 2019
6c6
< Note: deprecated route name I-270_SPUR -> canonical name I-270SprAsh in line MD I-270_SPUR I-495 2
---
> Note: deprecated route name I-270_SPUR -> canonical name I-270SprAsh in line MD I-270_SPUR I-495 2 
22c22
< Note: deprecated route name I-840FutWGr -> canonical name I-840 in line NC I-840FutWGr 1 3
---
> Note: deprecated route name I-840FutWGr -> canonical name I-840 in line NC I-840FutWGr 1 3 
50c50
< Note: deprecated route name US72_E -> canonical name US72Cha in line TN US72_E AL/TN I-24(152)
---
> Note: deprecated route name US72_E -> canonical name US72Cha in line TN US72_E AL/TN I-24(152) 

Spaces (?) at end of lines in the C++ version.

https://github.com/yakra/DataProcessing/blob/652efe3135ff6e65520a05c371d1a4dcce339c90/siteupdate/cplusplus/classes/TravelerList/TravelerList.cpp#L91-L96

endchar is in amongst a sea of \0s, not whitespace, so the line is never trimmed. WontFix.
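For reference, a minimal sketch of what trimming could look like if this were ever revisited. It is not the code in TravelerList.cpp: `rtrim_line` is a hypothetical helper, and treating NUL bytes as trimmable alongside whitespace is an assumption based on the "sea of \0s" observation above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not taken from TravelerList.cpp: returns `line`
// without trailing spaces, tabs, CR, LF or NUL bytes.
std::string rtrim_line(std::string line) {
    // The explicit length (5) is needed so that '\0' is part of the set;
    // without it the constructor would stop at the first NUL.
    static const std::string trailing(" \t\r\n\0", 5);
    const std::size_t last = line.find_last_not_of(trailing);
    if (last == std::string::npos) {
        line.clear();          // the line held nothing but padding
    } else {
        line.erase(last + 1);  // keep everything up to the last real character
    }
    return line;
}
```

Because NUL is in the trimmed set, the scan no longer stops at the first \0 it meets and can reach the whitespace that sits in front of the padding.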

yakra commented 5 years ago

lowenbrau.list data

1c1
< Log file created at: 2019-01-11 02:56:19.536042
---
> Log file created at: Fri Jan 11 03:44:42 2019
52,53c52
< Waypoint label(s) not found in line: ON ON405 ON/USA PorRd
< Processed 1686 good lines marking 21054 segments traveled.
---
> Processed 1687 good lines marking 21056 segments traveled.
55,56c54,55
< Overall in active systems: 30414.33 of 881553.84 mi (3.5%)
< Overall in active+preview systems: 30414.51 of 1203407.92 mi (2.5%)
---
> Overall in active systems: 30414.83 of 881553.84 mi (3.5%)
> Overall in active+preview systems: 30415.01 of 1203407.92 mi (2.5%)
91a91
> ON: 0.50 of 7509.95 mi (0.0%), 0.50 of 7509.95 mi (0.0%)
124c124,128
< System canonf (active) overall: 0.00 of 1244.18 mi (0.0%)
---
> System canonf (active) overall: 0.50 of 1244.18 mi (0.0%)
> System canonf by route (traveled routes only):
> ON405: 0.50 of 5.93 mi (8.5%)
>  (ON ON405 only)
> System canonf connected routes traveled: 1 of 19 (5.3%), clinched: 0 of 19 (0.0%).
3875c3879
< Traveled 47 of 225 (20.9%), Clinched 0 of 225 (0.0%) active systems
---
> Traveled 48 of 225 (21.3%), Clinched 0 of 225 (0.0%) active systems

https://github.com/TravelMapping/DataProcessing/issues/41#issuecomment-450941199

The final line, ON ON405 ON/USA PorRd, is followed by 1486 null bytes (\0) before a final CRLF. This will require a fix in siteupdate.py.
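To illustrate why NUL padding can make a label lookup fail in one implementation and pass in another, here is a small sketch. It is an assumption about the mechanism, not the actual siteupdate.py fix: `strip_nuls` and `padded_label` are hypothetical helpers, and the idea is only that a length-aware string keeps the padding while a C-string view stops at the first \0.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: removes every NUL byte from a line or field.
std::string strip_nuls(std::string line) {
    line.erase(std::remove(line.begin(), line.end(), '\0'), line.end());
    return line;
}

// Hypothetical helper: rebuilds the last field as described above,
// i.e. "PorRd" followed by 1486 NUL bytes.
std::string padded_label() {
    return std::string("PorRd") + std::string(1486, '\0');
}
```

With these, `padded_label() != "PorRd"` holds because the lengths differ, while both `strip_nuls(padded_label())` and `std::string(padded_label().c_str())` compare equal to "PorRd".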

yakra commented 5 years ago

neilbert.list ASCII text, with CRLF line terminators

1c1
< Log file created at: 2019-01-11 02:56:19.492385
---
> Log file created at: Fri Jan 11 03:44:42 2019
5c5
< Incorrect format line: MEX MEX307 MEX180D (Tulum) *
---
> Unknown region/highway combo in line: MEX MEX307 MEX180D (Tulum) *

C++: the * gets destroyed by strtok, resulting in 4 fields, and a correctly formatted line which we then attempt to parse.

Since the line goes unprocessed in both cases, I'm not too hung up on fixing it, but if I want to, some old code (TravelerList.cpp?) should still be kicking around that doesn't strip * via strtok.
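A minimal sketch of the strtok behavior described above. `split_fields` is a hypothetical helper, and the delimiter sets passed to it are assumptions for illustration. The only thing taken from this thread is that * ends up being treated as a delimiter.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: splits a copy of `line` with std::strtok.
// `delims` is the delimiter set; the sets used in this thread's examples
// are assumptions, not copied from TravelerList.cpp.
std::vector<std::string> split_fields(const std::string& line, const char* delims) {
    std::vector<char> buf(line.begin(), line.end());
    buf.push_back('\0');  // strtok needs a writable, NUL-terminated buffer
    std::vector<std::string> fields;
    for (char* tok = std::strtok(buf.data(), delims); tok != nullptr;
         tok = std::strtok(nullptr, delims)) {
        fields.push_back(tok);  // leading, trailing and repeated delimiters are skipped
    }
    return fields;
}
```

With `" *"` as the delimiter set, `MEX MEX307 MEX180D (Tulum) *` splits into 4 fields, because the trailing * is consumed as a delimiter. With `" "` alone it splits into 5, and the last field is `*`.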

yakra commented 5 years ago

ntallyn.list ASCII text, with CRLF line terminators

1c1
< Log file created at: 2019-01-11 02:56:19.723202
---
> Log file created at: Fri Jan 11 03:44:42 2019
8c8
< Incorrect format line: LA LA3152
---
> Incorrect format line: LA LA3152 
11c11
< Unknown region/highway combo in line: *SC SC11 (?) I-85*
---
> Waypoint label(s) not found in line: *SC SC11 (?) I-85*

Incorrect format line: LA LA3152
Extra space at end in the C++ version.

*SC SC11 (?) I-85*
Python: No *SC region, ergo Unknown region/highway combo
C++: * stripped by strtok -> SC SC11 (?) I-85 -> Waypoint label (?) not found
Not too fussed about fixing this either; an invalid line is an invalid line. Ideally, this would just be a .list file comment.

yakra commented 5 years ago

rebelgtp.list ASCII text, with CRLF line terminators

1,2c1,2
< Log file created at: 2019-01-11 14:21:07.403324
< Incorrect format line: KY I-71
---
> Log file created at: Fri Jan 11 14:15:18 2019
> Incorrect format line: KY I-71 
5c5
< Incorrect format line: CA
---
> Incorrect format line: CA 

Space(s) at end of line again. See first post. WontFix.

yakra commented 5 years ago

spinoza.list ASCII text, with CRLF line terminators

1,2c1,2
< Log file created at: 2019-01-11 14:21:07.374633
< Unknown region/highway combo in line: **DEU-BW L173 B500_S AllStr_W
---
> Log file created at: Fri Jan 11 14:15:17 2019
> Waypoint label(s) not found in line: **DEU-BW L173 B500_S AllStr_W

**DEU-BW L173 B500_S AllStr_W
Python: No **DEU-BW region, ergo Unknown region/highway combo
C++: * stripped by strtok -> DEU-BW L173 B500_S AllStr_W -> Waypoint label AllStr_W not found
Not too fussed about fixing this either (or am I?); an invalid line is an invalid line. Ideally, this would just be a .list file comment.

yakra commented 5 years ago

valedc03ls.list ASCII text, with CRLF line terminators

1c1
< Log file created at: 2019-01-11 14:21:07.230943
---
> Log file created at: Fri Jan 11 14:15:17 2019
32c32
< Unknown region/highway combo in line: DEU A115 1 9
---
> Unknown region/highway combo in line: DEU A115 1 9 

Space(s) at end of line again. See first post. WontFix.

yakra commented 5 years ago

list file | disposition
jerseyman4 | WontFix
lowenbrau | Done: https://github.com/TravelMapping/DataProcessing/pull/176
neilbert | Treat as 5 fields? (No. Consider * as whitespace.)
ntallyn | WontFix
rebelgtp | WontFix
spinoza | WontFix? https://github.com/TravelMapping/HighwayData/commit/d3cc92b45264a8ae0a7bfb1fd08a5faef73ad7a5#diff-920c2aec98c9258fb3e02488d783253a https://github.com/TravelMapping/UserData/commit/26e0653380fb5a35db753e62fbe494b1f1307bc6#diff-3d4c8a05cff360be33292fd05167dbcf
valedc03ls | WontFix

yakra commented 5 years ago

oscar

diff -r /home/yakra/TravelMapping/yakra/DataProcessing/siteupdate/python-teresco/logs/users/oscar.log /home/yakra/TravelMapping/yakra/DataProcessing/siteupdate/cplusplus/logs/users/oscar.log
7660c7660
< ME103: 7.99 of 15.52 mi (51.4%)
---
> ME103: 7.98 of 15.52 mi (51.4%)

Python and C++ are known to produce different floating-point values in the less significant digits. I'm assuming the difference here straddles that magical 7.985 mi rounding boundary, but I'm looking into this in more detail.

Edit: Grabbed coords from WPTedit, and used LibreOffice to recreate the steps of the Waypoint::distance_to(Waypoint *other) function. The result for Oscar's mileage on ME103 is 7.98499996119456 mi, so it looks like that's exactly what's happening. A freak of statistics & rounding.

Fix: double rlat1 = lat * pi/180; -> double rlat1 = lat * (pi/180);
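A short sketch of why the parentheses matter and how a tiny difference shows up in the log. The names here are hypothetical, the literal used for pi is an assumption (the original definition of `pi` is not shown in this thread), and so is the two-decimal `%.2f` formatting of the mileage.

```cpp
#include <cstdio>
#include <string>

// Assumed value of pi; the original code's definition is not shown here.
const double kPi = 3.141592653589793;

// `lat * pi/180` parses as `(lat * pi) / 180`: multiply, round, divide, round.
double to_rad_left_to_right(double lat) { return lat * kPi / 180; }

// `lat * (pi/180)`: the constant is divided and rounded once, then multiplied.
double to_rad_grouped(double lat) { return lat * (kPi / 180); }

// Formats a mileage with two decimals (assumed to match the log format).
std::string format_miles(double miles) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "%.2f", miles);
    return buf;
}
```

The two conversions are algebraically equal but round at different steps, so they can differ in the last bits. That is normally invisible, but a total of 7.98499996119456 mi formats as 7.98, while a value a hair above 7.985 formats as 7.99.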

yakra commented 4 years ago

Reconsider items flagged "wontfix".

neilbert.log
1c1
< Unknown region/highway combo in line: MEX MEX307 MEX180D (Tulum) *
---
> Incorrect format line: MEX MEX307 MEX180D (Tulum) *

ntallyn.log
6c6
< Waypoint label(s) not found in line: *SC SC11 (?) I-85*
---
> Unknown region/highway combo in line: *SC SC11 (?) I-85*

spinoza.log
1c1
< Waypoint label(s) not found in line: **DEU-BW L173 B500_S AllStr_W
---
> Unknown region/highway combo in line: **DEU-BW L173 B500_S AllStr_W

yakra commented 4 years ago

A pending fix for https://github.com/TravelMapping/DataProcessing/issues/41 will close this.