Closed yakra closed 2 months ago
This all sounds good to me. Should we go live?
Yes! Let's go live before I get any more crazy ideas like making a string out of the invalid chars to go in the Info
column.
Don't forget https://github.com/TravelMapping/Web/pull/797.
While I have your attention, any thoughts on https://github.com/TravelMapping/DataProcessing/pull/640#issuecomment-2359975548?
@michihdeu referenced this pull request in https://github.com/TravelMapping/RailwayData/commit/a31f435952c05057889b57c1d9979e57288eb720
It appears I was tagged there, but for some reason my browser isn't rendering the page. :( What did I miss?
https://github.com/TravelMapping/RailwayData/pull/187#issuecomment-2323107288 This does run a tiny tiny bit faster on BiggaTomato. Not making any pretty pictures this time though. :) There will be another pull request in the Web repo to accompany this one.
This ended up being a bit of a rabbit hole. In a good way.
Among the invalid characters prohibited from labels are control characters. The chances of these appearing in a .wpt are pretty low, but I decided to go the extra mile to handle them just in case. We can't put these in an ErrorList message because attempting to print them to the terminal could make stuff go boom. Truncating the label or skipping or replacing the offending characters sounds good at first, but that can leave us with no info on what the offending label is or where to find it in the file. For a more robust solution, I opted to show the offset in .wpt file. If the :poop: hits the fan, load the file into your favorite hex editor and have fun. It works, even if it's not convenient, and was pretty simple to code.
Doing this involved moving the LABEL_INVALID_CHAR check into the Waypoint constructor, where we have the info needed to calculate the offset on hand. This is where things got interesting.
Looking at the results of the 1st draft, I saw:
diff _37c5/logs/datacheck.log logs/datacheck.log
Let's take a look at the file:
Of course, comments are not supported in .wpt files. So what's really happening here?
AveCal
is the primary label.http://www.openstreetmap.org/?lat=31.902864&lon=-116.541553
looks like a URL, but is really the first AltLabel.#name assumed from Google
looks like a comment, but is really 4 separate AltLabels.Maps
looks like a comment, but is really a malformed URL....And sure enough, in datacheck.log:
mexbc.mex003ens;AveCal;;;MALFORMED_URL;MISSING_ARG(S)
Maybe not the most helpfulInfo
text. I could see someone asking, "What the hell? It's a perfectly fine URL, nothing missing!" So I've changed MALFORMED_URL to read like:mexbc.mex003ens;AveCal;;;MALFORMED_URL;Maps
mexbc.mex003ens;Ruiz;;;MALFORMED_URL;Junction
...More helpful to see what it is that siteupdate's trying to parse as the URL. It'll only say "MISSING_ARG(S)" now if the string is too long to fit in the DB, or has characters that shouldn't go in it like backslashes, quotes or control chars.Why weren't the new errors flagged before? Before, there was a
Waypoint::label_invalid_char()
function, called on complete Waypoint objects. Problem was, without a valid URL, there's no way to get valid coords, so the line gets thrown out, Waypoint never created. Now? Errors get flagged as label strings get validated right from the originalwptdata
buffer, before the URL is parsed, before a Waypoint is created/thrown out.Altogether, the new and changed error reports should give a better indication what's going on with mexbc.mex003ens.