Sadless74 / googletransitdatafeed

Automatically exported from code.google.com/p/googletransitdatafeed

Non-increasing shape_dist_traveled must not always be a fatal validation error #203

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Hi,

r1190 added a check to feedvalidator that
stop_times.txt/shape_dist_traveled is strictly increasing; if not, that is
a fatal error.  While validating some feeds from the real world with this
new code, I found that this check is probably too strict:

- There are feeds that put a non-empty value into
stop_times.txt/shape_dist_traveled even if the trip has no shape at all.  
Consumers of the data will, in all likelihood, simply ignore it, so a
warning seems strict enough.

- There are feeds in which two successive stop-times have the same value of
shape_dist_traveled.  For this case of equality, I'd also like to suggest
demoting the error to a warning, because it can happen as a side-effect of
limited precision and is not too hard to handle for consumers of the data,
and I saw it for a number of feeds that otherwise seemed ok.

While investigating all this, I noticed a shortcoming of the corresponding
error message (in _transitfeed.py, search for "has shape_dist_travled" (sic!)
near line 1926): it rounds this stop's shape_dist_traveled to fewer digits
than the previous one.  The rounded value may then appear larger than the
previous one, in which case the message becomes paradoxical.  It's tricky to
provide a good user experience here short of switching to package decimal to
preserve the actual input numbers -- maybe the best fix is to report the
previous value, the current value and their difference in a floating-point
format.
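
To illustrate the suggestion, here is a minimal sketch, not the actual
transitfeed code. The function name and message wording are made up for this
example, and it assumes Python 3, where repr() of a float gives the shortest
string that round-trips to the same value:

```python
def describe_distance_problem(prev_dist, cur_dist):
    """Build a message showing both values and their difference.

    Hypothetical helper, not part of transitfeed. Both values are
    formatted the same way (repr), so neither is rounded more than
    the other and the message cannot contradict itself.
    """
    diff = cur_dist - prev_dist
    return ("shape_dist_traveled=%r should be larger than the previous "
            "value %r (difference: %r)" % (cur_dist, prev_dist, diff))


# With fixed six-digit formatting, two distinct inputs print identically:
same_when_rounded = ("%f" % 5.5481061) == ("%f" % 5.548106)

message = describe_distance_problem(5.5481061, 5.548106)
```

Here same_when_rounded is True, which is how the paradox arises, while
message keeps both input values distinguishable.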

Original issue reported on code.google.com by arnoegw.code@gmail.com on 25 Nov 2009 at 4:01

GoogleCodeExporter commented 9 years ago
San Diego's feed gets this error as well, "Invalid value 5.5481061 in field
stoptimes.shape_dist_traveled. For the trip 6316410 the stop 91040 has
shape_dist_travled=5.548106, which should be larger than the previous ones.
In this case, the previous distance was 5.5481061."

Rounding also causes this warning to appear incorrect:
"In stop_times.txt, the stop with trip_id=6375282 and stop_sequence=11 has
shape_dist_traveled=8.790458, which is larger than the max
shape_dist_traveled=8.790458 of the corresponding shape (shape_id=530_1_12)"

Original comment by devin.br...@gmail.com on 25 Nov 2009 at 4:24

GoogleCodeExporter commented 9 years ago
I've had second thoughts about my initial report.  Turning the error into a
warning for successive equal values is probably the more useful part: the
dummy values I saw for trips without a shape were all constant (0), so they
would stop being errors as a side effect of the other change.  Maybe that's
good enough.

Original comment by arnoegw.code@gmail.com on 25 Nov 2009 at 5:47

GoogleCodeExporter commented 9 years ago

Original comment by quguangfan@gmail.com on 2 Dec 2009 at 10:30

GoogleCodeExporter commented 9 years ago
The bug is fixed with the following requirements:
Warn if two successive stop-times have the same value of shape_dist_traveled.
Error if values of the shape_dist_traveled are decreasing.
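
A rough sketch of that rule, for illustration only; the committed change is
in the revision linked below. The function name, the (index, severity) return
format, and the choice to skip stop-times without a value are assumptions made
for this example:

```python
def check_shape_dist_traveled(distances):
    """Classify problems in one trip's shape_dist_traveled values.

    distances: values in stop_sequence order; None means the field
    was empty for that stop-time (assumed to be skipped).
    Returns a list of (index, severity) tuples.
    """
    problems = []
    prev = None
    for index, dist in enumerate(distances):
        if dist is None:
            continue  # nothing to compare for this stop-time
        if prev is not None:
            if dist < prev:
                problems.append((index, "error"))    # decreasing
            elif dist == prev:
                problems.append((index, "warning"))  # repeated value
        # Assumption: always compare against the most recent value,
        # even if that value was itself flagged.
        prev = dist
    return problems


# Constant dummy values (trip without a shape) now only produce warnings.
dummy_trip = check_shape_dist_traveled([0, 0, 0])
mixed_trip = check_shape_dist_traveled([1.0, 2.0, 1.5, None, 1.5])
```

With these inputs, dummy_trip contains only warnings, and mixed_trip flags
the drop from 2.0 to 1.5 as an error and the repeated 1.5 as a warning.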

Review at:
http://codereview.appspot.com/161062/

Addresses bug:
http://code.google.com/p/googletransitdatafeed/issues/detail?id=203

Revision:
http://code.google.com/p/googletransitdatafeed/source/detail?r=1222

Original comment by quguangfan@gmail.com on 3 Dec 2009 at 2:36