Open lenaschimmel opened 4 years ago
We probably found the reason for this bug:
Normally the primary key of the `predictions` table should prevent multiple db rows for the same departure event. Our insert or replace is based on this assumption. However, the primary key contains both the `route_id` and `trip_id` columns.
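For the VBN data source, we often get schedule updates where the same actual trip keeps the same `route_id`, but gets another `trip_id`. We make a scheduled prediction with the old `trip_id`, and a week later, we make a real time prediction with the new `trip_id`. Because the `trip_id` is part of the primary key, the second insert does not replace the first row, and both rows stay in the table.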
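The effect can be reproduced in isolation. The following is only a minimal sketch, assuming a simplified, hypothetical schema (the real `predictions` table has more columns and may run on a different database engine; the `kind` column and all sample values are made up for illustration). It uses SQLite's `INSERT OR REPLACE` to show that two rows survive when only the `trip_id` differs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified schema: as described above, the primary key
# contains both route_id and trip_id.
conn.execute("""
    CREATE TABLE predictions (
        route_id        TEXT NOT NULL,
        trip_id         TEXT NOT NULL,
        trip_start_date TEXT NOT NULL,
        trip_start_time TEXT NOT NULL,
        stop_sequence   INTEGER NOT NULL,
        kind            TEXT NOT NULL,  -- made-up column: 'schedule' or 'realtime'
        PRIMARY KEY (route_id, trip_id, trip_start_date,
                     trip_start_time, stop_sequence)
    )
""")

def upsert(row):
    # Replaces an existing row only if the whole primary key matches.
    conn.execute(
        "INSERT OR REPLACE INTO predictions VALUES (?, ?, ?, ?, ?, ?)", row
    )

# Same real-world departure, but the schedule update changed the trip_id.
upsert(("R1", "trip_old", "2020-06-01", "08:15:00", 3, "schedule"))
upsert(("R1", "trip_new", "2020-06-01", "08:15:00", 3, "realtime"))

row_count = conn.execute("SELECT COUNT(*) FROM predictions").fetchone()[0]
print(row_count)  # two rows for one departure event
```

If `trip_id` were not part of the key in this sketch, the second call would overwrite the first row instead of adding a new one.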
On the stop page, we use the local `is_duplicate` function inside `generate_stop_page` to filter out the old, schedule based prediction.
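To illustrate the kind of filtering meant here, the following sketch drops a schedule based prediction whenever a real time prediction exists for the same departure, ignoring the `trip_id`. This is not the actual `is_duplicate` implementation; the `Prediction` type, the `realtime` flag, the helper names and the choice of key fields are all assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Prediction:
    # Hypothetical record type; the field set is an assumption.
    route_id: str
    trip_id: str
    trip_start_date: str
    trip_start_time: str
    stop_sequence: int
    realtime: bool

def event_key(p):
    # Identify the departure event WITHOUT the trip_id, since the
    # trip_id may change between schedule versions.
    return (p.route_id, p.trip_start_date, p.trip_start_time, p.stop_sequence)

def drop_stale_schedule_predictions(predictions):
    # Keep every real time prediction, and keep a schedule based one
    # only if no real time prediction covers the same event.
    realtime_keys = {event_key(p) for p in predictions if p.realtime}
    return [
        p for p in predictions
        if p.realtime or event_key(p) not in realtime_keys
    ]

old = Prediction("R1", "trip_old", "2020-06-01", "08:15:00", 3, realtime=False)
new = Prediction("R1", "trip_new", "2020-06-01", "08:15:00", 3, realtime=True)
other = Prediction("R2", "trip_x", "2020-06-01", "09:00:00", 1, realtime=False)

kept = drop_stale_schedule_predictions([old, new, other])
print([p.trip_id for p in kept])
```

A filter of this shape could in principle also run in the importer, before rows are written, rather than at page generation time.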
On the trip page, we use `get_prediction_for_first_line` and already filter by `source`, `event_type`, `stop_sequence`, `trip_id`, `trip_start_date` and `trip_start_time`. Here, we can't catch the row with the deviating `trip_id`.
This could be solved in the `monitor` module, but it might be easier / cleaner to catch it in the `predictor`, or actually in the `importer`, since predictions are made during import.
This could be a separate fix, but would also be done if #10 was implemented.
Now that #10 is done, this issue persists. We could verify that for practically all buses in Brunswick, we get predictions with matching scheduled times and different `trip_id`s. We still believe that this occurs when a new version of the schedule is downloaded.
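A check of this kind can be expressed as a grouping query. The sketch below is an assumption about how such a verification might look, not the query that was actually run: it uses an in-memory SQLite table with a reduced, hypothetical column set and made-up sample rows, and lists every departure event that appears under more than one `trip_id`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced, hypothetical version of the predictions table.
conn.execute("""
    CREATE TABLE predictions (
        route_id        TEXT,
        trip_id         TEXT,
        trip_start_date TEXT,
        trip_start_time TEXT,
        stop_sequence   INTEGER
    )
""")
conn.executemany(
    "INSERT INTO predictions VALUES (?, ?, ?, ?, ?)",
    [
        ("R1", "trip_old", "2020-06-01", "08:15:00", 3),
        ("R1", "trip_new", "2020-06-01", "08:15:00", 3),
        ("R2", "trip_x", "2020-06-01", "09:00:00", 1),
    ],
)

# Group by everything that identifies the departure except trip_id,
# and report groups that contain more than one distinct trip_id.
suspects = conn.execute("""
    SELECT route_id, trip_start_date, trip_start_time, stop_sequence,
           COUNT(DISTINCT trip_id) AS trip_ids
    FROM predictions
    GROUP BY route_id, trip_start_date, trip_start_time, stop_sequence
    HAVING COUNT(DISTINCT trip_id) > 1
""").fetchall()

for row in suspects:
    print(row)
```

Comparing the creation times of the rows in each reported group against the schedule download times would be one way to test the suspected cause.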
For some trips, a high-quality prediction is shown on the stop page (e.g. "E/S") but on the matching trip page, a prediction with lower precision (e.g. "P/S-") is shown.
High precision: (screenshot)
Low precision: (screenshot)
We don't know yet when/why it is happening.