Closed: cdwhitt closed this issue 9 months ago.
@idreyn Please review this when you have a few minutes as this goes back to our discussion from Thursday. Feel free to edit the description.
I think we should actually consider generating an `all_slow.json` for every day in history; there'd be too much data to squeeze into that one JSON file if we added a time series for every slow zone.
I'm missing some context here, but one file per day would require reading a lot of files to do time-series displays. Better would be one file per slow zone, but it's tricky because they can change in duration as our baseline drifts around, and we'd want 2 weeks of context before and after. Maybe we're not talking about per-slow-zone time-series displays, though; the travel-time graphs do that pretty well.
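For concreteness, here is a rough TypeScript sketch of what a one-file-per-slow-zone layout could look like under that second option. All names and fields are illustrative assumptions, not the actual schema:

```typescript
// Hypothetical shape for a one-file-per-slow-zone layout.
// Field names are invented for illustration; the real schema differs.
interface SlowZoneTimeseriesFile {
  id: string;          // stable identifier for the slow zone
  fromStopId: string;  // track segment endpoints
  toStopId: string;
  start: string;       // ISO date the zone was detected
  end: string | null;  // null while the zone is still active
  // Daily mean travel times covering the zone's lifetime plus
  // ~2 weeks of context before and after, so a chart can show the
  // baseline the zone emerged from and returned to.
  dailyTravelTimesSec: { date: string; value: number }[];
  baselineSec: number; // the (drifting) baseline the zone is measured against
}
```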
We had discussed creating a Dynamo table for this data, though I think that would incur a cost.
> I'm missing some context here, but one file per day would require reading a lot of files to do time-series displays.
I think that's true, but I also think we're already doing all of those reads in `analyze_for_slow()`. We'd just need to hold on to the un-averaged travel time and slow time a little longer.
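As a sketch of what holding onto those values might yield: `analyze_for_slow()` lives in the Python ingestion code, so this TypeScript interface only describes a possible output shape, with invented field names:

```typescript
// Hypothetical extension of an all_slow.json entry: alongside the
// averaged delay, each zone keeps the per-day values that the
// ingestion pipeline already reads before averaging them away.
interface SlowZoneWithTimeseries {
  fromStopId: string;
  toStopId: string;
  start: string;
  end: string | null;
  meanDelaySec: number; // the aggregate, as before
  dailyDelaysSec: { date: string; value: number }[]; // the un-averaged values
}
```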
> We had discussed creating a Dynamo table for this data, though I think that would incur a cost.
We tend to be pretty liberal about spinning up Dynamo tables without worrying about their cost!
Closing this, as it's needed here instead: https://github.com/transitmatters/slow-zones/issues/43
We need to simplify the `ByDirection<SlowZonesResponse[]>` to `ByDirection<SlowZoneResponse>`, assuming there's only one active slow zone per track segment. This requires verification by reviewing the code that generates `all_slow.json`.
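A rough sketch of that type change, assuming `ByDirection` is a simple two-direction wrapper; the actual type definitions in the dashboard code may differ:

```typescript
// Assumed wrapper type: one value per travel direction.
interface ByDirection<T> {
  northbound: T;
  southbound: T;
}

// Placeholder for the real response type; fields are illustrative only.
interface SlowZoneResponse {
  start: string;
  end: string | null;
  delaySec: number;
}

// Current: each direction carries an array of responses.
type CurrentShape = ByDirection<SlowZoneResponse[]>;

// Proposed: if at most one slow zone can be active per track segment
// and direction, the array collapses to a single value.
type ProposedShape = ByDirection<SlowZoneResponse>;
```

One question this sketch leaves open is how to represent a segment with no active slow zone (e.g. `SlowZoneResponse | null`); that's something the verification pass over the generating code would need to settle.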
The current `all_slow.json` doesn't contain historical values for every single day but aggregates over multiple days, which might lead to inaccuracies in duration for selected days.
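To make the concern concrete, a small invented example; the numbers and field names are illustrative only:

```typescript
// Aggregated entry, roughly as all_slow.json stores it today: one record
// spanning the zone's whole lifetime, with a single averaged delay.
const aggregated = {
  start: '2024-01-01',
  end: '2024-01-20',
  meanDelaySec: 45, // average over all 20 days
};

// If the delay actually ramped from ~20s up to ~70s over those 20 days,
// then reading the aggregate for any single selected day (always 45s)
// can be far from the true value on that day.
```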