transitmatters / mbta-performance

For processing performance data for the data dashboard
MIT License

Pyarrow crashes when service is run before parquet is available #5

Closed devinmatte closed 7 months ago

devinmatte commented 7 months ago

When the service runs at 5am, before the Parquet data is available, it crashes with:

pyarrow.lib.ArrowInvalid: Could not open Parquet input source '<Buffer>': Parquet magic bytes not found in footer. Either the file is corrupted or this is not a parquet file.

We should catch that exception and exit early when there isn't yet a Parquet file for us to process.

devinmatte commented 7 months ago

I think this is solved now: https://app.datadoghq.com/apm/error-tracking/issue/813bc390-f586-11ee-9dd9-da7ad0900002?query=env%3Aprod%20service%3Ambta-performance&refresh_mode=paused&view=spans&from_ts=1713083503507&to_ts=1713173503507&live=false