Closed cloneofghosts closed 1 month ago
!!! Seeing these errors on my end now as well !!! Investigating now
I'm guessing the investigation is causing the API to return an Internal Service Error?
Yeah, what's happening is that, for some strange reason, I'm getting connection reset errors when getting data from S3. Usually this isn't too much of an issue, since the requests just get repeated. However, the number of errors increased to the point where it's not recovering on its own, so I'm writing some error-catching code.
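For anyone following along, here is a minimal sketch of what retrying on connection resets with a capped number of attempts could look like. This is not the actual ingest code: `fetch_with_retry` and its parameters are hypothetical names, and it assumes the S3 read surfaces as Python's built-in `ConnectionResetError` (a real client library may wrap it in its own exception type).

```python
import time


def fetch_with_retry(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts is used up.

    fetch is any zero-argument callable that performs the download.
    Assumption: a dropped connection raises ConnectionResetError.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except ConnectionResetError:
            if attempt == max_attempts:
                # Give up and let the caller fail gracefully.
                raise
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` argument is injectable so the backoff can be exercised without actually waiting. Re-raising on the last attempt leaves it to the caller to decide whether to skip the run or fall back to another source.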
Ok, prod is back up (with new checks on ingest to fail gracefully now)
Was just about to make a comment that prod is back up but you beat me to it. So is this issue something that will sort itself out on its own?
Also seeing the same issue that you fixed this morning where I'm getting a mix of 2.0.5 and 2.0.6.
Yeah, I restarted one container to address that -86400 issue, but I'm waiting on NBM being ingested again before touching anything else :p
@alexander0042 I don't know if it's related to this issue or something else, but I started seeing "precipType": -999 in the currently and minutely sections. The minutely summary and icon are also broken:
"minutely": {
"summary": -999,
"icon": -999,
EDIT: I see NBM is fixed but HRRR disappeared.
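In case it helps with spotting this kind of thing, here is a rough sketch of a client-side check that walks a parsed response and lists every field holding the -999 value seen above. `find_sentinels` is a hypothetical helper, and the sample dictionary only mirrors the fields quoted in this comment, not a full API response.

```python
def find_sentinels(node, sentinel=-999, path=""):
    """Return the dotted paths of every value equal to sentinel."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            hits += find_sentinels(value, sentinel, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits += find_sentinels(value, sentinel, f"{path}[{index}]")
    elif not isinstance(node, bool) and node == sentinel:
        hits.append(path)
    return hits


# Sample built only from the fields quoted in this thread.
sample = {
    "currently": {"precipType": -999},
    "minutely": {"summary": -999, "icon": -999},
}
print(find_sentinels(sample))
```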
Ok, isolated the NBM issue to an ingest file that failed in a way I hadn't thought of. It's corrected, and everything seems to be moving correctly now.
Yup, NBM seems to be working again, though it looks like HRRR_subh has disappeared.
Win some lose some... fixing it now
@alexander0042 NBM seems to have gotten stuck again as the last update was the 15Z run from yesterday.
NBM Fire seems to have gotten stuck as well.
Yea, either NOAA or AWS is having issues moving the files over from one side to the other for NBM today. This created a ton of issues with updating, since the data was partially there and I'd assumed it would either all be there or none of it. Regardless, good reason to improve the code, since it gave me a reason to add some additional error checking and NOMADS fallback!
Good news is that the AWS bucket seems to be re-populating now, just as I finished the fallback plan, so it should be updating again shortly, as well as more resilient in the future.
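The "all there or none of it" assumption described above can be replaced with an explicit completeness check before ingest. The sketch below is only an illustration of that idea under stated assumptions: `choose_source` is a hypothetical function, the file names are placeholders, and the "primary" and "fallback" labels stand in for the AWS bucket and NOMADS respectively.

```python
def choose_source(expected, available_primary, available_fallback):
    """Pick a source that has every expected file for a model run.

    Returns (source_name, missing_files). source_name is None when
    neither source is complete, so the caller can skip the run
    instead of ingesting partial data.
    """
    expected = set(expected)
    missing_primary = expected - set(available_primary)
    if not missing_primary:
        return "primary", set()
    missing_fallback = expected - set(available_fallback)
    if not missing_fallback:
        return "fallback", set()
    return None, missing_primary


# Placeholder file names; the real listing would come from each source.
expected = ["f001", "f002", "f003"]
print(choose_source(expected, ["f001", "f002"], ["f001", "f002", "f003"]))
```

Returning the missing files along with the decision makes it easier to log exactly which part of a run was only partially uploaded.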
Can confirm that it's updating again. I'll leave this open for a day or two to make sure that things are working before closing.
Things seem to be working so I'll close this.
Describe the bug
I noticed yesterday that the NBM model has stopped integrating, with the last update being the 11Z run on 2024-05-13. I know the data sometimes stops integrating for a few hours and then fixes itself, but I checked this morning and I'm seeing the same time in the sourceTimes section, so it appears something is broken. I checked the status page that was linked in #191 and I see nothing for the NBM model, so I suspect the issue is on PW's end? The NBM Fire model is integrating without any issues, though I know that it's separate from the main NBM model.
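A check like the one described above can be automated. The following is a small sketch that flags models whose sourceTimes entry is older than a threshold. It assumes the timestamps use the "2024-05-13 11Z" form quoted in this report; the function names, the lowercase model keys, and the six-hour threshold are illustrative choices, not part of the API.

```python
from datetime import datetime, timedelta, timezone


def parse_source_time(value):
    """Parse a timestamp such as '2024-05-13 11Z' as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%d %HZ")
    return parsed.replace(tzinfo=timezone.utc)


def stale_sources(source_times, now, max_age=timedelta(hours=6)):
    """Return the sorted names of models older than max_age."""
    return sorted(
        name
        for name, value in source_times.items()
        if now - parse_source_time(value) > max_age
    )


# Illustrative values; only the NBM time comes from this report.
now = datetime(2024, 5, 14, 12, tzinfo=timezone.utc)
times = {"nbm": "2024-05-13 11Z", "hrrr": "2024-05-14 10Z"}
print(stale_sources(times, now))
```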
Expected behavior
NBM data should be integrating
Actual behavior
NBM data stopped integrating with the last update being 2024-05-13 11Z
API Endpoint
Production
Location
Ottawa, Ontario
Other details
No response
Troubleshooting steps