Closed pantherman594 closed 2 days ago
Hi, thanks for opening this issue; it's an interesting one! What's happening here is that I have two back-end servers ingesting model files for redundancy, and while they're mostly in sync, there is a bit of wiggle between them, which is what you're seeing. This is an unusually large jump in precipitation probability between runs, but it can sometimes happen depending on the run.
What's strange here is that the AWS load balancer out front should route requests to the same host, so shouldn't bounce back and forth like this. Can I ask if you're accessing this from behind a VPN, or how you're making the calls?
Quick additional thought: I'm pushing out v2.1.1 with an additional header giving a node-id to show which node it's from. I've wanted to add this for my own troubleshooting for a while, so this was the reason to do it!
While I'm not seeing this over a span of a couple of minutes, I am occasionally seeing different NBM runs just from querying the API in my browser.
I've also noticed that the NBM and GFS source times have gotten stuck again with the latest runs being from yesterday.
"sourceTimes": {
"hrrr_subh": "2024-08-16 12Z",
"hrrr_0-18": "2024-08-16 11Z",
"nbm": "2024-08-15 12Z",
"nbm_fire": "2024-08-15 12Z",
"hrrr_18-48": "2024-08-16 06Z",
"gfs": "2024-08-15 06Z",
"gefs": "2024-08-16 06Z"
},
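To make "stuck" concrete, here's a minimal Python sketch that flags models whose latest ingested run lags behind the current time. The timestamp format (`"YYYY-MM-DD HHZ"`) is taken from the responses in this thread; the 12-hour cutoff and the reference time are assumptions chosen purely for illustration.

```python
# Sketch: flag models whose latest ingested run is older than a threshold.
# The sourceTimes dict below is copied from the response above; the 12-hour
# staleness cutoff and the "now" value are assumptions for illustration.
from datetime import datetime, timedelta, timezone

source_times = {
    "hrrr_subh": "2024-08-16 12Z",
    "hrrr_0-18": "2024-08-16 11Z",
    "nbm": "2024-08-15 12Z",
    "nbm_fire": "2024-08-15 12Z",
    "hrrr_18-48": "2024-08-16 06Z",
    "gfs": "2024-08-15 06Z",
    "gefs": "2024-08-16 06Z",
}

def stale_models(source_times, now, max_age=timedelta(hours=12)):
    """Return the models whose run time lags `now` by more than `max_age`."""
    stale = {}
    for model, stamp in source_times.items():
        run = datetime.strptime(stamp, "%Y-%m-%d %HZ").replace(tzinfo=timezone.utc)
        if now - run > max_age:
            stale[model] = stamp
    return stale

now = datetime(2024, 8, 16, 13, tzinfo=timezone.utc)
print(stale_models(source_times, now))
# -> {'nbm': '2024-08-15 12Z', 'nbm_fire': '2024-08-15 12Z', 'gfs': '2024-08-15 06Z'}
```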
When I looked last night, the production API was the only one that was stuck (the development endpoint was working fine), but this morning both are stuck with outdated runs.
I saw it both across multiple cURL calls and from simply opening the URL in my browser. The examples pasted above were from the latter. I'm not accessing from behind a VPN or anything.
I initially noticed the difference from discrepancies between an application (Go, making HTTP GET requests) running on a VPS and local testing, which sounds like it's not unexpected, due to the load balancer. However, it seemed like once the local instance switched backend servers, the remote one did as well on the next call. I didn't test this extensively, so they might not have actually switched at the same time, but both were definitely switching.
I'm assuming the intermittent internal server errors and the other weird glitches on the API endpoint are due to you fixing this issue?
Yup, exactly that. Since it's an AWS infrastructure thing, there's a bunch of restarts involved. API endpoint should be stable now though!
Yup, everything seems stable now. Has this issue been fixed, or should we leave it open for the weekend to see if it pops up again?
The servers were restarted Saturday night, which fixed the inconsistent model source times and version number. I held off on closing this one initially to make sure the source times and version number stayed stable, which they have. Will close this issue for now, but if it pops up again we can re-open and investigate.
@alexander0042 This seems to be happening again.
Sometimes I see:
"sourceTimes": {
"hrrr_0-18": "2024-08-26 14Z",
"nbm": "2024-08-26 12Z",
"nbm_fire": "2024-08-26 06Z",
"hrrr_18-48": "2024-08-26 12Z",
"gfs": "2024-08-26 06Z",
"gefs": "2024-08-26 06Z"
},
and other times I see the updated run times:
"sourceTimes": {
"hrrr_subh": "2024-08-26 20Z",
"hrrr_0-18": "2024-08-26 19Z",
"nbm": "2024-08-26 18Z",
"nbm_fire": "2024-08-26 12Z",
"hrrr_18-48": "2024-08-26 18Z",
"gfs": "2024-08-26 12Z",
"gefs": "2024-08-26 12Z"
},
All I'm doing is querying the API in my browser and I get different results.
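The inconsistency above can be made explicit by diffing the two `sourceTimes` blocks. A minimal Python sketch, using the two responses pasted above (which node served which is unknown):

```python
# Sketch: diff the sourceTimes blocks of two responses to the same request,
# making the backend inconsistency visible. The two dicts below are copied
# from the responses pasted above.

stale = {
    "hrrr_0-18": "2024-08-26 14Z",
    "nbm": "2024-08-26 12Z",
    "nbm_fire": "2024-08-26 06Z",
    "hrrr_18-48": "2024-08-26 12Z",
    "gfs": "2024-08-26 06Z",
    "gefs": "2024-08-26 06Z",
}
fresh = {
    "hrrr_subh": "2024-08-26 20Z",
    "hrrr_0-18": "2024-08-26 19Z",
    "nbm": "2024-08-26 18Z",
    "nbm_fire": "2024-08-26 12Z",
    "hrrr_18-48": "2024-08-26 18Z",
    "gfs": "2024-08-26 12Z",
    "gefs": "2024-08-26 12Z",
}

def diff_source_times(a, b):
    """Return models missing from one response or reporting different run times."""
    return {
        model: (a.get(model), b.get(model))
        for model in sorted(set(a) | set(b))
        if a.get(model) != b.get(model)
    }

for model, (first, second) in diff_source_times(stale, fresh).items():
    print(f"{model}: {first} vs {second}")
```

Note that the stale response is missing `hrrr_subh` entirely, so the diff reports `None` on one side for it.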
Good catch, and fixing this now avoided an outage! I was doing some more work in support of self-hosting / improving performance by merging the syncing and response containers; however, one of the restarts sort of corrupted the file system and prevented ingests. I've restarted the misbehaving instance, so everything should be all synced up in about 30 minutes or so.
I'll keep an eye on it, thanks. If you're curious, this is what the history graph shows for the NBM update time sensor:
Seems to be good now, so I'll close.
@alexander0042 Seeing the issue again this evening. I'm seeing a mix of V2.2 and V2.3, but I see no difference between the two versions besides the source times.
"sourceTimes": {
"hrrr_subh": "2024-09-13 00Z",
"hrrr_0-18": "2024-09-12 23Z",
"nbm": "2024-09-12 23Z",
"nbm_fire": "2024-09-12 18Z",
"hrrr_18-48": "2024-09-12 18Z",
"gfs": "2024-09-12 18Z",
"gefs": "2024-09-12 18Z"
},
"nearest-station": 0,
"units": "ca",
"version": "V2.2"
V2.3 with outdated runs, and it's also missing HRRR subhourly:
"sourceTimes": {
"hrrr_0-18": "2024-09-12 18Z",
"nbm": "2024-09-12 15Z",
"nbm_fire": "2024-09-12 12Z",
"hrrr_18-48": "2024-09-12 18Z",
"gfs": "2024-09-12 12Z",
"gefs": "2024-09-12 12Z"
},
"nearest-station": 0,
"units": "ca",
"version": "V2.3"
}
From what I can tell, this issue seems to be fixed. I'm still getting a mix of versions, though.
Describe the bug
The same request switches between different model sourceTimes over the span of a few minutes, returning wildly different forecasts (e.g. 45% vs. 18% chance of rain in the hourly forecast).
Expected behavior
Consistent use of model runs across requests
Actual behavior
The first request (7:36 PM) uses nbm 2024-08-15 12Z and gfs 2024-08-15 06Z, with 45% chance of rain. The second request (7:38 PM) uses nbm 2024-08-15 15Z and gfs 2024-08-15 12Z, with 18% chance of rain. The third request (7:41 PM) uses the same models as the first, nbm 2024-08-15 12Z and gfs 2024-08-15 06Z with 45% chance of rain.
Request 1:
Request 2:
Request 3:
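The A/B/A flip-flop described above can be sketched by fingerprinting each response's model runs. The (nbm, gfs) pairs below are taken from the 7:36, 7:38, and 7:41 PM requests; the labeling scheme itself is just for illustration.

```python
# Sketch: reduce each response to a fingerprint of its model runs, then label
# the sequence of requests, exposing the A/B/A flip-flop between two backends.
# The observations below are the (nbm, gfs) pairs from the three requests above.

observations = [
    {"nbm": "2024-08-15 12Z", "gfs": "2024-08-15 06Z"},  # 7:36 PM, 45% rain
    {"nbm": "2024-08-15 15Z", "gfs": "2024-08-15 12Z"},  # 7:38 PM, 18% rain
    {"nbm": "2024-08-15 12Z", "gfs": "2024-08-15 06Z"},  # 7:41 PM, 45% rain
]

def label_backends(observations):
    """Assign a letter to each distinct sourceTimes fingerprint, in order seen."""
    seen = {}
    labels = []
    for obs in observations:
        key = tuple(sorted(obs.items()))
        if key not in seen:
            seen[key] = chr(ord("A") + len(seen))
        labels.append(seen[key])
    return labels

print(label_backends(observations))  # -> ['A', 'B', 'A']
```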
API Endpoint
Production
Location
Massachusetts
Other details
No response
Troubleshooting steps