BJReplay / ha-solcast-solar

Solcast Integration for Home Assistant
Apache License 2.0
181 stars 34 forks source link

System health exception #203

Closed autoSteve closed 6 days ago

autoSteve commented 2 weeks ago

Discussed in https://github.com/BJReplay/ha-solcast-solar/discussions/185

Originally posted by **bert1111** October 17, 2024 Is this a solcast integration error or HA itself? I see this in the logs after updating solarman integration and rebooting. I am on the latest version of everything. `Logger: homeassistant.components.system_health Bron: components/system_health/__init__.py:89 integratie: Systeemstatus (documentatie, problemen) Eerst voorgekomen: 15:10:03 (3 gebeurtenissen) Laatst gelogd: 16:09:47 Error fetching info Traceback (most recent call last): File "/usr/src/homeassistant/homeassistant/components/system_health/__init__.py", line 89, in get_integration_info data = await registration.info_callback(hass) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/config/custom_components/solcast_solar/system_health.py", line 27, in system_health_info used_requests = coordinator.solcast.get_api_used_count() ^^^^^^^^^^^^^^^^^^^ AttributeError: 'mappingproxy' object has no attribute 'solcast'`
autoSteve commented 2 weeks ago

Continue here @bert1111.

autoSteve commented 2 weeks ago

To be honest, the system health return is currently terrible @BJReplay, and of almost no use anyway. I am inclined to drop used requests and the rooftop sites count, leaving can_reach for the URL. This would avoid the exception entirely.

Additionally, the old way of looking at system health is no more in the current HA version that I can find. It was of some use back in 2019/2020, then I forgot all about it.

async def system_health_info(hass: HomeAssistant) -> dict[str, Any]:
    """Get info for the info page."""
    coordinator: SolcastUpdateCoordinator =list(hass.data[DOMAIN].values())[0]
    used_requests = coordinator.solcast.get_api_used_count()

    return {
        "can_reach_server": system_health.async_check_can_reach_url(hass, SOLCAST_URL),
        "used_requests": used_requests,
        "rooftop_site_count": len(coordinator.solcast.sites),
    }
BJReplay commented 2 weeks ago

My gut says HA glitch.

My early call is that we will get no useful information allowing repro.

Agreed on dropping rooftops.

Almost agreed on used requests. Only reason why not is that it is a bit of a 429 tell-tale if it all goes horribly wrong (10 retries fail) and debug logging isn't on.

autoSteve commented 2 weeks ago

Agree on the highly likely glitch.

bit of a 429 tell-tale

Good point, but where can one see this information in the modern HA user interface or any diagnostic dump? Maybe I'm not looking hard enough... Get the API used from the sensor if needed. All of this info is included in the diagnostic dump, and this code seems a bit of a mostly-not-helpful relic.

Helpful would be a counter of the number of 429s hit for the past day, which should probably be in the diagnostic dump.

BJReplay commented 2 weeks ago

Agreed

bert1111 commented 2 weeks ago

I can reproduce it by going to Repairs=> System information afbeelding

BJReplay commented 2 weeks ago

Thanks, @bert1111, that is helpful, so can I.

autoSteve commented 2 weeks ago

Reproduced as well.

It'll be gone in the next version, quite possibly replaced by something actually helpful... Thanks @bert1111.

2024-10-22 19:31:44.191 ERROR (MainThread) [homeassistant.components.system_health] Error fetching info
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/system_health/__init__.py", line 89, in get_integration_info
    data = await registration.info_callback(hass)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/solcast_solar/system_health.py", line 27, in system_health_info
    used_requests = coordinator.solcast.get_api_used_count()
                    ^^^^^^^^^^^^^^^^^^^
AttributeError: 'mappingproxy' object has no attribute 'solcast'
autoSteve commented 2 weeks ago

It's useless, but at least it now works as is. This has been broken for a very long time...

image
autoSteve commented 2 weeks ago

And it's wrong! 😂 "API requests remaining" is actually API requests used so far for the day. Lols. No one looks at this.

And more Urdu translations will be required to extend it. 🤦🏻

Why did I not just let you silently close that discussion BJ?

BJReplay commented 2 weeks ago

Why did I not just let you silently close that discussion BJ?

🤷

I have relatively recently used system information when creating an issue on HA without it throwing that error (I think) - what was the fix?

Could a relatively recently made translation change have caused it?

autoSteve commented 2 weeks ago

Because no reason that no one will look at, I am going to modify this to record the server connection 'ok', and the total number of failed forecast fetches for the UTC day period, including retry attempts.

And to support competition I may also add the total number of 429s seen since records began, which will be v4.2.5.

@BJReplay, good point. What this code did before was a crap-shoot, and I definitely broke it by extending the volatile domain data stored, as one can. It got the very first list item [0] of the domain values converted to a list, which almost certainly now returns the last recorded entry options, hence the exception. So it was kind of broken forever, having worked as Oziee code, hoping that some hapless developer wouldn't stumble upon the bomb.

It was inelegant, and doomed to fail, assuming that the coordinator was the only item that would be in the list of volatile data.

async def system_health_info(hass: HomeAssistant) -> dict[str, Any]:
    """Get info for the info page."""
    domain_values = list(hass.data[DOMAIN].values())
    _LOGGER.debug(domain_values)
    coordinator = None
    for v in domain_values:
        if isinstance(v, SolcastUpdateCoordinator):
            coordinator: SolcastUpdateCoordinator = v

    used_requests = coordinator.solcast.get_api_used_count()

    return {
        "can_reach_server": system_health.async_check_can_reach_url(hass, SOLCAST_URL),
        "used_requests": used_requests,
        "rooftop_site_count": len(coordinator.solcast.sites),
    }
gcoan commented 2 weeks ago

System Information is clearly a little used corner of HA. Apart from various diagnostics from Home Assistant, Supervisor, Recorder, etc, the only non-HA components that show up in my System Information list are:

No other integrations or add-on's have bothered feeding it with data

autoSteve commented 2 weeks ago

(Python tute: Python dictionaries are inherently unordered, so when converted to a list you do not have a guaranteed order so you often sort with lambda on a dict key to ensure an order if that's what you need. The integration does this with forecast periods. What generally happens in reality is that the order that dict items are added tends to be retained in memory, unless adds/moves/changes so it often seems to work. I add the entry options before the coordinator gets instantiated, so entry options is almost always first-ish. Enter the almost always-ish exception.)

I know, @gcoan! (But there are some obscure ones out there that do.) Measuring 429s might be a fun project though.

Happy to be guided buy popular opinion. I honestly don't need more Urdu practice.

BJReplay commented 2 weeks ago

System Information may be little used, but is essential for raising issues on HA or any core integrations (as it is requested by the issue template).

That doesn't mean that we need to embrace it as a place to report data for our integration (Diagnostics are more obvious and discoverable), but I have used it half a dozen times and I have only been using HA for six months!

autoSteve commented 2 weeks ago

Re-thinking: I am going to drop the 'requests remaining', 'rooftop sites' and abandon all thoughts of Urdu translations.

The current information is misleading, and no one looks at it.

Forget 429 counts. Hopefully we can solve for them, so it will be a non-issue. (Warms up electric television...)

I will leave the code stub that gets the coordinator. Someone may find it useful in future.

BJReplay commented 2 weeks ago

Re-thinking

Hard Agree

As little as possible to fix is the right amount.

I'd consider pulling the entire system health hook but haven't yet read enough to know if that is good or bad.

Solcast Server reachable is not currently available anywhere else, but implicit in successful initial set up and forecast fetch.

Everything else is available elsewhere in diagnostics.

autoSteve commented 2 weeks ago
async def system_health_info(hass: HomeAssistant) -> dict[str, Any]:
    """Get info for the info page."""
    #for v in hass.data[DOMAIN].values():
    #    if isinstance(v, SolcastUpdateCoordinator):
    #        coordinator: SolcastUpdateCoordinator = v

    return {
        "can_reach_server": system_health.async_check_can_reach_url(hass, SOLCAST_URL),
    }
autoSteve commented 2 weeks ago
image
BJReplay commented 2 weeks ago

I'd consider pulling the entire system health hook but haven't yet read enough to know if that is good or bad.

I think that system health is entirely optional. It just isn't mentioned in the development documentation.

It does not appear in any of the examples.

It does not seem to be the way things are done are any more.

Just for shits and giggles I deleted system_health.py and restarted HA - of course it fixes the system information display, and the integration just keeps working perfectly.

github-actions[bot] commented 1 week ago

This issue is stale because it has been open 7 days with no activity. Remove stale label or comment or this will be closed in 2 days.