MickMake / GoSungrow

GoLang implementation to access the iSolarCloud API updated by SunGrow inverters.
https://mickmake.com/
GNU General Public License v2.0

GoSungrow should retry on server failures (with workaround) #104

Closed Paraphraser closed 6 months ago

Paraphraser commented 6 months ago

problem

In #101 at this comment I wrote:

If a crash turns up I'll add that. I have actually had one since the "upgrade" but I didn't capture it. I assumed it was a one off but if it recurs I'll definitely grab some evidence.

I've now had another crash:

2023/12/14 11:14:03 INFO: Syncing 205 entries with HASSIO from getPsDetail.
U?UUU??U?UUUUU??UUUUUUUU?CUU?U?UUU?UUU?UUU?U?UUUUU??UUU?UUUUU?UU?CUU?UUUU??U??UUU??UU?UU?UUU?U?UUUUUU?UU?UUUUUUU?UUU?CU?UU?UUU?UUUUCU?U??UUU?U?UUUU?U?UUUUUUUCUUU?UU??UUUUCUUUUCUU??UU?UUUUUUUUU?CUUUUU?U?UUU?UU?UU?U
2023/12/14 11:19:53 INFO: Syncing 205 entries with HASSIO from getPsDetail.
2023/12/14 11:19:53 INFO: Syncing 514 entries with HASSIO from queryDeviceList.
UUUUUU?UU?UU?UUU?U?U?UUU????UU??U?U??CU?UUUUCUUU?U?U??UUUUU?UU?U??U???UUUU?U?UCUCUUUUUUU???UUUUUUUUUUUUUU?UU??U?UUUUUUU?UUU??UU?U?UU?UUUCUUUU??UUCUUU?UUUUUUCU?UU?CUUU?UUUUCUU?UUUUUUUUUUUUUCUCU?UUUUUCUUU?CUU?CU??UUCUCUU?UU
U-UU-UU---U-UCU--UUU-UUU-U-UU-U--UCU--UU-
2023/12/14 11:19:53 INFO: Syncing 148 entries with HASSIO from getPsList.
UUUUUUUUUUUUUUU?UUUUUUUUUUUCUUUUUUUUUUU?UUUUUUUUUUUUUUCUUUUUUUUUUUUUUUUUU?U?UUUUUUUUUUUUUUUCUUUUUUUUUUUUCUUUUUUUUUUUUUUUU?UUUUUUUUUUUUUUUUUUUUUUUUU?UU?CU
PsId: required
JSON request:   {"ps_id":9999999}

2023/12/14 11:25:43 ERROR: getApi: API httpResponse is 500 Internal Server Error
Error: getApi: API httpResponse is 500 Internal Server Error
Usage:
  GoSungrow mqtt run [flags]

Aliases:
  run, 

Examples:
    GoSungrow mqtt run  

Flags: Use "GoSungrow help flags" for more info.

Additional help topics:

ERROR: getApi: API httpResponse is 500 Internal Server Error
s6-rc: info: service legacy-services: stopping
s6-rc: info: service legacy-services successfully stopped
s6-rc: info: service legacy-cont-init: stopping
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
s6-rc: info: service fix-attrs successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
s6-rc: info: service s6rc-oneshot-runner successfully stopped

My guess is that this is exactly what it seems: there was a transient problem at the other end. On receiving the 500 response, GoSungrow exited, and that stopped the container.

Other than the hassio_observer container, HomeAssistant has a policy of launching all containers without a restart policy. The reason for the policy is given here.

In effect, the policy places the responsibility on containers to do their best to keep truckin' and only ever exit if there's a solid reason. Therefore, ideally, GoSungrow should accept this responsibility and retry a reasonable number of times before throwing up its hands and exiting.

workaround

If you also find your GoSungrow add-on just stopping from time to time, you can use this workaround to tell Docker that the container should be kept alive:

  1. Start the container in the HA UI.

  2. Use the Advanced SSH & Web Terminal add-on to execute:

    $ GOSUNGROW=$(docker ps -a --format "table {{.Names}}" | grep gosungrow)
    $ docker update --restart unless-stopped $GOSUNGROW

  3. Confirm that the policy has been applied:

    $ docker inspect $GOSUNGROW | jq '.[0].HostConfig.RestartPolicy.Name'
    "unless-stopped"