tbnobody / OpenDTU

Software for ESP32 to talk to Hoymiles/TSUN/Solenso Inverters
GNU General Public License v2.0

Web GUI does not load reliably any more #1000

Open AloisKlingler opened 1 year ago

AloisKlingler commented 1 year ago

What happened?

After the update to the latest 23.6.1 the webgui starts failing to load after some time.

The API call for fetching the live data JSON still works flawlessly (and fast). A power cycle helps, but after some time the web GUI is again not reachable any more.

This does not happen with 23.5.31. I downgraded to this version and it stays reachable also with web gui.

Wifi signal is between -72 and -78.

To Reproduce Bug

I tried to narrow it down, but I could not. 😞

Expected Behavior

Webgui should stay available.

Install Method

Pre-Compiled binary from GitHub

What git-hash/version of OpenDTU?

23.6.1

Relevant log/trace output

No response

Anything else?

No response

tbnobody commented 1 year ago

Nothing was changed regarding the web server etc.: https://github.com/tbnobody/OpenDTU/compare/v23.5.31...v23.6.1 Only the sunset calculation was changed. Have you tried clearing your browser cache or using a private tab?

AloisKlingler commented 1 year ago

I saw the changeset, and that's why I reverted and re-flashed the new version several times, to see whether it is firmware-related. Yes, different browsers, different end devices too, ...

tbnobody commented 1 year ago

Did you try to power cycle the ESP or did you just use the web based reboot feature?

AloisKlingler commented 1 year ago

As I cannot reach the GUI, I need to power cycle. The ESP is mounted in an IP65 case outside the house, so I have to do this via the circuit breaker.

iukbox commented 1 year ago

My live view also shows nothing any more. The rest of the GUI works. A restart via the software is not possible. An error message appears: "Fehler beim Interpretieren der Daten" ("Error while interpreting the data"). A hard reset is currently not possible, since I am on vacation and only have VPN access.

tbnobody commented 1 year ago

My live view also shows nothing any more. The rest of the GUI works. A restart via the software is not possible.

Completely different behavior.... In your case the web interface is in the browser cache and the API is not working anymore. Which version are you running?

iomax commented 1 year ago

Wifi signal is between -72 and -78.

Just based on personal experience (see #836): I noticed that a signal strength below -60/-65 could trigger strange behaviour on the OpenDTU side. Nonetheless, any AP/router and ESP32 board mix may behave differently, and I haven't updated to the latest FW yet.

iukbox commented 1 year ago

I just installed the current version to be safe. Now the live data is displayed, but the DTU no longer has a connection to the inverter. PS: now the DTU is not reachable at all any more. I can only provide real data again in a week, when I am back on site.

On 5 June 2023, 13:32, tbnobody wrote:

My live view also shows nothing any more. The rest of the GUI works. A restart via the software is not possible.

Completely different behavior.... In your case the web interface is in the browser cache and the API is not working anymore. Which version are you running?


AloisKlingler commented 1 year ago

@tbnobody I have reverted commits 114ebb2 and 5558dff (the "under the hood" changes of 23.6.1) and it now works for me again. if somebody else wants to try: firmware.zip

tbnobody commented 1 year ago

But then it's only a problem with your browser, because these data just get downloaded to your browser and executed there. This has nothing to do with the code on the ESP.

AloisKlingler commented 1 year ago

If I had not tried several browsers and several end devices (mobile phone, Win11 client, two different Win10 clients, Debian 11 Linux), I would be of the same opinion. It's really weird. I can also see the live data JSON coming through reliably.

AloisKlingler commented 1 year ago

The joy was very short-lived ... this also breaks after a while. :-( In the browser debug tools I can see: http://192.168.0.247/js/app.js net::ERR_CONTENT_LENGTH_MISMATCH 200 (OK)

DejanBukovec commented 1 year ago

I can confirm that there is some issue with showing live data in the browser... It stops showing data in Firefox and Edge... Firefox: image

Edge: image

After a refresh (CTRL+F5) it shows the text "Live Data" and a progress circle for a few ms, and then they disappear...

tbnobody commented 1 year ago

And I can confirm that it's working for > 5 days without any problems: image

But this does not help. What do you see in the serial console (not the web based one) at the moment when you refresh the page?

DejanBukovec commented 1 year ago

In monitor I see only this:

```
17:38:15.582 > Websocket: [/livedata][211] connect
17:38:16.293 > Websocket: [/livedata][210] disconnect
```

tbnobody commented 1 year ago

If you enter the development tools of your browser, there should be a tab called "network" where you can see all the requests your browser is performing. Following background:

DejanBukovec commented 1 year ago

Network looks OK, without errors: image

But in the console I get this error: image

tbnobody commented 1 year ago

The 150 bytes for the WebSocket response look wrong. This would also explain the error in the console. I would bet that if you click on the livedata line in the network tab and then look at the response data (maybe a separate tab), you will see an empty or null response. The question is just why. If there were too little memory, you should see a message in the serial output.

DejanBukovec commented 1 year ago

Do you mean this: image

Just an idea: could it be password related, because my password includes special characters, among them "@"?

AloisKlingler commented 1 year ago

Do you mean this: image

Just an idea: could it be password related, because my password includes special characters, among them "@"?

My password does not contain any special characters.

DejanBukovec commented 1 year ago

I also tried clearing the cookies: the DTU lost dark mode and the language setting, and logged me out. Then I logged in again and it didn't help. Enabling "Allow readonly access to web interface without password" does not help either.

tbnobody commented 1 year ago

Hm, no, cookies etc. don't have any impact on the backend. The frontend is completely independent of the backend. It just communicates using the API endpoints and exchanges JSON data. If the WebSocket reply stays empty, are you still able to call the /api/livedata/status endpoint, and do you get a proper response?

Are you running any other software which accesses this endpoint very often?

AloisKlingler commented 1 year ago

In my case Node-RED queries /api/livedata/status every second. If I do this in the browser, I normally get a speedy response. Sometimes it fetches one half and the other half comes a moment later (but without a timeout). And sometimes only "null" is displayed.

DejanBukovec commented 1 year ago

In my case the DTU is not in production. I have enabled MQTT and integrated it into Home Assistant. I have prepared a zero-export automation in HA, but it does not run (it is disabled) because the inverters are currently offline (in the process of being installed). I don't set or request anything over MQTT or the web API...

If I call /api/livedata/status from the browser I get a null response.

In MQTT Explorer I see updates in solar/dtu status and in the inverters' reachable/producing/last_update. But under the HA binary_sensors I don't see that an update is sent... Maybe these updates are only sent on change, or only if the inverters are online?

Maybe one more thing: I updated the FW a few days ago, maybe on the same day as the release or the day after. But it looks like the DTU reset one day ago: image

tbnobody commented 1 year ago

in my case node red queries /api/livedata/status every second.

If you get null then you are out of memory. You can either call the /api/livedata/status endpoint OR use the WebSocket. Both at the same time is not possible. Collecting this data and preparing the whole response requires ~40 kB of contiguous memory, which is not always available. (The ESP has 300 kB of memory.) The web API was never intended to be called every second. You will not even get new data from the inverter every second.
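
To illustrate why a request can fail even though the System Info page shows plenty of free heap: what matters for one big buffer is the largest contiguous free block, not the total. The following is a minimal desktop sketch, not OpenDTU code; the `HeapSnapshot` struct, the `canAllocate` helper and the block sizes are made up for illustration. On the ESP32 Arduino core the two corresponding figures are reported by `ESP.getFreeHeap()` and `ESP.getMaxAllocHeap()`.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical model of a fragmented heap: each entry is one free block in bytes.
struct HeapSnapshot {
    std::vector<std::size_t> freeBlocks;

    // Sum of all free blocks (what a "free heap" figure reports).
    std::size_t totalFree() const {
        return std::accumulate(freeBlocks.begin(), freeBlocks.end(), std::size_t{0});
    }

    // Largest single free block (the upper bound for one allocation).
    std::size_t largestBlock() const {
        return freeBlocks.empty()
                   ? 0
                   : *std::max_element(freeBlocks.begin(), freeBlocks.end());
    }
};

// A single buffer of `needed` bytes only fits if one block is big enough.
inline bool canAllocate(const HeapSnapshot& heap, std::size_t needed) {
    return heap.largestBlock() >= needed;
}
```

With four free blocks of 30000 bytes each, `totalFree()` reports 120000 bytes, yet `canAllocate(heap, 40000)` is false. That would match the symptom of a `null` reply despite a healthy-looking heap figure.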

tbnobody commented 1 year ago

If I call /api/livedata/status from browser I get null response.

What do you see in the serial console if this happens?

But under the HA binary_sensors I don't see that an update is sent... Maybe these updates are only sent on change, or only if the inverters are online?

I don't know what your HA is doing, but it uses the same topics under solar/#. So if these topics are updated, everything is fine and the problem most likely lies on the HA side.

DejanBukovec commented 1 year ago

My debugger stopped working (I pressed some button by accident)...

Now I got it working... I changed the MQTT setting "Publish Interval" from 5 seconds to 15 seconds, and after saving, the Live View started working. Now I have changed it back to 5 seconds and it still works... I will leave the debugger connected and see whether it disappears again and what shows up in the debug log...

tbnobody commented 1 year ago

I will add an additional exception handler in the next release. The current one only shows messages if there is too little memory, but maybe it is something different. (Will push this later.)

AloisKlingler commented 1 year ago

In Settings -> DTU Settings I had the "Poll Interval" set to 3 seconds. Since I changed this to 10 seconds, it has been running much more stably. (Just removing the once-per-second polling of the live statistics did not help.)

AloisKlingler commented 1 year ago

@tbnobody just a question - in system info I can see "Heap" normally at around 150kB and very seldom at around 100kB

What consumes most of the RAM? The communication with the inverter and processing its data, or the JSON generation and serving it over HTTP? If the latter, would it be an option to have a shorter endpoint, e.g. "/api/livedata/totals", which only offers the current totals (overall, day, power) of all inverters?

Thanks. :-)

//btw: the reason for "each second" was that I was lazy and just put the OpenDTU into the same Node-RED flow as my Fronius inverters. I collect data from all of them every second (and process it into current production, current consumption, current grid usage, etc.) for visualization.

tbnobody commented 1 year ago

@tbnobody just a question - in system info I can see "Heap" normally at around 150kB and very seldom at around 100kB

That doesn't make any sense. How many inverters have you connected? My default heap usage is at around 126kB with 3 inverters.

The generation of the JSON response takes ~40 kB.

AloisKlingler commented 1 year ago

One inverter, an HMT-2250-6T. I no longer remember the free memory with your OpenDTU. The ~150k of memory belongs to the version with Modbus: https://github.com/AloisKlingler/OpenDTU-FroniusSM-MB The Modbus registers and the library of course also need some memory. I do not know how much.

DejanBukovec commented 1 year ago

Mine, with MQTT + 3 inverters (offline), shows 151 kB used and 130 kB free.

delacor commented 1 year ago

I also have an issue under /settings/device: it spins forever, and the log shows this:

```
Uncaught (in promise) TypeError: this.pinMappingList.sort is not a function
    getPinMappingList http://10.4.20.187/js/app.js:54
    getPinMappingList http://10.4.20.187/js/app.js:54
    created http://10.4.20.187/js/app.js:54
    Tt http://10.4.20.187/js/app.js:2
    xt http://10.4.20.187/js/app.js:2
    Kn http://10.4.20.187/js/app.js:2
    Gn http://10.4.20.187/js/app.js:2
    go http://10.4.20.187/js/app.js:2
    i http://10.4.20.187/js/app.js:2
    H http://10.4.20.187/js/app.js:2
    H http://10.4.20.187/js/app.js:2
    R http://10.4.20.187/js/app.js:2
    b http://10.4.20.187/js/app.js:2
    s http://10.4.20.187/js/app.js:2
    run http://10.4.20.187/js/app.js:2
    update http://10.4.20.187/js/app.js:2
    Tt http://10.4.20.187/js/app.js:2
    Wt http://10.4.20.187/js/app.js:2
    promise callback*Ft http://10.4.20.187/js/app.js:2
    Ht http://10.4.20.187/js/app.js:2
    effect http://10.4.20.187/js/app.js:2
    we http://10.4.20.187/js/app.js:2
    ye http://10.4.20.187/js/app.js:2
    bt http://10.4.20.187/js/app.js:2
    effect http://10.4.20.187/js/app.js:2
    we http://10.4.20.187/js/app.js:2
    ye http://10.4.20.187/js/app.js:2
    bt http://10.4.20.187/js/app.js:2
    effect http://10.4.20.187/js/app.js:2
    we http://10.4.20.187/js/app.js:2
    ye http://10.4.20.187/js/app.js:2
    bt http://10.4.20.187/js/app.js:2
    effect http://10.4.20.187/js/app.js:2
    we http://10.4.20.187/js/app.js:2
    ye http://10.4.20.187/js/app.js:2
    bt http://10.4.20.187/js/app.js:2
    effect http://10.4.20.187/js/app.js:2
    we http://10.4.20.187/js/app.js:2
    ye http://10.4.20.187/js/app.js:2
    bt http://10.4.20.187/js/app.js:2
    set value http://10.4.20.187/js/app.js:2
    S http://10.4.20.187/js/app.js:65
    b http://10.4.20.187/js/app.js:65
    promise callback*b http://10.4.20.187/js/app.js:65
    g http://10.4.20.187/js/app.js:65
    navigate http://10.4.20.187/js/app.js:65
    n http://10.4.20.187/js/app.js:2
    Tt http://10.4.20.187/js/app.js:2
app.js:54:17720
    getPinMappingList http://10.4.20.187/js/app.js:54
    thenFinally self-hosted:2344
    (Async: promise callback)
    finally self-hosted:2332
    getPinMappingList http://10.4.20.187/js/app.js:54
    created http://10.4.20.187/js/app.js:54
    Tt http://10.4.20.187/js/app.js:2
    xt http://10.4.20.187/js/app.js:2
    Kn http://10.4.20.187/js/app.js:2
    Gn http://10.4.20.187/js/app.js:2
    go http://10.4.20.187/js/app.js:2
    i http://10.4.20.187/js/app.js:2
    H http://10.4.20.187/js/app.js:2
    H http://10.4.20.187/js/app.js:2
    R http://10.4.20.187/js/app.js:2
    b http://10.4.20.187/js/app.js:2
    s http://10.4.20.187/js/app.js:2
    run http://10.4.20.187/js/app.js:2
    update http://10.4.20.187/js/app.js:2
    Tt http://10.4.20.187/js/app.js:2
    Wt http://10.4.20.187/js/app.js:2
    (Async: promise callback)
    Ft http://10.4.20.187/js/app.js:2
    Ht http://10.4.20.187/js/app.js:2
    effect http://10.4.20.187/js/app.js:2
    we http://10.4.20.187/js/app.js:2
    ye http://10.4.20.187/js/app.js:2
    bt http://10.4.20.187/js/app.js:2
    effect http://10.4.20.187/js/app.js:2
    we http://10.4.20.187/js/app.js:2
    ye http://10.4.20.187/js/app.js:2
    bt http://10.4.20.187/js/app.js:2
    effect http://10.4.20.187/js/app.js:2
    we http://10.4.20.187/js/app.js:2
    ye http://10.4.20.187/js/app.js:2
    bt http://10.4.20.187/js/app.js:2
    effect http://10.4.20.187/js/app.js:2
    we http://10.4.20.187/js/app.js:2
    ye http://10.4.20.187/js/app.js:2
    bt http://10.4.20.187/js/app.js:2
    effect http://10.4.20.187/js/app.js:2
    we http://10.4.20.187/js/app.js:2
    ye http://10.4.20.187/js/app.js:2
    bt http://10.4.20.187/js/app.js:2
    set value http://10.4.20.187/js/app.js:2
    S http://10.4.20.187/js/app.js:65
    b http://10.4.20.187/js/app.js:65
    (Async: promise callback)
    b http://10.4.20.187/js/app.js:65
    g http://10.4.20.187/js/app.js:65
    navigate http://10.4.20.187/js/app.js:65
    n http://10.4.20.187/js/app.js:2
    Tt http://10.4.20.187/js/app.js:2
```

tbnobody commented 1 year ago

@delacor which json file did you upload? can you show a screenshot of the json file?

delacor commented 1 year ago

@delacor which json file did you upload? can you show a screenshot of the json file?

OK, thanks for the hint. There were still things left over from openDTUOnBattery, although I did a factory reset under the new OpenDTU. (Is there a better method to clear the device?) It's loading fine now.

tbnobody commented 1 year ago

@delacor Factory reset does not delete the pin_mapping.json, because the pin mapping depends on your hardware, which stays the same in that case. The factory reset just deletes the config.json.

The best way when flashing a new type of firmware is a complete ESP erase. (Can be chosen in almost any flashing tools)

avandorp commented 10 months ago

I have the empty Live Data view when enabling MQTT. Doesn't seem to be a memory issue (according to the System view). JS console is showing "Uncaught TypeError: can't access property "inverters", this.liveData is null". Some sort of race condition? Setting the MQTT frequency to 60s (from 5s) didn't help. MQTT data flows as intended. Newest firmware.

tbnobody commented 10 months ago

Newest firmware.

Please try a power cycle after the upgrade. (Not just reboot)

avandorp commented 10 months ago

Thanks. Unfortunately no luck. Both Firefox and Chrome, private browsing or not show the same console output and an empty Live View. MQTT and DTU polling frequency both set to 60s didn't help either.

tbnobody commented 10 months ago

Did you reload the web ui properly after the firmware upgrade (CTRL+F5 or clearing the cache?)

avandorp commented 10 months ago

Yes. Multiple times and with new browsers/devices. Now it seems to have done a sort of factory reset. Back to the base settings, local access point, no inverters, nothing configured. 🤷 Oh well. Will report back once I've configured everything again (must use that backup tool...).

avandorp commented 10 months ago

Ok. Everything is set up again. As soon as I enable MQTT (uses TLS, might be an additional strain on the ESP32 or additional cause for any kind of race conditions), the Live Data isn't shown anymore. Disabling MQTT makes the Live Data reappear. (Edit: MQTT Home Assistant auto discovery is enabled as well).

avandorp commented 10 months ago

Interestingly /api/prometheus/metrics shows "too many requests" and /api/livedata/status actually returns a "null", meaning an actual string with four characters/bytes. The response headers are:

HTTP/1.1 200 OK
Content-Length: 4
Content-Type: application/json
Connection: close
Accept-Ranges: none
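
For anyone polling the API from a script, it may help to treat a `200 OK` with the literal body `null` as a failed request and slow down, rather than retrying at the same rate. Below is a rough client-side sketch under that assumption; the `PollBackoff` class, its starting interval and its cap are arbitrary choices for illustration, not anything OpenDTU prescribes.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// True when the body carries no live data: empty, or the 4-byte literal "null".
inline bool isEmptyLiveData(const std::string& body) {
    return body.empty() || body == "null";
}

// Hypothetical poll scheduler: doubles the wait after each empty reply
// (up to a cap) and returns to the base interval after a good one.
class PollBackoff {
public:
    PollBackoff(std::uint32_t baseMs, std::uint32_t maxMs)
        : _baseMs(baseMs), _maxMs(maxMs), _currentMs(baseMs) {}

    // Feed in the latest response body; returns the delay before the next poll.
    std::uint32_t next(const std::string& body) {
        if (isEmptyLiveData(body)) {
            // Compute in 64 bit so the doubling cannot overflow.
            const std::uint64_t doubled = static_cast<std::uint64_t>(_currentMs) * 2;
            _currentMs = static_cast<std::uint32_t>(
                std::min<std::uint64_t>(doubled, _maxMs));
        } else {
            _currentMs = _baseMs;
        }
        return _currentMs;
    }

private:
    std::uint32_t _baseMs;
    std::uint32_t _maxMs;
    std::uint32_t _currentMs;
};
```

Starting from a 10 second base with a 60 second cap, three `null` replies in a row give delays of 20, 40 and 60 seconds; the first proper JSON body resets the delay to 10 seconds.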

avandorp commented 10 months ago

Thanks for your help. openDTU has now been sending MQTT messages reliably for 50 hours; livedata/status still yields the string "null". /api/eventlog/status?inv=XXX works (even though getting the events by MQTT would be preferable), therefore I have all the data I need. Still a weird bug. 🤷

avandorp commented 9 months ago

Version 23.9.18 solved this problem for me. Thank you!

helgeerbe commented 6 months ago

@tbnobody this might be a hint.

On openDTU-onBattery we extended the datasets which are handled by DynamicJsonDocument, but we don't test via DynamicJsonDocument::capacity() whether the document space was really allocated.

Here is an example where I put some logging in void WebApiWsLiveClass::loop()

```cpp
void WebApiWsLiveClass::loop()
{
    // see: https://github.com/me-no-dev/ESPAsyncWebServer#limiting-the-number-of-web-socket-clients
    if (millis() - _lastWsCleanup > 1000) {
        _ws.cleanupClients();
        _lastWsCleanup = millis();
    }

    // do nothing if no WS client is connected
    if (_ws.count() == 0) {
        return;
    }

    if (millis() - _lastInvUpdateCheck < 1000) {
        return;
    }
    _lastInvUpdateCheck = millis();

    uint32_t maxTimeStamp = 0;
    for (uint8_t i = 0; i < Hoymiles.getNumInverters(); i++) {
        auto inv = Hoymiles.getInverterByPos(i);

        if (inv->Statistics()->getLastUpdate() > maxTimeStamp) {
            maxTimeStamp = inv->Statistics()->getLastUpdate();
        }
    }

    // Update on every inverter change or at least after 10 seconds
    if (millis() - _lastWsPublish > (10 * 1000) || (maxTimeStamp != _newestInverterTimestamp)) {

        try {
            std::lock_guard<std::mutex> lock(_mutex);
            DynamicJsonDocument root(4200 * INV_MAX_COUNT); // TODO(helge) check if this calculation is correct
            JsonVariant var = root;

            MessageOutput.printf("Calling /api/livedata/status jsonDocument capacity = %d.\r\n", root.capacity());
            MessageOutput.printf("Calling /api/livedata/status FreeHeap = %d, MaxAllocHeap = %d.\r\n", ESP.getFreeHeap(), ESP.getMaxAllocHeap());
            generateJsonResponse(var);
            MessageOutput.printf("Calling /api/livedata/status JsonDocumentSize = %d (allocated %d).\r\n", root.memoryUsage(), 4200 * INV_MAX_COUNT);
            // ...
```

When `DynamicJsonDocument::capacity()` returns `0`, the API call returns `null`. Interesting side effect: when `capacity()` does not return `0`, MQTT does not publish very often. Extending the MQTT publish interval to 15 seconds seems to help a little bit.

Currently I am doing further testing. But I guess I have to lower the maximum number of supported inverters from 5 to 3 for openDTU-onBattery. And we should check after each creation of a DynamicJsonDocument whether it was successful.
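
The check described above could be sketched as follows. This is a desktop stand-in, not ArduinoJson and not the OpenDTU code: `JsonBuffer`, `failingAlloc` and `documentIsUsable` are hypothetical names, and the only behaviour borrowed from the discussion is that a document whose buffer could not be allocated reports a capacity of `0`.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

using AllocFn = void* (*)(std::size_t);

inline void* defaultAlloc(std::size_t n) { return std::malloc(n); }

// Allocator that always fails, to simulate an exhausted or fragmented heap.
inline void* failingAlloc(std::size_t) { return nullptr; }

// Hypothetical stand-in for a heap-backed JSON document: it reports
// capacity 0 when its buffer could not be allocated.
class JsonBuffer {
public:
    explicit JsonBuffer(std::size_t requested, AllocFn alloc = defaultAlloc)
        : _data(alloc(requested)), _capacity(_data != nullptr ? requested : 0) {}
    ~JsonBuffer() { std::free(_data); }
    JsonBuffer(const JsonBuffer&) = delete;
    JsonBuffer& operator=(const JsonBuffer&) = delete;

    std::size_t capacity() const { return _capacity; }

private:
    void* _data;
    std::size_t _capacity;
};

// Returns false (and logs) so the caller can skip serialization instead of
// sending "null" from a document that was never allocated.
inline bool documentIsUsable(const JsonBuffer& doc, std::size_t requested) {
    if (doc.capacity() == 0) {
        std::printf("JSON document allocation failed (requested %zu bytes)\n", requested);
        return false;
    }
    return true;
}
```

A caller would construct the document, call `documentIsUsable(doc, size)` right away, and return an explicit error (or simply skip this publish cycle) when it yields false.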

tbnobody commented 5 months ago

This should be fixed in version v24.2.12, as the live data entry point consumes a lot less memory.

ottelo9 commented 3 months ago

Yesterday I had the exact same problem. I have openDTU-onBattery 2024.03.07. The ESP has a poor RSSI of -87. OpenDTU sends all information over MQTT to HA, but the web page is simply empty (white), without content. I did a power cycle, but without success. The server is reachable, but the page is empty. After a second power cycle and slightly better positioning of the OpenDTU device I got a working web page back.

stefan123t commented 4 days ago

What consumes most of the RAM? The communication with the inverter and processing its data, or the JSON generation and serving it over HTTP? If the latter, would it be an option to have a shorter endpoint, e.g. "/api/livedata/totals", which only offers the current totals (overall, day, power) of all inverters?

@AloisKlingler & @helgeerbe as JSON documents or parts of them can consume quite some memory before they are sent out: would it be possible to break a larger JSON document template into multiple smaller chunks, e.g. with /api/livedata/status being sent as follows? This could even make use of other API endpoints (or at least their methods) to produce the necessary parts of the whole template.

```
{
  "inverters": [
    <inv(0)>,
    <inv(1)>,
    ...
    <inv(N)>
  ],
  "total": <total()>,
  "hints": <hints()>,
  "vedirect": <vedirect()>,
  "huawei": <huawei()>,
  "battery": <battery()>,
  "power_meter": <power_meter()>
}
```

```json
{
  "inverters": [
    {
      "serial": "114173212345",
      "name": "HM-600",
      "order": 0,
      "data_age": 3,
      "poll_enabled": true,
      "reachable": true,
      "producing": true,
      "limit_relative": 100,
      "limit_absolute": 600
    },
    ...
  ],
  "total": {
    "Power": { "v": 23.60000038, "u": "W", "d": 1 },
    "YieldDay": { "v": 1098, "u": "Wh", "d": 0 },
    "YieldTotal": { "v": 1612.475098, "u": "kWh", "d": 3 }
  },
  "hints": { "time_sync": false, "radio_problem": false, "default_password": true },
  "vedirect": { "enabled": false },
  "huawei": { "enabled": false },
  "battery": { "enabled": false },
  "power_meter": {
    "enabled": true,
    "Power": { "v": 121.526001, "u": "W", "d": 1 }
  }
}
```
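
The chunking idea above could look roughly like this. It is only a sketch under assumptions: `InverterInfo`, `ChunkSink`, `inverterToJson` and `streamLiveData` are hypothetical names, the inverter record is reduced to two fields, strings are not escaped (serial numbers are assumed to be plain digits), and how the chunks reach the client (for example via a chunked HTTP response) is left open.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, heavily reduced inverter record (real replies carry more fields).
struct InverterInfo {
    std::string serial;
    bool reachable;
};

// Callback that hands one finished chunk to the transport.
using ChunkSink = std::function<void(const std::string&)>;

// Serializes one inverter object; only this small string is alive at a time.
inline std::string inverterToJson(const InverterInfo& inv) {
    return "{\"serial\":\"" + inv.serial + "\",\"reachable\":" +
           (inv.reachable ? "true" : "false") + "}";
}

// Emits the response piece by piece instead of building it in one buffer.
// `totalJson` is an already serialized object for the "total" key.
inline void streamLiveData(const std::vector<InverterInfo>& inverters,
                           const std::string& totalJson,
                           const ChunkSink& send) {
    send("{\"inverters\":[");
    for (std::size_t i = 0; i < inverters.size(); ++i) {
        if (i > 0) {
            send(",");  // separator between array elements
        }
        send(inverterToJson(inverters[i]));
    }
    send("],\"total\":");
    send(totalJson);
    send("}");
}
```

The peak buffer size then depends on the largest single chunk (one inverter object) rather than on the number of inverters, at the cost of more, smaller writes to the client.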