fredlcore / BSB-LAN

LAN/WiFi interface for Boiler-System-Bus (BSB) and Local Process Bus (LPB) and Punkt-zu-Punkt Schnittstelle (PPS) with a Siemens® controller used by Elco®, Brötje® and similar heating systems

[BUG] HTTP authentification is broken #492

Closed DE-cr closed 1 year ago

DE-cr commented 1 year ago

Before submitting a bug report, please make sure that you have checked chapter 14 of the manual ("Problems and their Possible Causes"). Most problems that are reported to us can be solved by going through the solutions provided there.

=> Done: 14.4 is related, but I have avoided the use errors described there.

BSB-LAN Version As displayed in the web-configuration or copy from bsb-version.h file.

=> 2.1.8-20220731102301 (actually, Github master contents ca. 20221005210000)

Architecture The architecture BSB-LAN is running on (Arduino Mega, Arduino Due, ESP32 NodeMCU, ESP32 Olimex EVB etc.)

=> ESP32 NodeMCU

Bus system Which bus BSB-LAN is running on (BSB, LPB, PPS)?

=> BSB

Describe the bug A clear and concise description of what the bug is.

=>

  1. After setting up "HTTP authentification" via the web interface (/C), further HTTP access attempts to BSB-LAN's main web interface require and accept authentication, as expected.
  2. A subsequent attempt to access the OTA interface (port 8080) asks for authentication (as expected), but does not deliver the corresponding web page!
  3. The second step may even lead to the standard web interface (port 80) no longer accepting the correct credentials!! (This does not always happen, though.)
  4. A hardware reset of the BSB-LAN unit takes you back to the first step, including the fact that the "HTTP authentification" parameter on the /C page is now empty again.

Please note: The above steps may not be 100 % reproducible! However, when step 2 works as expected, restarting the browser and trying step 2 again is likely to fail.

I had already experienced rare problems as described in step 2 in the past (both on port 80 and, more likely, on port 8080), but those were few and far between, and never led to step 3.

Things I have changed before the (now persistent) problem occurred:

(Solution for me: avoid HTTP authentification until this issue is cleared)

Desktop (if applicable, please complete the following information):

todor-dk commented 1 year ago

I have had a similar experience, but didn't spend much time on it.

I updated my Olimex-ESP32-EVB board from 2.0 to 3.0.4 and kept the HTTP passkey and HTTP auth parameters. While still connected via USB, directly after the update, I could access the web interface. Unfortunately, I had chosen German as the language, so I only changed the option to display extended settings, unplugged the board from USB, and went to the cellar to connect it to the boiler and the power adapter. After that, there was no access via the web. I could see it had an IP address, it responded to ping, and I could even Telnet to it on port 80 and enter basic HTTP commands by hand, to which it responded with HTTP 401. Power-cycling didn't help.

I went to the cellar with the laptop, connected it via USB and could see the Arduino monitor messages. The web interface worked as well. I downloaded the parameters file that I had to send to Frederik (thanks for the quick response, Frederik!!!) and went to bed.

Today, I re-flashed the firmware with the parameter updates and triple checked authentication parameters. While still running on the USB power, it connects to the web interface. Then, connecting to the normal power adapter, the web interface stopped functioning. For fun, I powered it via USB with a Samsung phone charger and the web interface was working. Crazy!

I only tested twice, as we are going out shopping, but this smells. I know there were boot issues with some board combinations, but could this be so crazy that it affects the initialization and operation of some network module and, in the end, of the web server?

For info, I have an ESP32-EVB Rev.I and a BSB-LAN ESP32 V4.1 board (with a botch wire between R6 and one of the pins). I use the Wi-Fi interface; the Ethernet is not connected, nor are any other pins. Finally, for fun, I unplugged the BSB-LAN board and it still exhibits the same behavior :/

I will look at your ticket once more and see if there are some parallels to my headaches.

fredlcore commented 1 year ago

These seem to be two unrelated problems: @DE-cr is talking of non-functional HTTP authentication only after accessing the OTA page (correct?), whereas with you it seems to happen generally (correct?). As for the latter problem, we've had quite a few reports that the ESP32 boards seem to require a really good power supply. Even though that much current should never actually be needed, several 1A power supplies resulted in this or similar behaviour, where the network connection was somehow weird or faulty. Upgrading to a 2.5A power supply from Apple, for example, fixed this. Maybe it was also related to the cable in those cases; it's hard to tell, except for the fact that supplying the ESP32 with stable power seems to be a must.

todor-dk commented 1 year ago

I will need to investigate whether my issue is related to OTA as well. I just find it crazy that the web server authentication fails depending on the power supply. The web server still returns valid HTTP responses; it just fails to authenticate.

I've used the power supply sold by Olimex (https://www.olimex.com/Products/Power/SY0605E/), which is rated for 1.2 A on the device. So far, I had no issues with it on FW 2.0.x, but I don't think I had many reboots.

DE-cr commented 1 year ago

> @DE-cr is talking of non-functional HTTP authentication only after accessing the OTA page (correct?), whereas with you it seems to happen generally (correct?).

Correct ... iirc (it's been a long time since I've disabled http auth).

fredlcore commented 1 year ago

I tried some of the combinations you mentioned on my USB-powered Olimex and I was able to somewhat reproduce a/the problem mentioned here: I have HTTP auth enabled via my _config.h, but when I remove it, save config, and enter it again and save it and then access the OTA update page, I sometimes get this crash report:

Guru Meditation Error: Core  1 panic'ed (LoadProhibited). Exception was unhandl.

Core  1 register dump:                                                          
PC      : 0x4008ade1  PS      : 0x00060730  A0      : 0x800eca48  A1      : 0x3 
A2      : 0x00000000  A3      : 0xfffffffc  A4      : 0x000000ff  A5      : 0x0 
A6      : 0x00ff0000  A7      : 0xff000000  A8      : 0x00000000  A9      : 0x3 
A10     : 0x00000000  A11     : 0x3ffb239c  A12     : 0x3ffb23a8  A13     : 0x3 
A14     : 0x00ff0000  A15     : 0xff000000  SAR     : 0x00000015  EXCCAUSE: 0x0 
EXCVADDR: 0x00000000  LBEG    : 0x4008ade1  LEND    : 0x4008adf1  LCOUNT  : 0xf 

Backtrace: 0x4008adde:0x3ffb22b0 0x400eca45:0x3ffb22c0 0x400d3601:0x3ffb23d0 0x40 

ELF file SHA256: 0000000000000000   

It seems that every other reload of the OTA update page leads to the crash. The interesting thing here is that I added debug output of the HTTP auth credentials as BSB-LAN sees them. In those cases where the ESP32 crashes, it prints the string only up to (and excluding) the colon. So if the user/pass combination was user:pass, it would only print user. Sometimes the same happens when accessing the config page afterwards: then only user appears in the text field as well. If you didn't notice that and saved the configuration, this could easily lock you out, because no proper HTTP auth credentials would then be stored in the EEPROM.

The strange thing is that in my tests, when every other access to the OTA update page led to a crash, the other calls were fine and the correct HTTP auth credentials were displayed on the serial console. It is this line that causes the crash:

update_server.authenticate(strtok(USER_PASS,":"),strtok(NULL,":"));

The weird thing is that when I dump the output of the two strtok's to console, I get the correct values. So the crash must happen inside the authenticate function. I can also confirm that the crash does not occur when HTTP auth is empty.

But how and why this is related to the fact that sometimes the user:pass combination is cut down to just user really beats me. Maybe @dukess has an idea?

fredlcore commented 1 year ago

I can confirm now that with HTTP auth enabled, every call to the OTA page results in a crash. Disabling HTTP auth helps, as I said, but re-enabling it will again result in crashes, even after several reboots and/or firmware updates. I can't physically take the Olimex off power right now, but will try to do so as soon as possible.

fredlcore commented 1 year ago

Ok, it's confirmed: the crashes have to do with the fact that at some point the variable USER_PASS, which normally carries the string USERNAME:PASSWORD (with your values), suddenly only contains USERNAME. Everything after the colon is dropped, and therefore strtok(NULL,":") in update_server.authenticate(strtok(USER_PASS,":"),strtok(NULL,":")) fails and results in the crash. I now have to investigate at which point (and more importantly why) the change in this variable occurs. I don't think it has to do with the power supply, because the pattern of dropping everything after the colon is too obvious for that...
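
The mechanism described here can be reproduced outside BSB-LAN. The following is a minimal sketch in plain C, not the project's code: `user_pass` and `split_in_place` are hypothetical names standing in for USER_PASS and for the tokenizing done in the line quoted above, and the buffer is assumed to hold user:pass initially.

```c
#include <string.h>

/* Hypothetical stand-in for the global credential buffer (assumption). */
static char user_pass[32] = "user:pass";

/* Tokenizes buf in place, the same way the quoted line does.
   Returns 1 if both a user token and a password token were found, else 0. */
static int split_in_place(char *buf, const char **user, const char **pass) {
    *user = strtok(buf, ":");   /* overwrites the first ':' with '\0' */
    *pass = strtok(NULL, ":");  /* NULL when nothing follows the first token */
    return *user != NULL && *pass != NULL;
}
```

On the first call both tokens are found, but the buffer now reads as user because strtok replaced the colon with a string terminator. A second call on the same buffer finds no password token and yields NULL, which is the value that would then be handed on to an authenticate-style function.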

DE-cr commented 1 year ago

From strtok(3) on my system:

   Be cautious when using these functions.  If you do use them, note that:
   * These functions modify their first argument.

fredlcore commented 1 year ago

Exactly, that's what I just realized after I was able to reproduce the error every time: Start the ESP32, call /C via URL, then call :8080 for update page and then call /C again. USER_PASS will be modified as described above and thus result in a crash later on. I now make a temporary copy of USER_PASS in the OTA function and now the error no longer seems to occur. Please test if possible.
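
A non-destructive split along the lines of the temporary copy mentioned above might look like this. It is only a sketch under assumptions, not the change that was actually committed: `split_credentials`, `CRED_MAX` and the scratch buffer are hypothetical, and it uses strchr on the copy instead of strtok, so a string without a colon is rejected instead of producing a NULL password.

```c
#include <string.h>

#define CRED_MAX 64  /* hypothetical upper bound for "user:pass" incl. terminator */

/* Splits src ("user:pass") into user and pass without modifying src.
   The tokens point into scratch, which must stay alive while they are used.
   Returns 1 on success, 0 if src is too long or contains no ':'. */
static int split_credentials(const char *src, char *scratch, size_t scratch_len,
                             const char **user, const char **pass) {
    if (strlen(src) >= scratch_len) {
        return 0;                       /* copy would not fit */
    }
    strcpy(scratch, src);               /* work on a private copy */
    char *colon = strchr(scratch, ':');
    if (colon == NULL) {
        return 0;                       /* malformed: no separator */
    }
    *colon = '\0';                      /* terminate the user part in the copy only */
    *user = scratch;
    *pass = colon + 1;
    return 1;
}
```

Because only the copy is altered, the original string still contains the colon afterwards, and repeated calls keep returning both parts. The caller can check the return value before passing the tokens on, so a NULL pointer never reaches the authentication call.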

DE-cr commented 1 year ago

Yes, the problem does not re-occur on my system now, thank you! (But please see my comment on one of the code changes you've made.)

fredlcore commented 1 year ago

Had a look at it, but there is no concern in my opinion.

BTW, the reason why this problem never really occurred to me was that I only called the update page when I wanted to update BSB-LAN. Since it reboots after the update, the problem was gone then. Only opening the update page and then going back to the menu caused this error to occur. So we can finally close this one.