fredlcore / BSB-LAN

LAN/WiFi interface for Boiler-System-Bus (BSB) and Local Process Bus (LPB) and Punkt-zu-Punkt Schnittstelle (PPS) with a Siemens® controller used by Elco®, Brötje® and similar heating systems
220 stars 83 forks source link

Disable ESP32 CpuFrequency reduction for about 20 % faster /D #556

Closed DE-cr closed 1 year ago

DE-cr commented 1 year ago

I've done the following timing test, comparing CPU frequency standard (240 MHz) to reduced to 80 MHz on my system (esp32, no SD card, wifi connection):

Read 1 MB from file and transfer via http, using 4 KB buffers.

The results [s] are as follows:

      80 MHz   240 MHz   gain       
----+--------+---------+-----
 #1     4,16      3,69   11 %
 #2     4,79      5,20   -9 %
 #3     4,62      4,60    0 %
 #4     4,14      3,66   12 %
 #5     4,81      2,81   42 %
 #6     3,96      2,99   24 %
 #7     3,62      2,90   20 %
 #8     3,94      2,62   34 %
 #9     4,41      3,04   31 %
#10     3,55      2,99   16 %
----+--------+---------+-----
min     3,55      2,62   26 %
max     4,81      5,20   -8 %
avg     4,20      3,45   18 %
DE-cr commented 1 year ago

With more testing, this PR's impact looks even bigger: with a datalog.txt of 260 KB, a typical time for /D on my system is 0.75 s at full CPU frequency, compared to 1.5 s (twice as long!) at 80 MHz. (Transmission times do vary in both configurations, but I haven't been able to get considerably lower values than the ones given here.)

fredlcore commented 1 year ago

Cool find, but rather than disabling the reduction completely, I'd suggest we increase CPU speed during log file output and then return to 80 MHz for everyday tasks (which are probably more than enough).

DE-cr commented 1 year ago

I did try that once, but switching CPU speed back and forth seems to be problematic.

fredlcore commented 1 year ago

In which way? Because since the ESP32 starts by default with 240 MHz, what we are doing currently is also at least "switching back". Doing that on the (rather not so frequent) occasions when downloading data would not seem to be that much different?

DE-cr commented 1 year ago

That's what I thought. I don't remember how the system behaved when I tried to go to 240 MHz before /D and back to 80 MHz afterwards - only that it made me abandon that approach. (After all, isn't it about 0.2 W in savings only?)

I'll look into alternatives, though.

DE-cr commented 1 year ago

Also: are you sure that only /D... would benefit from higher CPU frequencies?

DE-cr commented 1 year ago

If reducing power consumption is a priority for you:

"The only value admitted (check the table) is 240, 160, and 80, but I tried with lesser value, and It seems. It works correctly (check the video with 40Mhz, 20Mhz, and 10MHz)."

https://www.mischianti.org/2021/03/06/esp32-practical-power-saving-manage-wifi-and-cpu-1/#CPU_management

fredlcore commented 1 year ago

0.2W less would still save 17 kWh per year, so about 7 Euros. Not that much, but also not nothing if there are no other gains. The rest of the time, BSB-LAN is mainly waiting. Either for some time interval to pass or for response from the bus. So apart from building up the category page even faster than it already is, there is nothing I can see that would benefit from running at full speed all the time.

fredlcore commented 1 year ago

I think I found this page you mention before at some point, but below 80 MHz, the WiFi became unreliable.

DE-cr commented 1 year ago

I guess I've found out why I had abandoned switching to 240 MHz for /D... only:

$ i=1; while [ $i -le 10 ];do echo -n "#$i: ";/usr/bin/time -f%E wget -qO/dev/null bsb-lan/D;let i=i+1;done
#1: 0:59.83
#2: 0:52.52
#3: 0:14.51
#4: 0:07.69
#5: 0:29.67
#6: 0:12.16
#7: 0:49.54
#8: 0:13.39
#9: 0:06.38
#10: 0:16.40

Wildly differing /D times, and in all cases worse than even with 80 MHz just set once on startup, sometimes very much so.

What's worse: after this test, I couldn't get OTA updates to work anymore.

I almost forgot: The timing test above was done with a datalog of <260 KB, which would have taken about 0.75 s for each try with just leaving the esp32 at 240 MHz right from startup.

fredlcore commented 1 year ago

Between 6 and over 50 seconds for requesting /D? That's so bizarre that I'll have to do a test here as well sometime. i could understand little variations or delays in case the internal system of the ESP32 needs to do something like starting services or so, but here it should just be changing the clock divisor. It's not like waking up from deep sleep or something like that...

DE-cr commented 1 year ago

To reproduce what I did:

            } else { //--- datalog
+             setCpuFrequencyMhz(240);
              dataFile = SD.open(datalogFileName);
              } else printToWebClient(PSTR(MENU_TEXT_DTO "\r\n"));
+             setCpuFrequencyMhz(80);
            }//datalog

(at about line 6350 / 6428 in BSB_LAN.ino, with https://github.com/fredlcore/BSB-LAN/pull/542 aleady in place)

DE-cr commented 1 year ago

"When one changes MCU clock frequency it alters all frequency dependent buses. For instance UART and I2C." (https://stackoverflow.com/questions/64600484/esp32-changing-clock-frequency-c)

Would that be okay in bsb-lan?

DE-cr commented 1 year ago

Anyway, just leaving out the setCpuFrequencyMhz(80); in setup() would be straightforward, and does work. Maybe make that a compile time choice for the user in BSB_LAN_config.h? I'll modify my PR accordingly.

DE-cr commented 1 year ago

0.2W less would still save 17 kWh per year, so about 7 Euros.

7 € looked a bit much to me, so I did the calculation myself: 0.2 W * 24 h * 365 == 1752 Wh == 1.7 kWh Personally, I'd gladly pay those extra 0.70 € per year. ;)

fredlcore commented 1 year ago

I cannot confirm that changing the clock speed on the fly is creating problems, and transfer times are approximately doubled (around 600, sometimes 800kB/s or even 1000kB/s) with a currently 80kB large log file:

frederik@MBP-FH ~ % i=1; while [ $i -le 10 ];do echo -n "#$i: ";/usr/bin/time -h wget -qO/dev/null 'http://bsb-lan/D';let i=i+1;done
#1:     0,18s real      0,00s user      0,00s sys
#2:     0,16s real      0,00s user      0,00s sys
#3:     0,16s real      0,00s user      0,00s sys
#4:     0,16s real      0,00s user      0,00s sys
#5:     0,16s real      0,00s user      0,00s sys
#6:     0,15s real      0,00s user      0,00s sys
#7:     0,16s real      0,00s user      0,00s sys
#8:     0,18s real      0,00s user      0,00s sys
#9:     0,16s real      0,00s user      0,00s sys
#10:    0,17s real      0,00s user      0,00s sys

The only (minimal) impact is that there is a little bit of garbage on the serial monitor when the clock speed changes. But as you can see with the broadcast telegram in between, bus traffic is not affected:

GET /4444/D HTTP/1.1                                                            
GET /4444/D HTTP/1.1                                                            
Df?\XW??K²j                                                                     
            GET /4444/D HTTP/1.1                                                
GET /4444/D HTTP/1.1                                                            
Df?\XW??K??j                                                                    
             GET /4444/D HTTP/1.1                                               
GET /4444/D HTTP/1.1                                                            
Df?\XW??K??j                                                                    
             GET /4444/D HTTP/1.1                                               
GET /4444/D HTTP/1.1                                                            
Df?\XW??KŠj                                                                     
            GET /4444/D HTTP/1.1                                                
GET /4444/D HTTP/1.1                                                            
Df?\XW??K??j                                                                    
             GET /4444/D HTTP/1.1                                               
GET /4444/D HTTP/1.1                                                            
Df?\XW??K??j                                                                    
             GET /4444/D HTTP/1.1                                               
GET /4444/D HTTP/1.1                                                            
D3?\XW??K‚j                                                                     
            HEIZ->ALL  INF 10368.0 Benutzerdefiniert - Unknown command: 36.6 °C 
78 11 FF 00 CC 02 00 14 02 15 00 02 2D 00 09 25 F2 ED                           
GET /4444/D HTTP/1.1                                                            
GET /4444/D HTTP/1.1                                                            
Df?\XW??K‚j                                                                     
            GET /4444/D HTTP/1.1                                                
GET /4444/D HTTP/1.1                                                            
Df?\XW??K??j                                                                    
             GET /4444/D HTTP/1.1                                               
GET /4444/D HTTP/1.1                                                            
Df?\XW??Kšj

I have added the following lines:

6268:
        if (p[1]=='D') { // access datalog file
          setCpuFrequencyMhz(240);
          if (p[2]=='0' || ((p[2]=='D' || p[2]=='J') && p[3]=='0')) {  // remove datalog file

and

6438:
          flushToWebClient();
          setCpuFrequencyMhz(80);
          break;

So all /D-related actions will benefit from the speed increase.

DE-cr commented 1 year ago

Have you been using WiFi?

fredlcore commented 1 year ago

No, I have been using LAN, but you are right, with WIFI it's really unreliable. I guess it's similar to the serial interface, only that WIFI is much more fragile. The garbage output on serial is actually the "Duration" debug message which one could do without and the garbage takes only a few characters (probably some kind of buffer which needs to be fully flushed). But with WIFI it's probably not so easy, even a delay(5000) didn't help much.

Disabling the 240 MHz just now made me realize that the transfer speeds are still in the 600kB/s range. So I guess the increase in speed is just due to the 4kB buffer and not due to the increase in clock speed? At least now with 80 MHz and a 135kB sized log file, I get these results:

#1:     0,33s real      0,00s user      0,00s sys
#2:     0,34s real      0,00s user      0,00s sys
#3:     0,31s real      0,00s user      0,00s sys
#4:     0,32s real      0,00s user      0,00s sys
#5:     0,29s real      0,00s user      0,00s sys
#6:     0,30s real      0,00s user      0,00s sys
#7:     0,31s real      0,00s user      0,00s sys
#8:     0,32s real      0,00s user      0,00s sys
#9:     0,30s real      0,00s user      0,00s sys
#10:    0,31s real      0,00s user      0,00s sys

compared to 240 MHz:

#1:     0,29s real      0,00s user      0,00s sys
#2:     0,26s real      0,00s user      0,00s sys
#3:     0,26s real      0,00s user      0,00s sys
#4:     0,24s real      0,00s user      0,00s sys
#5:     0,26s real      0,00s user      0,00s sys
#6:     0,25s real      0,00s user      0,00s sys
#7:     0,29s real      0,00s user      0,00s sys
#8:     0,27s real      0,00s user      0,00s sys
#9:     0,31s real      0,00s user      0,00s sys
#10:    0,26s real      0,00s user      0,00s sys

So the impact is rather negligible if not random, at least with LAN and SD-card. For WIFI and SPIFFS it might be different, but due to the limited file size, it won't matter on a day-to-day basis if the download takes 2 or 3 seconds and waste energy for that. I'll test WIFI and SD-card here as well but will wait for it until the log file is large enough to actually take several seconds.

DE-cr commented 1 year ago

With my system, I still get >20% faster /D times.

Here's another set of test runs (times [s], as usual):

A. Transfer to my laptop, right on top of the boiler:

       80 MHz 240 MHz   gain
------+------+-------+------
#1      0.50    0.42    16 %
#2      0.47    0.45    4 %
#3      0.59    0.49    17 %
#4      0.46    0.46    0 %
#5      0.53    0.40    25 %
#6      0.56    0.42    25 %
#7      0.51    0.44    14 %
#8      0.95    0.35    63 %
#9      0.45    0.35    22 %
#10     0.68    0.36    47 %
------+------+-------+------
min     0.45    0.35    22 %
max     0.95    0.49    48 %
avg     0.57    0.41    27 %

B. Transfer to my desktop PC, about 10 m and a couple of walls away:

       80 MHz 240 MHz   gain
------+------+-------+------
#1      1.86    0.51    73 %
#2      1.62    0.33    80 %
#3      0.50    0.28    44 %
#4      1.36    0.32    76 %
#5      0.65    0.24    63 %
#6      0.62    0.30    52 %
#7      0.42    0.31    26 %
#8      0.44    0.34    23 %
#9      0.43    0.30    30 %
#10     1.14    0.31    73 %
------+------+-------+------
min     0.42    0.24    43 %
max     1.86    0.51    73 %
avg     0.90    0.32    64 %

While the numbers do fluctuate, for me the impact is still worth 70 cents per year.

As I've returned to normal log settings, with a sample interval of 600 s currently, the numbers above are for a really small datalog of about 73 KB / 3500 lines only.

fredlcore commented 1 year ago

With a 1.5MB file and using LAN, it saves one second per download. With WIFI it's also around one second although transfer times are much worse: 240 MHz:

#1:     6,58s real      0,00s user      0,05s sys
#2:     6,75s real      0,00s user      0,04s sys
#3:     5,22s real      0,00s user      0,04s sys
#4:     6,77s real      0,00s user      0,05s sys
#5:     5,23s real      0,00s user      0,05s sys
#6:     7,12s real      0,00s user      0,05s sys
#7:     5,74s real      0,00s user      0,05s sys
#8:     7,23s real      0,00s user      0,05s sys
#9:     5,36s real      0,00s user      0,05s sys
#10:    6,92s real      0,00s user      0,05s sys

80 MHz:

#1:     8,46s real      0,00s user      0,06s sys
#2:     8,13s real      0,00s user      0,05s sys
#3:     6,76s real      0,00s user      0,06s sys
#4:     8,12s real      0,00s user      0,06s sys
#5:     6,38s real      0,00s user      0,05s sys
#6:     7,92s real      0,00s user      0,05s sys
#7:     6,40s real      0,00s user      0,05s sys
#8:     25,54s real     0,00s user      0,05s sys
#9:     11,34s real     0,00s user      0,07s sys
#10:    9,58s real      0,00s user      0,06s sys

I'll check again with a 10 MB file at a later stage. LAN download times are around 2.5 seconds (80 MHz) and 1.5 seconds (240MHz) btw. The WIFI router is just 1 meter away from the ESP32 :(.

fredlcore commented 1 year ago

BTW: Checking with the serial monitor running in parallel, I can see that the spikes in transfer times are due to the request being delayed due to BSB-LAN doing some kind of maintenance task (such as logging) or serving another request.

DE-cr commented 1 year ago

The only "maintenance task" I can imagine must be:

#ifdef ESP32
      esp_task_wdt_reset();
#endif

...whatever is in there.

This is a pattern I've just copied from transmitFile() (which is what /D uses) to my code handling /Da,b and /DKn.

fredlcore commented 1 year ago

I specifically meant BSB-LAN doing maintenance work such as logging, not the ESP32 core. So what happens is that a /D request reaches BSB-LAN when it is about to do the interval logging (which then takes some time) and only then takes care of the /D request. That explains the spikes in my case.

fredlcore commented 1 year ago

I have now included this as a web-configurable option, however when using LAN during energy saving active, BSB-LAN will go to full-speed during log output. There seems to be no negative impact in all of my testings here...

DE-cr commented 1 year ago

This had me wondering how many devices would get through the #if ... but then I remembered that Olimex is also an ESP32:

#if defined(ESP32) && !defined(WIFI)
        if (esp32_save_energy == true) {
          setCpuFrequencyMhz(240);
        }
#endif
DE-cr commented 1 year ago

I would have changed the comment here, as I do consider changes in /D times significant: setCpuFrequencyMhz(80); // reduce speed from 240 MHz to 80 MHz to reduce power consumption by approx. 20% with no significant loss of speed :)

fredlcore commented 1 year ago

Noted for next release