Set `build_cache_dir` in platformio to reuse compiled objects between updates (and devices)

LordMike commented 1 year ago

TL;DR: This issue is a journey I took to find out how to improve caching. Skip to this comment where I used build_cache_dir in platformio to improve build speeds.

Describe the problem you have/What new integration you would like

I've been managing a few identical devices (apart from their identity) for a little while now. Watching the builds, I see the same code being compiled again and again. This makes sense of course, because it's simpler to treat each device as a completely separate unit.

I imagine that in my and many other instances though, at least parts of these files will be identical. It is my impression that the device config is used to substitute values into the source (cpp) files, which is then compiled. This could be f.ex. the wifi configuration, which would (at minimum) contain an SSID to connect to. This SSID is the same though, on all my devices. So I'm thinking it should be possible to reuse the compiled output (the .o file). Note: I've since found that the SSID/Wifi config is actually in main.cpp, which just makes the savings greater for the wifi library, as it will be identical "always".

If I'm right in assuming the .cpp files are modified with the values they need (from the device config), it should be possible to do something like:

1. Substitute the values needed into the .cpp file.
2. Hash the .cpp file, f.ex. using md5 or something less intensive like xxhash.
3. Check if a file exists at /somedir/output/HASH.o.
4. If the file exists, copy it to where the output would normally go. We know this output is identical to what we would have produced, as the source input is identical.
5. If the file does not exist, compile the .cpp and copy the output to /somedir/output/HASH.o.
6. Proceed as normal: perform linking as normal, build a firmware as normal. This should work, as all compiled output will still be in the place it usually is.
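
The steps above can be sketched roughly as follows. This is only an illustration of the idea, not how esphome or platformio work: `cached_compile` and `compile_fn` are hypothetical names, and keying on the source text alone is the simplifying assumption made in this post (a real cache key would also have to cover included headers, compiler flags and the toolchain).

```python
import hashlib
import shutil
from pathlib import Path
from typing import Callable


def cached_compile(
    source: Path,
    output: Path,
    cache_dir: Path,
    compile_fn: Callable[[Path, Path], None],
) -> bool:
    """Produce `output` for `source`, reusing a cached object when possible.

    Returns True on a cache hit, False when `compile_fn` had to run.
    `compile_fn(source, output)` is a hypothetical stand-in for the compiler.
    """
    output.parent.mkdir(parents=True, exist_ok=True)

    # Assumption: the substituted source text alone determines the output.
    digest = hashlib.md5(source.read_bytes()).hexdigest()
    cached = cache_dir / f"{digest}.o"

    if cached.exists():
        # Cache hit: copy the stored object to where the build expects it.
        shutil.copyfile(cached, output)
        return True

    # Cache miss: compile as usual, then keep a copy for the next device.
    compile_fn(source, output)
    cache_dir.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(output, cached)
    return False
```

Linking would then run unchanged, since every object file still ends up in its usual location.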

The temporary/cache location can be cleared at will - if it's empty, the sources are compiled again and placed there. Likewise, if there are access timestamps, you can remove the least recently used files in order to keep a rolling set of used files.
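
A minimal sketch of that rolling cleanup, assuming the cache holds `.o` files and that the filesystem actually records access times (mounts using `noatime` do not). `prune_cache` and `max_files` are hypothetical names for illustration.

```python
from pathlib import Path
from typing import List


def prune_cache(cache_dir: Path, max_files: int) -> List[Path]:
    """Delete the least recently accessed objects until `max_files` remain."""
    # Oldest access time first.
    objects = sorted(cache_dir.glob("*.o"), key=lambda p: p.stat().st_atime)
    excess = max(len(objects) - max_files, 0)
    removed = objects[:excess]
    for path in removed:
        path.unlink()
    return removed
```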

There are some drawbacks that I can think of:

Caching the output based on the source input will work for all instances where the output is deterministic. If the output isn't, it doesn't "work" - this could be if the compiler changes.
- If the compiler changes, it's likely we get a new esphome addon served. So one way to handle this is to let the cache directory be an ephemeral location, like a path that isn't mapped to a persistent volume. This way the cache is cleared on each esphome addon update.

Please describe your use case for this integration and alternatives you've tried: N/A

Additional context

I browsed to my /data directory on my esphome addon and checked some of the compiled files. I just grabbed two from the list and found this. I'm in the middle of an update across all devices.

# find -name 'api_connection.cpp.o' -exec md5sum {} \; | sort
0a6a67845634bf12c0e6c1aea62d94d4  ./light-44-2/.pioenvs/light-44-2/src/esphome/components/api/api_connection.cpp.o
0a6a67845634bf12c0e6c1aea62d94d4  ./light-extra01/.pioenvs/light-extra01/src/esphome/components/api/api_connection.cpp.o
3e450948cd55c7283372197e388d7da5  ./athom-rgbct-light-dfb640/.pioenvs/athom-rgbct-light-dfb640/src/esphome/components/api/api_connection.cpp.o
3e450948cd55c7283372197e388d7da5  ./athom-rgbct-light-dfb727/.pioenvs/athom-rgbct-light-dfb727/src/esphome/components/api/api_connection.cpp.o
66eeb0ca149319f7690ea22dd02efe73  ./athom-relay-board-x2-4dab15/.pioenvs/athom-relay-board-x2-4dab15/src/esphome/components/api/api_connection.cpp.o
66eeb0ca149319f7690ea22dd02efe73  ./relay-40-4/.pioenvs/relay-40-4/src/esphome/components/api/api_connection.cpp.o
a01245756f2737b7213a88c97faccfe0  ./light-1-5/.pioenvs/light-1-5/src/esphome/components/api/api_connection.cpp.o
a01245756f2737b7213a88c97faccfe0  ./light-41-1/.pioenvs/light-41-1/src/esphome/components/api/api_connection.cpp.o
a01245756f2737b7213a88c97faccfe0  ./light-41-2/.pioenvs/light-41-2/src/esphome/components/api/api_connection.cpp.o
a01245756f2737b7213a88c97faccfe0  ./light-41-3/.pioenvs/light-41-3/src/esphome/components/api/api_connection.cpp.o
a01245756f2737b7213a88c97faccfe0  ./light-41-4/.pioenvs/light-41-4/src/esphome/components/api/api_connection.cpp.o
a01245756f2737b7213a88c97faccfe0  ./light-41-5/.pioenvs/light-41-5/src/esphome/components/api/api_connection.cpp.o
a01245756f2737b7213a88c97faccfe0  ./light-41-6/.pioenvs/light-41-6/src/esphome/components/api/api_connection.cpp.o
a01245756f2737b7213a88c97faccfe0  ./light-44-1/.pioenvs/light-44-1/src/esphome/components/api/api_connection.cpp.o
b46b335dc1eeaed4f0f5b61134da47a5  ./athom-front-1/.pioenvs/athom-front-1/src/esphome/components/api/api_connection.cpp.o
b46b335dc1eeaed4f0f5b61134da47a5  ./athom-front-3/.pioenvs/athom-front-3/src/esphome/components/api/api_connection.cpp.o
d9b8168af57fe60127898024c3d16ebd  ./shelly-1e5c3c-7228/.pioenvs/shelly-1e5c3c-7228/src/esphome/components/api/api_connection.cpp.o
e8ca504e2e9920ddc904110c637ea0c6  ./light-43-1/.pioenvs/light-43-1/src/esphome/components/api/api_connection.cpp.o
e8ca504e2e9920ddc904110c637ea0c6  ./light-43-2/.pioenvs/light-43-2/src/esphome/components/api/api_connection.cpp.o
e8ca504e2e9920ddc904110c637ea0c6  ./light-43-3/.pioenvs/light-43-3/src/esphome/components/api/api_connection.cpp.o

root@5c53de3b-esphome:/data# find -name 'proto.cpp.o' -exec md5sum {} \; | sort
aef2b21dd7852f77ffe9725946b561e2  ./athom-front-1/.pioenvs/athom-front-1/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./athom-front-3/.pioenvs/athom-front-3/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./athom-relay-board-x2-4dab15/.pioenvs/athom-relay-board-x2-4dab15/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./athom-rgbct-light-dfb640/.pioenvs/athom-rgbct-light-dfb640/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./athom-rgbct-light-dfb727/.pioenvs/athom-rgbct-light-dfb727/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./light-1-5/.pioenvs/light-1-5/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./light-41-1/.pioenvs/light-41-1/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./light-41-2/.pioenvs/light-41-2/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./light-41-3/.pioenvs/light-41-3/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./light-41-4/.pioenvs/light-41-4/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./light-41-5/.pioenvs/light-41-5/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./light-41-6/.pioenvs/light-41-6/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./light-43-1/.pioenvs/light-43-1/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./light-43-2/.pioenvs/light-43-2/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./light-43-3/.pioenvs/light-43-3/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./light-44-1/.pioenvs/light-44-1/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./light-44-2/.pioenvs/light-44-2/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./light-extra01/.pioenvs/light-extra01/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./relay-40-4/.pioenvs/relay-40-4/src/esphome/components/api/proto.cpp.o
aef2b21dd7852f77ffe9725946b561e2  ./shelly-1e5c3c-7228/.pioenvs/shelly-1e5c3c-7228/src/esphome/components/api/proto.cpp.o

So here we see the hashes of the compiled output of two files: api_connection.cpp and proto.cpp. In the first case, there are 5 distinct versions right now across 14 devices (plus a few I have renamed). For proto.cpp, there is just the one output.

In this case, if the output was cached, I could potentially save 13 compilations of the proto.cpp file.
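
The same comparison can be scripted. The sketch below groups object files by md5, like the `find ... md5sum` commands above, and counts how many compilations a shared cache could have avoided. The function names are made up for this example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def group_objects_by_hash(root: Path, filename: str) -> Dict[str, List[Path]]:
    """Map md5 digest -> all files named `filename` under `root` with that content."""
    groups = defaultdict(list)
    for path in sorted(root.rglob(filename)):
        digest = hashlib.md5(path.read_bytes()).hexdigest()
        groups[digest].append(path)
    return dict(groups)


def avoidable_compilations(groups: Dict[str, List[Path]]) -> int:
    """Every file beyond the first in a group is a compilation that could be skipped."""
    return sum(len(paths) - 1 for paths in groups.values())
```

For example, `avoidable_compilations(group_objects_by_hash(Path("/data"), "proto.cpp.o"))` would report the potential savings for proto.cpp.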

LordMike commented 1 year ago

I've found that the actual compilation is handled by platformio, which means this idea has to either:

- Use features in platformio to share the output. But as I understand it, each esphome device is a separate environment, and as such it isn't advisable to share the build outputs.
- Do the hashing and copying of known outputs before running platformio.

I've also realized that the reason there are different environments is due to the different target platforms (esp82xx and whatnot). I think this is still manageable, by making the cache location dependent on these settings. If we know the list of settings that could change the build output, we can hash those together and add it to our cache directory, something like this:

cache_dir = "/esphome_cache/"
cache_dir += hash(settings.esp8266) + "/"

This way, anything changed in the esp8266: yaml section will produce different cache directories.
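
A runnable version of that pseudocode might look like this. It is a sketch under assumptions: `cache_dir_for` is a hypothetical helper, the settings are assumed to be a plain JSON-serialisable dict, and `hashlib` is used because Python's built-in `hash()` on strings is randomised per process and so cannot name a persistent directory.

```python
import hashlib
import json
from pathlib import PurePosixPath


def cache_dir_for(settings: dict, base: str = "/esphome_cache") -> PurePosixPath:
    """Derive a stable cache directory from build-relevant settings."""
    # Sort keys so that the YAML key order does not change the hash.
    canonical = json.dumps(settings, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return PurePosixPath(base) / digest
```

Two devices with the same `esp8266:` section would then share a directory, while any change to that section yields a new one.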

I've found that there is a platformio config for each device, so perhaps the cache_dir can be set here, given the hash above.

LordMike commented 1 year ago

I've hacked the writer.py to set build_cache_dir = /esphome_cache/ in the platformio.ini. I found that it's generated every time I build, so I couldn't just edit the file. I'll try building two identical devices, and see if it works out.
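
For reference, `build_cache_dir` is an option of the `[platformio]` section in platformio.ini. The sketch below adds it with Python's `configparser`. It is not the change made in writer.py; `set_build_cache_dir` is a hypothetical helper, it would have to run after each regeneration of the file, and rewriting through `configparser` drops comments.

```python
import configparser
from pathlib import Path


def set_build_cache_dir(ini_path: Path, cache_dir: str) -> None:
    """Set build_cache_dir in the [platformio] section of a platformio.ini."""
    # interpolation=None keeps ${...} references in the file untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(ini_path)
    if not parser.has_section("platformio"):
        parser.add_section("platformio")
    parser.set("platformio", "build_cache_dir", cache_dir)
    with ini_path.open("w") as handle:
        parser.write(handle)
```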

-- later

From reading a discussion, it seems that maybe PIO will automatically handle different platforms and build arguments, such that the cache can freely be used between different builds and it'll just do the right thing. Maybe this is as simple as just setting the cache directory.

-- later

So my testing shows that for two identical devices (a light), the cache did not help between them. Even if the board, settings, everything but the name/api secret were identical, the build_cache_dir did not take effect (or it did, but it wasn't used between the two).

What did work though, was when I cleaned a previous build's output and built it again. Then the cache was used. So at least there, there will be some gains. I imagine this also works if there is an update from some upstream that would normally force a complete rebuild; the cache can help there.

It seems that for caching between environments (between devices) to work, platformio's caching mechanism needs to ignore the environment name. I'm not sure if doing this would break anything, because it seems that the full command line they actually run to build stuff would be enough to identify the output.

LordMike commented 1 year ago

I've created an issue with platformio to improve the build_cache_dir to work cross-environment.

I hope esphome will add the build_cache_dir to esphome platformio.ini's, as this setting is useful regardless of whether it can be used across devices currently. A future update to platformio will hopefully make the cache even better :).

LordMike commented 1 year ago

Update: The platformio issue has been closed with no change currently, as it probably needs a much smaller reproducible example. So I'll have to figure out how to make that eventually.

But the crux of it (https://github.com/platformio/platformio-core/issues/4574#issuecomment-1475018136) is that the underlying SCons system (which actually does the build, if I understand correctly) does the hashing and decides whether to use the cache directory and files. However, it seems to include the file path in these hashes, which means that two environments in esphome can never produce the same hash, as each is placed in its own directory.
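
To illustrate why a path in the key defeats sharing, here is a toy model. It is explicitly not SCons's real signature computation; both functions are invented for this example, and the command string is assumed to contain no paths.

```python
import hashlib


def _digest(*parts: bytes) -> str:
    h = hashlib.md5()
    for part in parts:
        h.update(part)
        h.update(b"\0")  # separator so adjacent parts cannot run together
    return h.hexdigest()


def key_with_path(object_path: str, command: str, source: bytes) -> str:
    """Toy cache key that includes the target path (the behaviour described above)."""
    return _digest(object_path.encode(), command.encode(), source)


def key_without_path(command: str, source: bytes) -> str:
    """Toy cache key based only on the command and the source content."""
    return _digest(command.encode(), source)
```

With identical source and command, `key_with_path` differs between `light-41-1/.pioenvs/light-41-1/...` and `light-41-2/.pioenvs/light-41-2/...`, while `key_without_path` is the same for both.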

rgriffogoes commented 1 year ago

That's a bummer! I was hoping to find a way to reduce the time needed to update multiple devices (my ESPHome dashboard runs in a VM on my NAS, so not much compute power).

Hopefully platformio and/or SCons will eventually do something about it (I would imagine this improved caching mechanism would also benefit non-ESPHome cases). Thanks for the evaluation!

For now, I'm booting up a temporary ESPHome dashboard on a more powerful device, mounting same config file folders. So far so good!

petervk commented 11 months ago

Would love something like this to be implemented as I have 18 Sonoff S31s with almost identical configurations that take a long time to update as they have to compile every single file from scratch.

LordMike commented 10 months ago

I just saw @TMaYaD mention he had ~200 devices in #5821. That must be a pain to update :|

TMaYaD commented 10 months ago

I have my own wrapper around esphome that runs on kubernetes as a cronjob and automatically reconciles devices. All device configs are also tracked in a repo, à la DevOps.

LordMike commented 10 months ago

Oh wow - that's neat. Any chance this will make its way to the core esphome?

I've been wondering about a "version" as well: being able to determine that the local config is the same as the device's, so as to avoid deploying no-ops. This seems to handle that as well :)

expaso commented 10 months ago

Haha Nice Wrapper @TMaYaD! Finally some real steps to manage all the updates of all the ESPHome devices. I like it!

But 200 devices, dude! Is your Wifi still stable?

LordMike commented 6 months ago

Just wanted to pipe in again that I made a node-red flow that updates my nodes one by one. It takes a while, but it does complete eventually. It works by listing all the update entities in HA of the esphome kind that have an update and then looping through them, waiting for their completion. Once done, it'll notify me.

Node red flow

```json
[{"id":"0f02093ca01442aa","type":"comment","z":"938136b3.0c0f48","name":"Update esphome","info":"","x":400,"y":1140,"wires":[]},{"id":"476bfedb04779a9a","type":"function","z":"938136b3.0c0f48","name":"Logic","func":"const fn_updateSingleDevice = context.get('fn_updateSingleDevice');\n\n// Get all esphome nodes\nconst states = global.get('homeassistant.homeAssistant.states');\nconst filteredKeys = Object.keys(states)\n .filter(key => /^update\\..*_firmware$/.test(key))\n .filter(key => key in states)\n .filter(key => {\n let state = states[key];\n // filter to esphome & update available\n return state.state === \"on\" && state.attributes.title === \"ESPHome\";\n });\n\n// For each node, emit an update\nfor (const key of filteredKeys) {\n await fn_updateSingleDevice(key);\n}\n\nif (filteredKeys.length > 0)\n return [null, {updated: filteredKeys}];\n\nnode.status({ fill: \"green\", shape: \"dot\", text: `updated ${filteredKeys.length} devices` });\n","outputs":2,"timeout":0,"noerr":0,"initialize":"const statesDict = global.get('homeassistant.homeAssistant.states');\n\nfunction sleep(ms = 1000) {\n return new Promise(resolve => setTimeout(resolve, ms));\n}\n\ncontext.set('fn_updateSingleDevice', async function(key) {\n if (!statesDict[key].attributes.in_progress)\n {\n node.send([{\n entity_id: key\n }]);\n }\n else {\n node.warn(`Entity ${key} is already updating, not triggering, but waiting`);\n }\n\n while (statesDict[key].state !== 'off') {\n let state = statesDict[key];\n\n if (state.attributes.in_progress)\n node.status({ fill: \"yellow\", shape: \"dot\", text: `updating ${key}` });\n else\n node.status({ fill: \"yellow\", shape: \"ring\", text: `waiting for update ${key}` });\n\n await sleep(5000);\n }\n\n node.status({ fill: \"green\", shape: \"dot\", text: `updated ${key}` });\n});","finalize":"","libs":[],"x":390,"y":1240,"wires":[["192dbb924055cb60","4d89afc7d9cb0b39"],["67930ba63f8a94c2"]]},{"id":"4e3b95b38acf5c74","type":"inject","z":"938136b3.0c0f48","name":"Trigger","props":[],"repeat":"","crontab":"","once":false,"onceDelay":0.1,"topic":"","x":230,"y":1240,"wires":[["476bfedb04779a9a"]]},{"id":"192dbb924055cb60","type":"debug","z":"938136b3.0c0f48","name":"Trigger update","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"true","targetType":"full","statusVal":"","statusType":"auto","x":620,"y":1180,"wires":[]},{"id":"4d89afc7d9cb0b39","type":"api-call-service","z":"938136b3.0c0f48","name":"Trigger update","server":"af18237.61268e","version":5,"debugenabled":false,"domain":"update","service":"install","areaId":[],"deviceId":[],"entityId":["{{entity_id}}"],"data":"{}","dataType":"jsonata","mergeContext":"","mustacheAltTags":false,"outputProperties":[],"queue":"none","x":620,"y":1220,"wires":[[]]},{"id":"67930ba63f8a94c2","type":"api-call-service","z":"938136b3.0c0f48","name":"","server":"af18237.61268e","version":5,"debugenabled":false,"domain":"notify","service":"notify_michael","areaId":[],"deviceId":[],"entityId":[],"data":"{\t \"message\": \"Updated \" & updated,\t \"title\": \"Esphome updates completed\"\t}","dataType":"jsonata","mergeContext":"","mustacheAltTags":false,"outputProperties":[],"queue":"none","output_location":"","output_location_type":"none","x":640,"y":1280,"wires":[[]]},{"id":"af18237.61268e","type":"server","name":"Home Assistant","version":5,"addon":true,"rejectUnauthorizedCerts":true,"ha_boolean":"y|yes|true|on|home|open","connectionDelay":true,"cacheJson":true,"heartbeat":false,"heartbeatInterval":"30","areaSelector":"friendlyName","deviceSelector":"friendlyName","entitySelector":"friendlyName","statusSeparator":"at: ","statusYear":"hidden","statusMonth":"short","statusDay":"numeric","statusHourCycle":"h23","statusTimeFormat":"h:m","enableGlobalContextStore":true}]
```
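
For readers who don't use Node-RED, the selection step in the flow's "Logic" node can be sketched in Python. This mirrors only the filter (entity id pattern, state "on", title "ESPHome"). The `states` dict shape is an assumption modelled on what the flow reads, and `pending_esphome_updates` is a made-up name.

```python
import re
from typing import Dict, List

# Same pattern as the flow: update.<something>_firmware
FIRMWARE_UPDATE = re.compile(r"^update\..*_firmware$")


def pending_esphome_updates(states: Dict[str, dict]) -> List[str]:
    """Return the update entities that belong to ESPHome and have an update available."""
    return [
        entity_id
        for entity_id, state in states.items()
        if FIRMWARE_UPDATE.match(entity_id)
        and state.get("state") == "on"
        and state.get("attributes", {}).get("title") == "ESPHome"
    ]
```

The flow then triggers `update.install` for each returned entity in turn and waits for its state to become "off" before moving on.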

Set `build_cache_dir` in platformio to reuse compiled objects between updates (and devices) #2171