jamesmunns / postcard-rpc

An RPC layer for postcard based protocols
Apache License 2.0

Out-of-band management protocol #27

Open si14 opened 3 weeks ago

si14 commented 3 weeks ago

I'm (well, will be) shipping devices and host software to customers. Both the firmware and the host software are being actively developed. Here are two scenarios that worry me a lot:

In both cases, it'd be really useful to have a simple out-of-band management protocol, the simpler the better. The details are microcontroller-dependent, but at least for RP2040 a single SDK call (reboot to stage 1 bootloader) would be sufficient and really hard to get wrong. It can be completely out-of-band (a separate USB endpoint? using the control endpoint?), or it can just be a magic value in the existing protocol.
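As a rough illustration of the "magic value in the existing protocol" option, the sketch below checks each incoming frame against a fixed byte pattern before anything else looks at it. The names `REBOOT_MAGIC`, `Action`, and `classify_frame` are hypothetical and not part of postcard-rpc, and the magic bytes are arbitrary. On real hardware the `RebootToBootloader` branch would call whatever reset-to-bootloader routine the platform provides.

```rust
/// Hypothetical magic frame. The value is arbitrary and chosen only for
/// illustration; a real design would pick something that cannot collide
/// with a valid message in the normal protocol.
const REBOOT_MAGIC: [u8; 8] = *b"\xFFBOOTLDR";

/// What the receive path should do with a frame.
#[derive(Debug, PartialEq)]
enum Action<'a> {
    /// The frame was the magic value: skip all user code and reboot.
    RebootToBootloader,
    /// Anything else is handed to the regular dispatcher untouched.
    Dispatch(&'a [u8]),
}

/// Runs before any application-level decoding, so a bug in a regular
/// handler cannot prevent the magic frame from being recognised.
fn classify_frame(frame: &[u8]) -> Action<'_> {
    if frame == &REBOOT_MAGIC[..] {
        Action::RebootToBootloader
    } else {
        Action::Dispatch(frame)
    }
}
```

The check is a plain byte comparison with no deserialization step, which is what keeps this layer hard to get wrong.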

There might be more than one layer to this protocol. I can imagine a two-layer protocol, with "reboot into firmware update mode" being the simplest, dumbest, least likely to break layer, and something like "please return the current version/the last panic message/etc" being a bit more complex, but still more reliable than user code.

Either way, I think having something like this in Postcard RPC would be extremely useful. Let me know what you think!

jamesmunns commented 3 weeks ago

Note that the way that postcard-rpc is written, the "breaking changes" are granular to each endpoint, NOT the whole version. If you only change one API's types, then only that API is broken.
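One way to picture per-endpoint granularity is a sketch in which each endpoint is identified by a key hashed from its path plus a description of its message type. Everything here is an assumption made for illustration: `endpoint_key` is a hypothetical helper, the schema is a stand-in string, and the hashing shown is not claimed to match what the crate does internally.

```rust
/// 64-bit FNV-1a, continuing from a previous hash state.
fn fnv1a64(bytes: &[u8], mut hash: u64) -> u64 {
    for b in bytes {
        hash ^= *b as u64;
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// Hypothetical key: a hash over the endpoint path followed by a textual
/// description of the message type (a stand-in for a real schema).
fn endpoint_key(path: &str, schema: &str) -> u64 {
    const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    let after_path = fnv1a64(path.as_bytes(), FNV_OFFSET_BASIS);
    fnv1a64(schema.as_bytes(), after_path)
}
```

Under this assumption, changing the type of `led/set` changes only that endpoint's key, while an untouched endpoint such as `version/get` keeps the same key across firmware releases.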

There may still be breaking "base protocol" changes until postcard-rpc 1.0. This is something that you'll have to manage if you begin shipping devices before then.

I do have some plans to offer .well-known endpoints, for example "give me all the endpoints/topics you support". This could be used host-side for version detection. My recommendation would be to have the host "understand" many versions' worth of endpoints, while each device only supports its own.
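A minimal host-side sketch of that recommendation follows, assuming the device can somehow report the list of endpoints it supports (the .well-known endpoint is only planned, so that part is hypothetical). `KnownFirmware`, `KNOWN`, `detect_version`, and the endpoint paths are made-up names for illustration.

```rust
use std::collections::BTreeSet;

/// One firmware release the host knows how to talk to.
struct KnownFirmware {
    name: &'static str,
    endpoints: &'static [&'static str],
}

/// Hypothetical table shipped with the host software: every release it
/// "understands", each with the endpoints that release exposes.
const KNOWN: &[KnownFirmware] = &[
    KnownFirmware { name: "fw-1.0", endpoints: &["version/get", "led/set"] },
    KnownFirmware { name: "fw-1.1", endpoints: &["version/get", "led/set", "adc/read"] },
];

/// Matches the endpoint list reported by the device against the table.
/// Order does not matter; the sets must be equal.
fn detect_version(reported: &[&str]) -> Option<&'static str> {
    let reported: BTreeSet<&str> = reported.iter().copied().collect();
    KNOWN
        .iter()
        .find(|fw| {
            let expected: BTreeSet<&str> = fw.endpoints.iter().copied().collect();
            expected == reported
        })
        .map(|fw| fw.name)
}
```

A `None` result means the device runs something the host does not recognise, which is the point where a host could fall back to a firmware update prompt.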

Similarly, you can write your own stable "get version" or "start bootload" APIs.
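As a sketch of what such application-defined stable APIs might look like, the following keeps the request format deliberately tiny (one byte) so that it never needs to change between releases. `MgmtRequest`, `MgmtResponse`, `decode_request`, `handle`, and the version numbers are all hypothetical; in a real project these would be registered as ordinary endpoints, and `StartBootload` would trigger the platform's reset routine after replying.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum MgmtRequest {
    GetVersion,
    StartBootload,
}

#[derive(Debug, PartialEq)]
enum MgmtResponse {
    Version { major: u8, minor: u8, patch: u8 },
    BootloadAccepted,
}

/// Placeholder firmware version for the example.
const FW_VERSION: (u8, u8, u8) = (0, 3, 1);

/// Frozen one-byte wire format: these values must never be reassigned.
fn decode_request(byte: u8) -> Option<MgmtRequest> {
    match byte {
        0x01 => Some(MgmtRequest::GetVersion),
        0x02 => Some(MgmtRequest::StartBootload),
        _ => None,
    }
}

/// Handler for the management requests. It has no dependency on the rest
/// of the application state, which keeps it easy to test in isolation.
fn handle(req: MgmtRequest) -> MgmtResponse {
    match req {
        MgmtRequest::GetVersion => MgmtResponse::Version {
            major: FW_VERSION.0,
            minor: FW_VERSION.1,
            patch: FW_VERSION.2,
        },
        MgmtRequest::StartBootload => MgmtResponse::BootloadAccepted,
    }
}
```

Because the handler is a pure function of the request, it is also a natural target for the kind of integration testing suggested below.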

> what if I wrote a bugged firmware upgrade RPC handler

I'd strongly suggest doing integration testing.

> There might be more than one layer to this protocol.

At the moment, I'd like to keep postcard-rpc very dumb and general at its core. I'd welcome "by convention" patterns/cookbooks, but I'd like to separate "what postcard-rpc does" from each project's actual application/business logic.