Pluggable discovery - Githubissues

arduino / tooling-rfcs

RFCs related to Arduino tooling projects

Creative Commons Zero v1.0 Universal

5 stars 4 forks source link

Pluggable discovery #2

Closed cmaglie closed 3 years ago

cmaglie commented 3 years ago

This document describes how the Pluggable Discovery works and how it should integrate with monitors and uploaders.

When a sketch is uploaded to an Arduino board the only way for transferring the binary executable to the microcontroller is through the serial port. Currently:

the subroutines to enumerate the serial ports and to possibly identify a board are “hardcoded” in the IDE/CLI
the only way to identify a board is via USB VID/PID identifiers
the "serial monitor", to communicate with the board after the upload, is “hardcoded” in the IDE

The current structure does not allow to use different kind of “ports” to communicate with the microcontroller. For example it would be interesting to have the possibility to:

discover a board (MDNS) and upload via network (OTA)
discover a board and upload via a communication port that is not the serial port (for example via USB-HID as implemented by Teensy or via any other kind of port)
allow a third party to implement his own upload protocol / port

The pluggable discovery aims to provide a solution to these problems.

Please check section https://github.com/arduino/tooling-rfcs#5-review-and-revision the RFC is officially open for review so comments are welcome!

PaulStoffregen commented 3 years ago

A few random comments...

In event mode, discovery may detect changes after the port is initially added. As implemented in 1.8.13, discovery may send another "add" message with the same unique address. The Java IDE treats this repeated add message as replacing the previous discovery info for that address. The meaning of repeated "add" should be documented, or if transmitting another "add" message isn't allowed, that should be specified. Maybe some other message like "replace" chould be defined in that case?

The Java IDE allows discovery to respond with an error to START_SYNC, indicating it does not support event mode. As this draft is worded, event mode seems to be mandatory. Either way is fine as far as I am concerned. As a specification, if any portion of the protocol is optional, which features a discovery utility must implement versus may implement should be clear.

In the case of duplicate addresses, labels or other conflicts, perhaps consider the selected board and give preference to info from the discovery its package provided.

A recipe and maybe discovery metadata need to be defined for discovered hardware to communicate with the serial monitor.

For hardware supporting debugging, maybe recipes and metadata would also be needed to set up a debug session?

cmaglie commented 3 years ago

Hi @PaulStoffregen, thanks for the feedback!

In event mode, discovery may detect changes after the port is initially added. As implemented in 1.8.13, discovery may send another "add" message with the same unique address. The Java IDE treats this repeated add message as replacing the previous discovery info for that address. The meaning of repeated "add" should be documented, or if transmitting another "add" message isn't allowed, that should be specified. Maybe some other message like "replace" should be defined in that case?

👍🏼 I'll add a note about this particular use case (multiple add for the same address). I don't think adding another replace message is worth it.

The Java IDE allows discovery to respond with an error to START_SYNC, indicating it does not support event mode. As this draft is worded, event mode seems to be mandatory. Either way is fine as far as I am concerned. As a specification, if any portion of the protocol is optional, which features a discovery utility must implement versus may implement should be clear.

The discovery Is supposed to implement the START_SYNC, I'll add a note about that. Even in the worst case that an event-mode is not doable, I prefer to be the discovery to "emulate" a START_SYNC with an internal polling loop.

In the case of duplicate addresses, labels or other conflicts, perhaps consider the selected board and give preference to info from the discovery its package provided.

👍🏼 added a note

A recipe and maybe discovery metadata need to be defined for discovered hardware to communicate with the serial monitor. For hardware supporting debugging, maybe recipes and metadata would also be needed to set up a debug session?

Are you referring to pluggable monitors, I guess? I'm going to create a separate RFC for pluggable monitors, I wanted to leave them out from this one to not overcomplicate it.

To answer your question: yes, the idea is to pass this information not only to upload.* recipes but also to monitor.* (TBD on another RFC) and to the debugger, even if it's not explicitly mentioned in this specification it doesn't mean we will not do it later.

About debugging, we are iterating through various options (we already tried to encode the tooling setup into a recitpe.debug=... in platform.txt but it turned out to be quite a nightmare, so we ended up with a configuration-based approach in the current version of the cli). BTW the information from the discovery will be surely made available to the debbuger in a way or another.

PaulStoffregen commented 3 years ago

@cmaglie - Is it too soon to talk of Pluggable Serial Monitor? Is there any chance both Pluggable Discovery and Pluggable Serial Monitor can happen before the official 2.0 IDE release? Both are needed for Teensy to work. My patches to the Java code add a very rudimentary Pluggable Serial Monitor using stdin / stdout and status updates via stderr. If 2.0 will use localhost sockets or some other IPC method, I'd really like to start moving my code in that direction. Ideally, I'd love to see a 2.0 beta with these features, so I can offer users a way to test Teensy with your beta versions before the 2.0 release.

ubidefeo commented 3 years ago

@PaulStoffregen Pluggable Monitor is definitely in the cards, we have started talking about it but we don't have an RFC yet. Once we're done with Discovery I guess the next logical step is the Monitor, because it can be used to begin implementing other tools such as Serial Plotter. I'll let @cmaglie keep control over these implementations, I'm just happy if we can make this new IDE more flexible over time. What I can tell you is that definitely these two features and a couple more need to be brought to life before a final 2.0 is out ;)

matthijskooijman commented 3 years ago

I just gave this proposal an initial read-through, and I'm quite thrilled about it. There has been quite some limitations for boards that deviated from the default serial-port-uploader scheme (i.e. DFU, SWD, etc.) that I think can be neatly solved by this (though I haven't fully thought these through yet).

While reading this proposal, I noted down some questions, suggestions and confusions, see below. It's quite the list and a bit of a braindump here and there, but I hope it will help fuel further thoughts and discussion even though there's mostly questions and not much proposals for improvements in there :-)

I'll try to let this proposal sink in a bit further and see how it holds up for some specific usecases, so hopefully (if I don't get swamped by other work and projects :-p) I can follow up with some more later.

Anyway, here's my wall of text, enjoy!

Regarding the stdin/stdout protocol of discovery tools:

It's not immediately obvious how commands are formatted. Reading closely, they are just the command strings themselves (presumably terminated by a newline? Or CRLF?). This does mean that commands cannot have parameters without messy parsing, so maybe commands should just be JSON strings as well? Even if we do not need parameters now, sending e.g. {"command": "START"} might be more extensible.
Why are commands uppercase and the corresponding eventtypes lowercase? Wouldn't it make sense to make them the same?
For START_SYNC it says "After calling START_SYNC an initial burst of add events may be generated in sequence to report all the ports available at the moment of the start.". This makes sense, since it allows getting a complete picture of the available ports without a race condition (if you would need to call LIST first and then START_SYNC). However, this says "may be generated", shouldn't this be mandatory (to prevent having a limbo situation where you need to call LIST for some discoverers but not for others)? Also, should these intitial add events maybe be marked as such so callers can, if the need to, distinguish between real new ports and pre-existing ports (without relying on some heuristic such as "within x ms of sending the START_SYNC command)? Additionally, would it be useful for callers to know when the initial burst of ports is complete and when the waiting for changes starts? I can imagine that a tool wants to know this to e.g. get wait for an initial complete list first and display that, rather than starting with an empty list that quickly populates (which might not be ideal in UI terms). I guess I'm wondering if START_SYNC should maybe start with a list reply rather than a burst of initial add events?

Then, some thoughts on matching port to boards:

Which ports and boards are considered and which prefs are available to match? I suspect that for ports returned by a given platform's discovery tools, only that platform's boards are considered. For builtin discovery tools, I guess all boards are checked for matching prefs? Or would it make sense to have a platform explicitly reference any builtin discovery tools that it needs (maybe without specifying a recipe, but just referencing them by name)? Going further, it might even be useful for a platform to explicitly include discovery tools from another platform (i.e. by id, not by repeating the recipe), similar how a platform can reference another core, or (implicitly) reuses programmers from a referenced platform? In any case, this does require that builtin discoveries have a very well-defined set of identificationPrefs, which cannot be lightly changed later. For platform-specific discoveries, this is not so strict, since the set of prefs reported by the tool and matched by the platform is just a detail internal to the platform.
Must all preferences listed in identificationPrefs be matched? Or just the ones common to the port and board?
There is an example of identificationPrefs in the proposal:
```
 myboard.pears.0=20
 myboard.apples.0=30
 myboard.pears.1=30
 myboard.apples.1=40
```
Which is stated to match pears=20, apples=30 and pears=30, apples=40. But will this also match pears=20, apples=40 (i.e. mixing index 0 and 1), or are these indexes required to match? If the latter, how about when some prefs are indexed multiples and some are not?
Would it make sense to make the prefs used to match against identificationPrefs explicit in boards.txt, e.g. using arduino_zero_edbg.port_match.vid.0=0x03eb or so (probably better name than port_match)? This reduces the chances of accidental false positive or false negative matches on prefs that are not actually intended for matching but just happen to have the same name (especially when new identificationPrefs are added to builtin discoveries later). Such a structure could also be used to disambguate the previous two points, e.g. by having something like:
```
myboard.port_match.0.vid=0x1234 # Match just one particular vidpid
myboard.port_match.0.pid=0x5678
myboard.port_match.1.vid=0x1234 # Match *all* boards with this vid
myboard.port_match.2.vid=0x0000 # And all boars with this vid that also have foo=bar
myboard.port_match.2.foo=bar
```
When matching ports to boards, must available upload protocols for a board be considered? Or is a port matched to a board regardless of available upload protocols and are those only used for subsequently filtering ports available for upload?

Then some more misc thoughts, mostly about the way ports are described and used:

Can a port implement multiple protocols? I can imagine that some USB devices have multiple interfaces that could implement different protocols. If such different protocols can result in different addresses, these could of course be treated as different ports as well (and I guess the address could also be made unique, even if the upload tools for both protocols require the same port name, which can then just be added to prefs and use that in the recipe in place of the uniquefied address). Reading on, it also seems that when a user selects a port for upload that has multiple protocols, it might become ambiguous which upload tool to use, so a single protocol per port probably makes most sense.
In the port properties, reading the LIST command output documentation, I initially thought that prefs and identificationPrefs seemed to generic and did not seem to reflect what I initially understood as their meaning (i.e. info about one particular instance of a board and info about a type of board). However, reading on, I've found that these two properties have different meanings, for which their name actually makes more sense. AFAICT prefs is just an arbitrary list of preferences that is usable in the upload tools recipes (and, AFAICT will not be used by the tooling in any way?). Maybe uploadPrefs might slightly better reflect this? Then identificationPrefs contains info on the board model that this port belongs to, intended to be matched against board preferences to match board definitions against this port (i.e. to identify the board belonging to this port, so I guess the name is actually ok).
In the constraints section, it says "Each port may provide metadata to identify a specific instance of the board (serial number / MAC address)", which I think refers to the prefs attribute? This initially made me thought that the prefs attribute is actually exclusively info on the board instance (implying that two different ports belonging to the same board instance would have identical prefs), but it seems that prefs is really metatadata on a specific instance of the port (which is somewhat equivalent to a specific instance of a board for single-port boards). So maybe this should be rephrased somehow to clarify this.
Maybe some explicit mention should be made of what constitutes the identity of a port. In particular, I think that address now constitutes the identifier of each port and should be unique (re-reading, this is already mentioned in the constraints section). Or is it protocol + port (this ties into the multiple protocols per port question, I guess)? This is also relevant for correlating add and remove events. And should such identify/address be globally unique, or only within a given discovery tool? How to deal with two discoveries returning the same address?
Do we need a way to correlate multiple ports belonging to the same board instance? The constraints section talks about "specific instance[s] of the board", which lead me to believe that a board instance is somehow an explicit concept in this proposal, but reading more carefully, this concept does not seem to exist at all. For the purposes of this proposal, a board instance that has two ports is effectively the same as two separate board instances (that maybe have same serial or so). Would it be valuable to add this concept somehow, e.g. by adding one ore more "board instance identifiers" to a port, that can be used to group multiple ports belonging to the same board instance? This might also need to be specified to work across different discovery tools, e.g. a board might have a serial port, but also an USB DFU endpoint, which I think would be discovered by different tools, but should probably be correlated to belong to the same board instance (this helps for the serial monitor extension later, but correlating the serial port with the USB endpoint can also maybe help the user distinguish multiple similar boards).
Platforms can define custom discovery tools, but when should these run? When any board from such platform is selected? Or should tools be connected to specific boards? When doing a generic port list (e.g. using arduino-cli or so), I guess all tools from all platforms should be ran?
Should we make more details available for each port? I.e. for a USB-serial port, in addition to the vid-pid, you could more specifically trace the port to a USB configuration index, interface index, maybe even endpoint index. I don't think such info belongs in identificationPrefs (since this is really additional info on the port's location or identity, not on the board it is connected to). I'm also not sure if this is at all relevant for serial ports, but e.g. for DFU uploads, different tools can accept different ways to identify a particular port/device. For example dfu-util allows selecting a device based on its serial number, USB devnum or USB path (i.e. 1-1.5.4.1), openocd seems to use just the USB path, while STM32 cube programmer uses its own set of identifiers (but also seems to support serial numbers). Some of these might be specific on just one platform or have different meanings on different platforms (e.g. the sysfs/udev path of a (USB) device is also used to identify devices on Linux sometimes). Now I better understand how prefs is used, I guess all of these identifiers are just additional info about a particular port and could be put into prefs, which also allows using them in recipes. It would be good if we could have some guidance about such properties, so different discovery tools would use the same names and same formats for these things were possible.
Historically, there was a serial.port=ttyACM0 and serial.port.file=/dev/ttyACM0, but the examples suggest that now only /dev/ttyACM0 is returned by the discovery tool. Should the short form also be returned by the discovery tool for upload tools that need it (and do not want to rely on backward compatibility variables)? I guess this is just a very specific instance of the previous point.
In the examples a protocol network is mentioned, but I wonder if that is not way too generic? To really return a meaningful list of network devices, some selection based on e.g. available services is needed? Or maybe just listing all devices discovered by e.g. mdns and let the user sort out the right ones could work (or maybe any advertised service names or so could be added to identifcationPrefs, or using macprefixes seems to be a strategy from the examples?)
I think it would be helpful if some examples of discovery output and recipes would be done for a few more protocol/port types (I'm thinking of DFU and EDBG in particular). This helps understand things, and also helps exposing flaws in the proposal where it fails to express things we'll be needing.
How does all this relate to upload using programmer? In a way, I guess a programmer could be treated as a board by itself, with its own prefs (e.g. vid/pid) matching against available ports, etc. but I haven't thought this through.

Finally, responding to an earlier comment:

Pluggable Monitor is definitely in the cards, we have started talking about it but we don't have an RFC yet.

It would be good to at least think about this a bit, since I guess that serial monitor would typically also have ports connected to boards, and some of these ports might overlap with the upload ports (e.g. with most standard serial-based bootloader uploading). So maybe serial monitor could simply be one of the protocols supported by a port.

matthijskooijman commented 3 years ago

I already gave this some more thought over dinner. First off, here's a few usecases that I think would be worthwhile to consider:

The Arduino Leonard or similar. These are boards that normally expose a serial monitor port, which is given a 1200bps touch, and then resets into the bootloader (which involves re-enumerating the USB device). The bootloader then offers a new serial port, through which the upload happens.
The Arduino zero, which offers two ways to upload: One is through the embedded EDBG chip, which is essentially just an integrated programmer. The other is through the native USB connection with bossac, though I have to admit I'm not entirely sure how this one works (it does seem to use 1200bps touch and a serial port).
STM32duino-based boards using DFU upload. These use 1200bps touch to reset into the ROM bootloader, which re-enumerates as a DFU device. The upload tool (STM32CubeProgrammer currently, but it would be nice to (also) support openocd) then handles the DFU upload. This currently just uses the first available DFU device, which might not be the right one.
STM32duino-based boards using DFU and DFU auto-reset. I've been using this in my fork and hope to get this merged as part of https://github.com/stm32duino/Arduino_Core_STM32/pull/710. When the sketch is running, it exposes another USB interface (in addition to ACM serial) for DFU that supports a reset command. This command resets into the bootloader, which then accepts the firmware over the full DFU protocol. Both the reset and upload can be handled by dfu-util.
STM32duino-based boards using mass storage upload. This just copies a file into a (to be) mounted directory. The current script for this ignores the serial port completely and instead just looks at the filesystem label, which contains the particular board name.
STM32duino-based boards using HID bootloader. These are reset into the bootloader using some twiddling of the serial port settings, I believe, and after that re-enumerate as a HID device for the upload. Both the reset and upload is handled by the upload tool, so from the perspective of the Arduino tooling, this is a simple upload (just pass it a serial port and it will upload). There's one exception, though: When the main sketch breaks and the board gets stuck in bootloader mode, I'm not exactly sure how that would work with this tool.

There's a lot of STM32duino in there, mostly because I have been working with it a lot, and because it uses a lot of different upload methods (there's a bunch more that I haven't even mentioned).

Matching ports after bootloader reset A lot of these boards use the 1200bps touch to reset into the bootloader. What happens then varies:

For the Leonardo, it reenumerates as a serial port (but it might not have the same address, so I think currently arduino-cli has some semi-hardcoded heuristics to find the bootloader serial port after re-enumeration, but I think this is currently still just looks for any port that appears at the right time (though there was some work on improving this, haven't checked what it changes exactly).
For the DFU versions, something similar happens, but since there is no serial port after reset, arduino-cli does not find a post-reset port but just calls the tools, which just upload to the first available DFU device (possibly limited by USB vidpid), which is imperfect.
I'm not quite sure for bossac and the Arduino zero.

In all these cases, it would be nice if the tooling could deterministically the post-reset upload port.

One complication is that in some cases, the type of port changes (e.g. from serial to DFU), which could mean we need to match ports between different discovery tools, or that e.g. the DFU discovery tool must also return serial ports so it can stay in control of both. Neither seem ideal, though.

The other question is how to match the ports before and after. In these cases, matching based on the USB path (i.e. the chain of ports and hubs, such as 1-1.5.4.1) seems an obvious choice, though it might not be available (or easy to figure out) on all platforms. Ideally, though, this would of course not be some fixed property, but different discoverers could offer different identifiers to correlate ports or even multiple identifiers (I can imagine the builtin serial discovery would return serial number, USB path, usb devnum, and maybe other properties that can be used to match).

Then should we somewhere declare the protocols of the port before reset and after? OTOH, the port before is necessarily a serial port (to allow 1200bps touching), so maybe that can be hardcoded (just like how 1200bps touch works). Then the boards.txt can just declare the protocol after reset, maybe? And some other property to declare which pref should be used to match the before-reset-serial port to the after-reset-other-protocol-port? Or, another approach could be to turn the upload in a two-step process, involving two upload tools: One that just does the 1200bps touch, declares how to find the post-upload port and which upload tool to use after the reset? That seems like it could turn out elegantly (and maybe even be generalized to other reset-and-then-upload procedures, where the reset might be done by one tool and the upload by another).

Another thought could be to not match ports against each other, but the assign ports to a board instance (maybe through the same one or more identifiers mentioned above), and then just use another port with the appropriate protocol belonging to the same board instance.

Recovering when stuck in bootloader Another thing to consider, is how to recover from a failed upload, so when a board is stuck in the bootloader (or was forcibly put into bootloader mode, using e.g. the reset button on a Leonardo or the BOOT0 pin on an STM32). For the Leonardo, the upload happens using serial, so you can just upload normally, but on other cases (DFU, HID) there is no serial port to select and apply the 1200bps touch on. Currently, you can usually start an upload on any other serial port, on which the 1200bps touch fizzles, but since the actual upload then just uses the first available e.g. DFU device, it still works, but it's not quite nice.

However, the above suggestion of making these a two-step upload process, with a separate tool operating on a separate port (e.g. with protocol=dfu) would cause these post-reset bootloader ports to also show up in the normal list of ports to select, so the user can just select a device stuck in bootloader mode and start an upload normally.

Ok, so there's another wall of text... I was hoping to also write down some of my thoughts in example JSON port descriptions and maybe boards.txt/platform.txt snippets, but I ran out of time for now, maybe more later. In the meantime, love to hear some thoughts on all this....

PaulStoffregen commented 3 years ago

The common use case is for platform.txt recipes and board-package specific utilities those recipes run to be the "consumers" of most of the JSON identification fields.

Yes, it is good to think of how many specific cases would be handled. But I do not believe every question needs to be fully answered. In fact, most questions about how specific boards would work don't need to be answered.

Just knowing that Pluggable Discovery provides a flexible mechanism for the designer of a board-package to craft their own discoverer utility and have it work seemlessly with their corresponding upload and "serial" communication utility (for use with forthcoming Pluggable Serial Monitor) should be enough.

matthijskooijman commented 3 years ago

You make a good point that the primary goal of this RFC is probably to define the way discovery tools are defined and communicated with, and they are just a building block to build upon. In that sense, a lot of my remarks are not relevant in this stage, since they are more about the content of these port descriptions and how they are handled by the tooling and/or platform.txt, than about the actual discovery protocol itself.

However, since this proposal does define some things about the content of these port descriptions (the basic structure and some of the content through examples), it does make sense to think about this aspect too.

I do suspect that for what you call the "common use case", cases where the discovery tool and upload tool are platform-specific and matched to each other, the current proposal is likely sufficiently defined (and a big step forward over the current hardcoded approaches). But IMHO it would be a pity if we move forward with this proposal, only to find out that it is insufficiently expressive to handle the somewhat more complex cases. I don't think it's needed to solve every case right now, but exploring some cases in more depth is, in my experience, a good way to discover if a tool is sufficiently flexible and expressive.

Looking back at my previous comments, I think the questions that are most pertinent to the specification of the discovery tool mechanism itself (ignoring the inner content of the protocol, i.e. which prefs are returned and how they are used), are:

The first couple of questions about the stdin/stdout protocol
The question of whether a "board instance" should be a explicit concept or not (since this could influence the JSON format)
The naming of prefs and identificationPrefs (see also below for some additional thoughts).

In a sense, it might be good to further decouple the discovery of ports from the way they are used. A discovery tool should just discover ports and describe them in as much detail as possible, leaving it to the platform/platform.txt to define how to actually handle/match/filter the resulting ports and their info (this is mostly true for builtin discoveries, for platform-specific discoveries, it would be fine to put more logic in the discoveries if that helps).

In this light, it might be good to rename prefs to something like attributes. The name "prefs" refers to "preferences", which makes some sense in the internal Arduino tooling vocabulary that refers to the stuff in e.g. platform.txt/boards.txt as prefs, but in this context, these really are not "prefs" except that they will be made available as prefs to the upload recipe. So in a sense, the name "prefs" comes from the way these attributes are used, not from what they actually are. So, in that sense, I would suggest just calling them attributes, since they are just arbitrary attributes of the port, and then elsewhere specify that all attributes of the port are made available to the upload recipe. It's a small change, but I think this might be good for the understanding and flexibility later.

Then there is the question of identificationPrefs, and I wonder if we really need them at all. Having separate identificationPrefs again makes an assumption about how these attributes are used and lets the discovery tool decide which attributes are used for identification and which are not. If we want to keep the discovery tools generic and simple, maybe the discovery should just return a single list of attributes (i.e. prefs in the current proposal, which seems to always include all identificationPrefs anyway, at least in the current examples) and then put the control over which attributes are used for matching a port to a board with the boards.txt. This is (I now realize) neatly supported by the port_match suggestion I made earlier (point 7 in https://github.com/arduino/tooling-rfcs/pull/2#issuecomment-825785845), e.g. something like:

myboard.port_match.0.vid=0x1234
myboard.port_match.0.pid=0x5678

One remaining case for having a separate identificationPrefs (maybe under a different name) could be the observation that prefs would be attributes of the port, while identificationPrefs would be attributes of the board model. However, I think this is a very thin line, and not really relevant in practice (it's not up to the tooling to make this distinction, but if needed platform.txt/boards.txt can still make this distinction through the properties they select to use in port matching or recipe generation).

PaulStoffregen commented 3 years ago

Several of your questions ask "why". Just to explain, to quite some degree this RFC began as a description of the actual implementation of Pluggable Discovery which already exists in Arduino IDE 1.8.9 - 1.8.13. As far as I know, Teensy and Omzlo may be the only boards using Pluggable Discovery. But do keep in mind Arduino 1.8.9 was released on March 15, 2019, so Pluggable Discovery has been a public API for over 2 years. It's certainly not new. Other boards may already be using Pluggable Discovery.

Of course future Arduino CLI & IDE are not bound by the protocol as it exists today. My understanding of this RFC is to both document and discuss the protocol. If changes are to be made that break backwards compatibility with 1.8.13, hopefully this RFC process and whatever gets implemented in IDE 2.0 can become a long-term stable public API, where future improvements maintain backwards compatibility.

I believe @cmaglie chose to use JSON for discovery utility output but a very simple & ad-hoc plain text command input format because of the expected difference in programming environments. Discovery utilities are anticipated to usually be written in C, since they make use of low-level hardware access APIs. While C libraries do exist for JSON parsing, I believe the expected use case is very simple C programming where stdin is parsed for fixed strings and stdout is generated by printf() statements which happen to print JSON. I can tell you that is the way I wrote Teensy's discovery utility, and this early proof of concept discovery utility.

Your points about the naming are good, especially "prefs". Personally, I do not really care what field names are used. What matters most (at least to me) is the meaning is well documented. And hopefully in the future the names can remain long-term stable, so I don't have to again rewrite Teensy's discovery utility, and Omzlo doesn't have to change theirs, and likewise for everyone who will use this public API.

matthijskooijman commented 3 years ago

Several of your questions ask "why". Just to explain, to quite some degree this RFC began as a description of the actual implementation of Pluggable Discovery which already exists in Arduino IDE 1.8.9 - 1.8.13.

Right, seems I had not realized that, and apparently the wording of this RFC had lead me to believe something different. Thanks for clarifying. This also means that changing details (such as naming) must probably be more carefully considered, because of the required compatibility with the current implementation.

I believe @cmaglie chose to use JSON for discovery utility output but a very simple & ad-hoc plain text command input format because of the expected difference in programming environments. Discovery utilities are anticipated to usually be written in C, (...)

Right, that is indeed reasonable rationale to make the commands simple strings and still use JSON for output. If we'd ever need commands with parameters, then we can always revisit this and encode just the parameters in JSON or so.

cmaglie commented 3 years ago

Hi @matthijskooijman, thanks for the brainstorming, let me try to answer some of your questions:

Regarding the stdin/stdout protocol of discovery tools:

The rationale behind the choice of plaintext/JSON for stdin/stdout respectively has already been well explained by Paul.

For START_SYNC it says "After calling START_SYNC an initial burst of add events may be generated" [...] However, this says "may be generated", shouldn't this be mandatory [...]?

you're right, I'll change "may be generated" to "is generated"

Also, should these intitial add events maybe be marked as such so callers can, if the need to, distinguish between real new ports and pre-existing ports [...] I guess I'm wondering if START_SYNC should maybe start with a list reply rather than a burst of initial add events?

The goal of the discovery is to list the available ports to allow the user to select one and upload to a board (we use it to populate the Tools->Port menu basically), so it doesn't matter if the board has been just plugged in or it was already present. Anyway, there should be no problems for GUIs (we are talking about a handful of ports not thousands!) IMHO it's not worth adding this extra complexity.

About matching port to boards:

Which ports and boards are considered and which prefs are available to match? [...]

The idea is that each discovery provides port metadata (identificationPrefs) that may match any board on any platform (even a different platform from the discovery).

Must all preferences listed in identificationPrefs be matched? Or just the ones common to the port and board?

all identificationPrefs must match, if a platform defines less prefs it did not match.

There is an example of identificationPrefs in the proposal:
myboard.pears.0=20
myboard.apples.0=30
myboard.pears.1=30
myboard.apples.1=40
[...] But will this also match pears=20, apples=40 (i.e. >mixing index 0 and 1)?

how about when some prefs are indexed multiples time and some are not?

they match only when all identificationPrefs provided by the discovery are present in the board properties. If the board declares a subset of the given identificationPrefs it won't match.

When matching ports to boards, must available upload protocols for a board be considered? Or is a port matched to a board regardless of available upload protocols and are those only used for subsequently filtering ports available for upload?

The latter: the board is matched regardless of the protocol.

Would it make sense to make the prefs used to match against identificationPrefs explicit in boards.txt, e.g. using arduino_zero_edbg.port_match.vid.0=0x03eb or so (probably better name than port_match)? This reduces the chances of accidental false positive or false negative matches on prefs that are not actually intended for matching but just happen to have the same name (especially when new identificationPrefs are added to builtin discoveries later). Such a structure could also be used to disambguate the previous two points, e.g. by having something like:
myboard.port_match.0.vid=0x1234 # Match just one particular vidpid
myboard.port_match.0.pid=0x5678
myboard.port_match.1.vid=0x1234 # Match *all* boards with this vid
myboard.port_match.2.vid=0x0000 # And all boars with this vid that also have foo=bar
myboard.port_match.2.foo=bar

that makes sense, but let me think about it for a moment: currently, the board definitions are something like

# Arduino Zero (Native USB Port)
# --------------------------------------
arduino_zero_native.name=Arduino Zero (Native USB Port)
arduino_zero_native.vid.0=0x2341
arduino_zero_native.pid.0=0x804d
arduino_zero_native.vid.1=0x2341
arduino_zero_native.pid.1=0x004d

that should be changed to something like:

# Arduino Zero (Native USB Port)
# --------------------------------------
arduino_zero_native.name=Arduino Zero (Native USB Port)
arduino_zero_native.port_match.0.vid=0x2341
arduino_zero_native.port_match.0.pid=0x804d
arduino_zero_native.port_match.1.vid=0x2341
arduino_zero_native.port_match.1.pid=0x004d

but let's consider also that it's highly desirable to keep backward compatibility too, so we should keep both definitions:

# Arduino Zero (Native USB Port)
# --------------------------------------
arduino_zero_native.name=Arduino Zero (Native USB Port)
# for old IDE
arduino_zero_native.vid.0=0x2341
arduino_zero_native.pid.0=0x804d
arduino_zero_native.vid.1=0x2341
arduino_zero_native.pid.1=0x004d
# for Pluggable Discovery
arduino_zero_native.port_match.0.vid=0x2341
arduino_zero_native.port_match.0.pid=0x804d
arduino_zero_native.port_match.1.vid=0x2341
arduino_zero_native.port_match.1.pid=0x004d

and possibly we should automatically convert "old-style" vid/pid definitions into the new boardname.port_match.* so old platforms will continue to work on the pluggable discovery. BTW I like this approach, it makes the port identification much more flexible, so I'll probably adjust the RFC to add this prefix.

Ok for now I stop here, @matthijskooijman I've quickly read the remaining wall of text (damn you! :-D), but it requires a more detailed answer, tomorrow I'll add some more comments.

PaulStoffregen commented 3 years ago

Backward compatibility is desirable. But if a breaking change is good for the long-term Arduino ecosystem, I believe changes should be considered.

cmaglie commented 3 years ago

ok, @matthijskooijman let me answer some more questions and after that, I'll try to explain a bit more in depth the RFC...

Can a port implement multiple protocols?

No, every port has only one protocol.

[prefs field is not clear maybe renaming to...] uploadPrefs might slightly better reflect this?

Yes, that may be a better alternative, but thinking on this a bit more, we are not talking about preferences but properties, so I'm going to rename the fields as follows:

prefs -> portProperties
identificationPrefs -> boardIdentificationProperties

In the constraints section, it says "Each port may provide metadata to identify a specific instance of the board (serial number / MAC address)", which I think refers to the prefs attribute? This initially made me thought that the prefs attribute is actually exclusively info on the board instance (implying that two different ports belonging to the same board instance would have identical prefs), but it seems that prefs is really metatadata on a specific instance of the port (which is somewhat equivalent to a specific instance of a board for single-port boards). So maybe this should be rephrased somehow to clarify this.

Agreed, I'll rephrase it

Maybe some explicit mention should be made of what constitutes the identity of a port. In particular, I think that address now constitutes the identifier of each port and should be unique (re-reading, this is already mentioned in the constraints section). Or is it protocol + port (this ties into the multiple protocols per port question, I guess)?

In theory, the user should be able to select a port by specifying the pair (protocol, address), so each port should have a unique identifier made by the pair (protocol, address) and this should be true across all ports of all discoveries. BTW this is not guaranteed by the specification because it requires some sort of agreement between all the discoveries (and consequently between all the developers... this is not going to happen).

This is also relevant for correlating add and remove events. And should such identify/address be globally unique, or only within a given discovery tool? How to deal with two discoveries returning the same address?

The remove event has only the address field and this is wrong, it should report both address and protocol, I'll fix this in the specification 👍🏼 good catch. If multiple discoveries report the same pair (protocol, address) the discovery included in the board's package gets priority (thanks to @PaulStoffregen suggested this change). We may still have a tie, in that case we should let the user decide, I see no other solutions.

Do we need a way to correlate multiple ports belonging to the same board instance?

This is a "nice to have" I've thought in the early draft, I liked the idea that the GUI may show the ports belonging to the same board nicely grouped together. BTW I dropped this idea because there are some challenges here:

we must find a "cross-protocol" boardSerialNumber, something like a CPUID
once we have the boardSerialNumber we must find a way to add its value as extra metadata in the payload of each protocol, for example, let's consider the USBSerial discovery: how the boardSerialNumber is supposed to reach the discovery? should it be added as a payload to the USB descriptor? is this even possible? The same applies for all the other protocols but, unless they have a field planned for this specific purpose, this looks quite hacky. BTW if there is interest we may also add a field boardSerialNumber in a later revision of the RFC (adding is always possible without much risk)

Platforms can define custom discovery tools, but when should these run? When any board from such platform is selected? Or should tools be connected to specific boards? When doing a generic port list (e.g. using arduino-cli or so), I guess all tools from all platforms should be ran?

All discovery from all installed platforms are run simultaneously. If you run arduino-cli board list the LIST command is used to get a one-shot list of all ports. When running in daemon mode instead the START_SYNC method is used so the IDE does not need to poll.

Should we make more details available for each port? I.e. for a USB-serial port, in addition to the vid-pid, you could more specifically trace the port to a USB configuration index, interface index, maybe even endpoint index. [...] I guess all of these identifiers are just additional info about a particular port and could be put into prefs, which also allows using them in recipes. It would be good if we could have some guidance about such properties, so different discovery tools would use the same names and same formats for these things were possible.

Exactly, you can add as many metadata as you want in the prefs section. About the "guidance" we may provide some guidelines, but I'm not very optimist on the outcome, if the pluggable discovery will gain popularity we will probably see all kind of variants popping up, regardless of the guidelines. Also who is going to provide these guidelines?

Historically, there was a serial.port=ttyACM0 and serial.port.file=/dev/ttyACM0, but the examples suggest that now only /dev/ttyACM0 is returned by the discovery tool. Should the short form also be returned by the discovery tool for upload tools that need it (and do not want to rely on backward compatibility variables)? I guess this is just a very specific instance of the previous point.

There is a specific paragraph about that: For backward compatibility we will keep a copy of the address also in {serial.port} and in the specific case of a protocol=serial we will populate also {serial.port.file}.

unfortunately this can not be resolved in the discovery, even adding the property serial.port.file=ttyACM0 inside the prefs would lead to the {upload.serial.port.file} property to be provided in the upload recipe, but we want {serial.port.file}

In the examples a protocol network is mentioned, but I wonder if that is not way too generic? To really return a meaningful list of network devices, some selection based on e.g. available services is needed?

network is really too generic, probably I should change it to mdns? In any case the meaningful part is that the protocol field is used to select the upload recipe. Let's look the example from the RFC, before the pluggable discovery we have:

[in boards.txt:]
myboard.name=MyBoard
myboard.upload.tool=bossac

[in platform.txt]
tools.bossac.upload.pattern=.......

bossac is the recipe-identifier that tells which recipe from platform.txt we must use for the upload, in general the myboard.upload.tool=RECIPEID is a pointer to the tools.RECIPEID.upload.pattern in platform.txt. With the pluggable discovery we must change it this way:

[in boards.txt:]
myboard.name=MyBoard
# Upload recipes
myboard.upload.tool.serial=bossac
myboard.upload.tool.network=arduino_ota

[in platform.txt]
tools.bossac.upload.pattern=...(this one is used for serial protocol)...

tools.arduino_ota.upload.pattern=...(this one is used for network protocol)...

this means that each board can have his own particular way to upload via network for example:

[in boards.txt:]
linuxboard.name=A board running linux
linuxboard.upload.tool.network=scp

anotherboard.name=A board with a microcontroller + wifi module
anotherboard.upload.tool.network=arduino_ota

even if the discovery is the same we may run different recipes for upload, for example the linuxboard run scp to upload the binary while anotherboard run an ota client that transfers the binary via TCP. The discovery should provide just the port address+protocol and some meaningful properties, it's a duty of the platform to decide how to use them.

How does all this relate to upload using programmer? In a way, I guess a programmer could be treated as a board by itself, with its own prefs (e.g. vid/pid) matching against available ports, etc. but I haven't thought this through.

The programmer works exactly as before: the platform provides a programmers.txt file with a list of definitions that overrides part of the boards.txt definitions (in particular the upload.tool.* which allows to use a different recipe for uploading).

matthijskooijman commented 3 years ago

@cmaglie, Thanks for your extensive followup, here's some more responses to that (no new things, I promise :-p)

6. how about when some prefs are indexed multiples time and some are not?

You misunderstood what I meant by this. I meant, what if I have:

myboard.pears.0=20
myboard.pears.1=30
myboard.apples=40

Will this match e.g. pears=20, apples=40 (or any other combination involving pears and apples). I suspect the answer is "no", which is fine, but that might need to be specified somewhere (maybe not even here, though).

7. and possibly we should automatically convert "old-style" vid/pid definitions into the new boardname.port_match.* so old platforms will continue to work on the pluggable discovery.

Yeah, I suspect that currently it's almost exclusively the vid/pid values that are used, so autoconverting just those probably makes sense (provided there is no port_match value at all, if there is, of course no autoconverting anymore). I guess that if pluggable discovery is already in use, there could be other values than vid/pid in use currently, but I have no idea if there are.

10. Yes, that may be a better alternative, but thinking on this a bit more, we are not talking about preferences but properties, so I'm going to rename the fields as follows:

prefs -> portProperties

identificationPrefs -> boardIdentificationProperties

Right, sound good. Note that I made a similar suggestion to just use attributes in a later https://github.com/arduino/tooling-rfcs/pull/2#issuecomment-826057693, so seems we're of the same mind there. Note that in that same comment I also wondered if we need identificationPrefs / boardIdentificationProperties at all, though.

port identification In theory, the user should be able to select a port by specifying the pair (protocol, address), so each port should have a unique identifier made by the pair (protocol, address) and this should be true across all ports of all discoveries.

So this means identity involves address? Does that mean that a discovery can return the same address twice, but with different protocols? I think not, currently, unless the delete event necessarily deletes both ports with the same address.

and this should be true across all ports of all discoveries. BTW this is not guaranteed by the specification because it requires some sort of agreement between all the discoveries (and consequently between all the developers... this is not going to happen).

Yeah, this is why I thought of only applying discoveries that are defined by the board's platform (plus builtins, which are then implicitly or explicitly included by the platform). That would solve at least some of the overlap issues between different platforms that use different address and property conventions?

Do we need a way to correlate multiple ports belonging to the same board instance? This is a "nice to have" I've thought in the early draft, I liked the idea that the GUI may show the ports belonging to the same board nicely grouped together.

Agreed that this is probably more trouble than it's worth. However, I do think there's two more things were port grouping by board might be needed/helpful:

To find ports after a (e.g. 1200bps) reset that re-enumerates ports (see my later comment above). However, as suggested there, you might actually want to define a way to match the post-reset port from the pre-upload port, rather than just assuming that any port connected to the same board is ok (which it might not be if a board has multiple ports).
To find a serial monitor port corresponding to the board you are uploading to, in case that the upload port and monitor port are different. But maybe then boards.txt could, in a similar way to the 1200bps reset thing, somehow provide rules to find the serial monitor port based on the upload port properties, which is probably more powerful than grouping by board and making assumptions.

more port metadata Exactly, you can add as many metadata as you want in the prefs section. About the "guidance" we may provide some guidelines, but I'm not very optimist on the outcome, if the pluggable discovery will gain popularity we will probably see all kind of variants popping up, regardless of the guidelines. Also who is going to provide these guidelines?

Well, one starting point would be to write up a few properties to be returned by the builtin discovery tools (I'm assuming there will be a few builtins, at least serial, maybe DFU, maybe a generic USB one using vid/pid matching or even USB descriptor matching), with what they are expected to contain, in order to ensure that some common properties (most obvious and widespread are probably a few USB properties) are the same across builtin tools, which could also be a recommendation for third party tools. Anyone is free to pick their own convention, of course, but I suspect that third-party tools mimicking the builtin ones is convenient for those third parties as well.

/dev/ttyACM0 vs ttyACM0 There is a specific paragraph about that:

For backward compatibility we will keep a copy of the address also in {serial.port} and in the specific case of a protocol=serial we will populate also {serial.port.file}.

Yes, but I wasn't talking about compatibility with existing recipes, I was asking: How should a modern recipe (not intended to rely on this backward compatible {serial.port.file}) access the ttyACM0 version. Most likely this is something for the serial discovery tool, which should return name: ttyAMC0 or so in the prefs/uploadProperties, so not something to change in this RFC (except maybe in the examples).

network too generic? even if the discovery is the same we may run different recipes for upload, for example the linuxboard run scp to upload the binary while anotherboard run an ota client that transfers the binary via TCP.

Yup, this is true, but this does rely on a generic network discovery tool to provide sufficient identification properties so each board can match against only network ports that it can actually use. OTOH, if this is not the case, a platform can always provide a custom discovery tool that discovers network hosts using additional filtering (based on port probing, mdns service records, UPNP, etc.), give that a new protocol name and use that. So, I guess a generic network name is actually appropriate, assuming that it just tries to discover as much network hosts on the local network as possible, without any filtering (but with providing as much info as possible, of course).

The discovery should provide just the port address+protocol and some meaningful properties, it's a duty of the platform to decide how to use them.

Yup.

upload using programmer The programmer works exactly as before: the platform provides a programmers.txt file with a list of definitions that overrides part of the boards.txt definitions (in particular the upload.tool.* which allows to use a different recipe for uploading).

Yeah, but how would port matching and selection work for programmers? Currently, the selected board serial port also selects the programmer port to use, for serial programmers, while non-serial-based (e.g. native USB) programmers typically ignore the port selection and just select the first available device based on e.g. vidpid matching internal to the tool (which can be problematic when you have multiple programmers attached, which is not so unlikely when using boards like the STM32 Nucleo/discovery, which have an ST-link programmer integrated, or the zero which has a CMSIS-DAP programmer integrated IIRC). It would be nice if this generalization of the port concept could also be applied to programmers (which would then need to define their own port_match properties, I guess?).

cmaglie commented 3 years ago

Will this match e.g. pears=20, apples=40 (or any other combination involving pears and apples). I suspect the answer is "no", which is fine, but that might need to be specified somewhere (maybe not even here, though).

Ahh, of course not, it will not match. I'll try to specify better in the RFC

Note that in that same comment I also wondered if we need identificationPrefs / boardIdentificationProperties at all, though.

Right, once the identification properties are clearly prefixed in the boards.txt definition, the only reason to keep identificationPrefs is that these properties refer to a board while the prefs refers to a port, but as you said, this is a really thin line, and thinking about this a bit more it's probably better to leave this choice to the platform. So I'll implement this change in the RFC (removing the identificationPrefs).

Does that mean that a discovery can return the same address twice, but with different protocols? I think not, currently, unless the remove event necessarily deletes both ports with the same address.

instead yes! (like, just to say, a network host that is called exactly COM1... very unlikely to happen but still possible) As I said the remove event must report the protocol and the address, this is an error in the RFC that I must fix.

this is why I thought of only applying discoveries that are defined by the board's platform (plus builtins, which are then implicitly or explicitly included by the platform). That would solve at least some of the overlap issues between different platforms that use different address and property conventions?

So to recap your suggestion is:

the builtin discoveries (serial and network) are shared between all platforms by default
a platform must declare the discoveries used for port discovery
a platform may refer to another discovery from another platform

in particular, I like 3 which should reduce the urge for implementors to create their own discovery and push for reuse/contribute to the existing ones.

That works for me, I'll try to adapt the specification for that.

Well, one starting point would be to write up a few properties to be returned by the builtin discovery tools (I'm assuming there will be a few builtins, at least serial, maybe DFU, maybe a generic USB one using vid/pid matching or even USB descriptor matching), with what they are expected to contain, in order to ensure that some common properties (most obvious and widespread are probably a few USB properties) are the same across builtin tools, which could also be a recommendation for third party tools. Anyone is free to pick their own convention, of course, but I suspect that third-party tools mimicking the builtin ones is convenient for those third parties as well.

For now, the only discovery we will provide is serial and network. Having a good starting implementation for the most common discoveries right from the start will surely help to make the ecosystem more coherent in the long run. But time is a factor... writing a good-quality cross-platform discovery is quite hard, especially when running in START_SYNC mode, for DFU and generic USB we will probably need a big help from the community to make it happen.

Yes, but I wasn't talking about compatibility with existing recipes, I was asking: How should a modern recipe (not intended to rely on this backward compatible {serial.port.file}) access the ttyACM0 version. Most likely this is something for the serial discovery tool, which should return name: ttyAMC0 or so in the prefs/uploadProperties, so not something to change in this RFC (except maybe in the examples).

If you tell me I would say: you must fix it in the upload tool (to make it accept /dev/ttyACM0 instead of ttyACM0), but yes, adding another property in the prefs set is a no-brainer and solve a lot of problems with minimal effort. So 👍🏼 I'm going to add it to the discovery and change the example in the RFC.

Yeah, but how would port matching and selection work for programmers? Currently, the selected board serial port also selects the programmer port to use, for serial programmers, while non-serial-based (e.g. native USB) programmers typically ignore the port selection and just select the first available device based on e.g. vidpid matching internal to the tool (which can be problematic when you have multiple programmers attached, which is not so unlikely when using boards like the STM32 Nucleo/discovery, which have an ST-link programmer integrated, or the zero which has a CMSIS-DAP programmer integrated IIRC). It would be nice if this generalization of the port concept could also be applied to programmers (which would then need to define their own port_match properties, I guess?).

mmm not sure I get this one. At the moment the port selection is independent of the action you do later. Just to clarify, the "action" can be:

upload
upload with programmer
burn bootloader

what changes between these actions is the recipe that is run to perform the action itself, to be precise, the recipe used are respectively pointed by:

board.upload.tool=RECIPEID
board.program.tool=RECIPEID
board.bootloader.tool=RECIPEID

these definitions may be overridden by selecting a programmer from the programmer.txt so, for example, if you need a different command line for the ST-Link you can write:

[in programmers.txt]
stlink.name=ST-Link
stlink.program.tool=openocd-stlink

For extra clarity let me try to write an example that combines everything together:

[in boards.txt]
board1.name=Board One
board1.upload.tool.serial=bossac
board1.upload.tool.network=arduino_ota
board1.bootloader.tool.serial=openocd
board1.program.tool.serial=openocd
board1.program.tool.dfu=openocd-dfu

[in platform.txt]
tools.bossac.upload.pattern=...(serial upload)...

tools.arduino_ota.upload.pattern=...(network upload)...

tools.openocd.program.pattern=...(serial upload with programmer)...

tools.openocd.bootloader.pattern=...(serial burn bootloader)...

tools.openocd-dfu.program.pattern=...(dfu upload with programmer)...

anyway in each recipe you will always have the port properties available under the prefix upload.* (so {upload.address}, {upload.port.vid}, etc.). So it seems to me that there is enough expressivity to handle all the cases, but maybe not? could you give an example that may not fit here?

BTW great feedback 👍🏼 you're basically stress testing the specification that is really a good thing.

I'll go ahead and amend the RFC with the suggestion above.

PaulStoffregen commented 3 years ago

About the communication from IDE / CLI to the discovery utility, could we add a required (sent by the IDE) initial message to inform the discovery utility which software & version is used? And if that initial ID message is not transmitted, the discovery utility would assume it is being used by Arduino IDE 1.8.13. If the message is present but unparseable or an unknown version, the discovery utility would assume this RFC.

As an implementer of a 3rd party discovery utility, of course my main wish is for the protocol to be perfectly stable and forever backwards compatible. But if reality does not turn out to be so kind, knowing which IDE and which version will be receiving my JSON output would let me transmit "prefs" when used with IDE 1.8.13 and "portProperties" when working with 1.8.14 or 2.0, and perhaps tailor the JSON as needed when the protocol is modified in the future, or as future IDEs might use of the same protocol in slightly different ways.

matthijskooijman commented 3 years ago

Regarding the port_match proposal, I would suggeste replacing port_match (which I did not really like much anyway), maybe use upload_port? e.g. myboard.upload_port.0.vid=0x1234? This makes a bit more explicit what port is matched, and also (in the future) allows adding e.g. myboard.monitor_port.0.vid=0x1234 to apply the same matching algorithm to matching the serial monitor port, in case the upload port and monitor port are different.

instead yes! (like, just to say, a network host that is called exactly COM1... very unlikely to happen but still possible) As I said the remove event must report the protocol and the address, this is an error in the RFC that I must fix.

Right, good motivating example. Agreed.

the builtin discoveries (serial and network) are shared between all platforms by default

If we add a way to include discoveries from other platforms, it might even be nice to also apply this to builtin discoveries, so a platform must explicitly include the builtin discoveries it needs. This prevents "polluting" the port namespace with discoveries that are not needed at all (i.e. AVR has no need for network, I think), and also makes it easier to (maybe temporarily) replace a builtin discovery (i.e. using a customized version of the serial discovery that adds extra properties) without having to use a different upload protocol. There is the question of backward compatibility, though, how to distinguish between new platforms that need no builtin discoveries and old platforms that rely on the current behavior of implicitly making serial (and I guess also network?) ports available? Maybe this can be done based on whether boards use the upload.tool or upload.PROTOCOL.tool syntax? (this is where platform.txt-spec-versioning suggested in https://github.com/arduino/arduino-cli/issues/985#issuecomment-694205169 would have been helpful)

If you tell me I would say: you must fix it in the upload tool (to make it accept /dev/ttyACM0 instead of ttyACM0),

Yeah, fixing this in the tool would be obvious, but when using third-party tools, this is not always easy (and adding a wrapper script is always possible, but extra effort because of cross-platform).

but yes, adding another property in the prefs set is a no-brainer and solve a lot of problems with minimal effort. So 👍🏼 I'm going to add it to the discovery and change the example in the RFC.

Tnx

for DFU and generic USB we will probably need a big help from the community to make it happen.

Yeah, I feel the limited time thing all too well. I would be motivated to give these a stab, except that I really shouldn't be taking on new big projects until I finish a lot of my existing projects first... But just serial and network would indeed be a good start, then maybe third-party cores could implement others, which could be imported as builtins at some point later, when they're proven.

mmm not sure I get this one. At the moment the port selection is independent of the action you do later.

I think this might actually be a problem, since this approach was chosen when regular uploads and programmer uploads could only use serial ports, so sharing the same selection was pragmatic (not conceptually correct, and also impractical in same cases, i.e. https://github.com/arduino/Arduino/issues/5554). However, when expanding beyond serial ports, this becomes problematic.

Consider you have two Unos attached, and each has an external USB-ASP programmer attached. So, you select the "Uno" board to program, which offers two ports for uploading: The ACM serial ports of both Unos. However, when I then select "Upload using programmer", neither of these ports is applicable, instead I should be offered both USB-ASP devices.

This is currently "solved" by letting programmers like the USB-ASP simply ignore the selected (serial) port and just pick the first USB-ASP device they see (based on vid/pid filtering internal to avrdude), but this is a hack at best, and something that would be solvable within the context of this discussion.

What I think would be needed, is that:

The port selection is specific to the action you intend to do later (regular upload or upload using programmer).
For regular upload, the port matching/filtering happens based on the selected board, for programmer upload, it happens based on the selected programmer.
Programmers in programmers.txt can specify a port_match just like boards, to select the ports that are valid for this programmer.

Also, you give this example:

[in boards.txt]
board1.name=Board One
board1.upload.tool.serial=bossac
board1.upload.tool.network=arduino_ota
board1.bootloader.tool.serial=openocd
board1.program.tool.serial=openocd
board1.program.tool.dfu=openocd-dfu

AFAICS, the board1.program.tool.xxx lines are an addition, right, these are not currently used? I wonder if those make sense, though, since this now seems to require that you list available programmer upload tools for each board separately? Looking at the AVR core, it defines 15 different programmers already, each of which define their own tool (all 15 use avrdude, but these could be many different ones as well). So I'm unsure how these lines in your examples are intended.

One additional thing to note: There is currently not any way to specify which programmers are applicable to which boards. Originally, all programmers in all platforms where made available for all boards, including completely incompatible combinations. Since https://github.com/arduino/Arduino/pull/9900, only programmers from the same platform as the selected board (and maybe a referenced platform). In practice this works well enough, since typically all boards in a single platform use the same programmer technology (i.e. all ISP programmers work on all AVR boards). It might be useful to make this more explicit, i.e. for platforms where multiple programmer protocols exist and not all boards support all protocols, but maybe this can be left undefined (and you just rely on the user to know which programmer they have and for which boards it is suitable).

BTW great feedback 👍🏼 you're basically stress testing the specification that is really a good thing.

Thanks, that helps motivate me to free up some time for this (which I consider an import step in making the Arduino ecosystem a bit more generic in terms of supported devices).

@PaulStoffregen said:

About the communication from IDE / CLI to the discovery utility, could we add a required (sent by the IDE) initial message to inform the discovery utility which software & version is used?

I agree that this would be useful, this is something that should be part of any protocol, I think (weird that I did not realize this in my initial review :-p). However, I wonder if the software and version itself is sufficient, or whether a protocol version should be included instead? Checks for software versions can be ok, but are not really scalable. Consider for example the -DARDUINO used in compilation now, which specifies the IDE version, but is essentially meaningless when using arduino-cli, so then you end up using a "Compatible with version"-type of thing. This is partly mitigated by specifying the software and version, but even then, version checks in tools become outdated when new tools are introduced.

In that light, I would suggest adding a protocol version (probably in addition to software+version, which can also be useful to account for specific quirks). The protocol that is currently in use could be v1, which is assumed when no protocol is advertised, the RFC under discussion could then be v2.

PaulStoffregen commented 3 years ago

A protocol version number is a good idea for the future.

As the maintainer of a board package, I also want info about which software and which version, similar to the User-Agent info web browsers transmit. The anticipated use case is crafting a workaround for issues in specific versions of the IDE. It is messy, much like how many websites have special code for Internet Explorer. I hope to never need this. But if any particular IDE version needs a special workaround, I will go to great lengths to provide a smooth experience to my users. Alternative ways to deduce the IDE version are much worse.

cmaglie commented 3 years ago

I've updated the PR and applied most of the suggestions in this thread, please give it a look, and if there are comments or changes you can use the in-line comment function so it's easier to track down.

About the remaining concerns:

Consider you have two Unos attached, and each has an external USB-ASP programmer attached. So, you select the "Uno" board to program, which offers two ports for uploading: The ACM serial ports of both Unos. However, when I then select "Upload using programmer", neither of these ports is applicable, instead I should be offered both USB-ASP devices.

Actually, the board selection is independent of the port selection. If you select the "Arduino Uno" board you may select the ports that are detected as Uno board (so the two ACMs) or you can always "force" the selection to one of the two ports of the USB-ASP devices.

This is currently "solved" by letting programmers like the USB-ASP simply ignore the selected (serial) port and just pick the first USB-ASP device they see (based on vid/pid filtering internal to avrdude), but this is a hack at best, and something that would be solvable within the context of this discussion.

Nope, currently, there are programmers that require a port and there is no problem with that. Actually, we had to fix the opposite problem: you were forced to select a port even if not used!

What I think would be needed, is that:

The port selection is specific to the action you intend to do later (regular upload or upload using programmer).

For regular upload, the port matching/filtering happens based on the selected board, for programmer upload, it happens based on the selected programmer.

As I said you're not constrained in any way on port selection and no filtering on ports is happening.

Programmers in programmers.txt can specify a port_match just like boards, to select the ports that are valid for this programmer.

This one instead is nice, adding a port_match to programmers is cool we can use it to detect programmers too. 👍🏼

Also, you give this example:
[in boards.txt]
board1.name=Board One 
board1.upload.tool.serial=bossac
board1.upload.tool.network=arduino_ota
board1.bootloader.tool.serial=openocd
board1.program.tool.serial=openocd
board1.program.tool.dfu=openocd-dfu
AFAICS, the board1.program.tool.xxx lines are an addition, right, these are not currently used? I wonder if those make sense, though, since this now seems to require that you list available programmer upload tools for each board separately? Looking at the AVR core, it defines 15 different programmers already, each of which define their own tool (all 15 use avrdude, but these could be many different ones as well). So I'm unsure how these lines in your examples are intended.

Ok let me simplify it, consider this:

[boards.txt]
board1.name=Board One 
board1.upload.tool.serial=bossac
board1.upload.tool.network=arduino_ota

[programmers.txt]
prog1.name=Prog1
prog1.bootloader.tool.serial=openocd
prog1.program.tool.serial=openocd

prog2.name=A-DFU Programmer
prog2.program.tool.dfu=openocd-dfu

in this case, when you select, say, "Prog1" the two directives:

bootloader.tool.serial=openocd
program.tool.serial=openocd

are added to the board (whatever the board you selected!) when you do "Upload with programmer".

So to recap, if you select "Board One" the full configuration used during a normal "Upload" is:

board1.name=Board One 
board1.upload.tool.serial=bossac
board1.upload.tool.network=arduino_ota

and board1.upload.tool.PROTOCOL is used for upload.

If you select the "Board One" board, the "Prog1" programmer, and you do "Upload with programmer" then the config used is:

board1.name=Board One 
board1.upload.tool.serial=bossac
board1.upload.tool.network=arduino_ota
board1.bootloader.tool.serial=openocd       <---- Those are inerithed from Prog1
board1.program.tool.serial=openocd          <---- Those are inerithed from Prog1

and the program.tool.PROTOCOL recipe is used, the rest is just leftover from the rest configuration that is ignored.

Hope this covers all your doubt :-)

Instead, while thinking about the above, there is another case that comes to my mind: there are boards that do not require port selection! I think that in this case, we may reserve a dummy protocol name, for example no-port, to use for this specific purpose, so a board can define a rule like:

upload.tool.no-port=openocd-autodetect

and the same goes for programmers.

One additional thing to note: There is currently not any way to specify which programmers are applicable to which boards. Originally, all programmers in all platforms where made available for all boards, including completely incompatible combinations. Since arduino/Arduino#9900, only programmers from the same platform as the selected board (and maybe a referenced platform). In practice this works well enough, since typically all boards in a single platform use the same programmer technology (i.e. all ISP programmers work on all AVR boards). It might be useful to make this more explicit, i.e. for platforms where multiple programmer protocols exist and not all boards support all protocols, but maybe this can be left undefined (and you just rely on the user to know which programmer they have and for which boards it is suitable).

This would be nice to have, but is out of the scope of this RFC and, IMHO, the benefit to implement this particular feature would not be worth the effort.

matthijskooijman commented 3 years ago

I've updated the PR and applied most of the suggestions in this thread, please give it a look, and if there are comments or changes you can use the in-line comment function so it's easier to track down.

Thanks, willdo.

Actually, the board selection is independent of the port selection. If you select the "Arduino Uno" board you may select the ports that are detected as Uno board (so the two ACMs) or you can always "force" the selection to one of the two ports of the USB-ASP devices.

Hm, this might actually be problematic. Consider USB-only devices, such as the USB-ASP programmer, which does not expose a serial port. As I said, currently, these just ignore the port selection entirely and just use the first connected one, which can be problematic. To allow selecting between multiple of these connected programmers, we would need some kind of "USB" discovery that lists USB devices. However, when implemented in the obvious way, this would produce a list of all USB devices connected to the system, flooding the port selection menu with tens of ports, which is almost certainly going to confuse people.

So it seems that this needs some kind of filtering. But it does not seem reasonable for the USB discovery tool to know which devices are (not) relevant. Maybe this could be passed as commandline arguments, but that would then still duplicate information (i.e. you would have the vidpid in both the programmers.txt USB-ASP definition and in the platform.txt discovery tool pattern).

I can imagine some ways to fix this:

When defining a discovery tool, add a require_port_match=true or so, to discard any ports that are not matched by the current board/programmer.
Letting the discovery tool set require_port_match=true in the returned JSON. This allows per-port definition of this setting, but also puts more smarts into the discovery tool that you might not want to put there.
When showing a list of ports, make a more clear distinction between "matched" (aka recommended) and "other" ports. For this, I can imagine two types of "recommended" ports: when the board/programmer specifies upload_port entries, the ports that match, if no upload_port entries are given, recommend all ports that have a protocol for which the current board/programmer has a tool defined. These recommended ports can be shown directly in the ports menu, with the rest of the ports shown in a "other ports" submenu or so. Note that boards without upload_port entries would be typical ESP boards that use a generic vidpid (or the pro mini that uses an external USB-to-serial adapter), so they cannot match the first category, but should still only offer serial ports, not all ports in the first step, I think.

Of these, option 1. and 2. can be combined, and maybe a combination of all three could also work (where 1./2. do not cause ports to be ignored entirely, but demote them to the "other ports" submenu).

Note that I talk about programmers like the USB-ASP above, but the same could apply to regular bootloader uploads too. Though I can't think of a specific example off-hand, since everything that comes close always has a serial port anyway (Leonardo) even when it does not use it for the upload itself (Zero/EDBG), or probably needs a specific discovery tool (DFU). Maybe STM32 with HID bootloader after a forced reset-to-bootloader (which I think does not expose a serial port), just a HID device.

Writing this, I realized two more things:

There is currently the problem that e.g. ESP devices cannot specify a vidpid, because there would be a lot of them with the same (generic FTDI) vidpid, and then a generic USB-to-serial adapter would be incorrectly identified as an ESP. This makes me realize that this port matching potentially serves two purposes: 1) To identify the board attached to a given port, and 2) decide whether a given port could be valid for upload. For most boards, the port matching is usable for both purposes, but for some boards like ESP with generic FTDI vidpid, only purpose 2 applies. So I wonder, is this something that we should make explicit, by e.g. saying esp8266.upload_port.0.generic=true? (maybe using a better/more explicit name than generic)? Even if we would only use the matching for purpose 1 at first (like the IDE does now), this allows expressing both purposes and allows extending our tools later for purpose 2 as well.
Option 3 above (splitting into "recommended" and "other" ports) probably requires splitting the "board port" and "programmer port" selection in the IDE. This currently just uses the same selection, which sort of worked because only serial ports were offered originally (but is already annoying having to switch ports when using both programmer upload on one port and serial monitor on another port). But with option 3 above, imagine that e.g. the STK500 programmer is selected from a previous session (which is a generic serial programmer, so no upload_port entries. This would mean that all serial ports are now "recommended", potentially confusing a user that just wants to do a bootloader upload and does not realize that the programmer selected also influences the ports listed.

Hope this covers all your doubt :-)

Yup, this clarifies what you meant, and looks good.

Instead, while thinking about the above, there is another case that comes to my mind: there are boards that do not require port selection! I think that in this case, we may reserve a dummy protocol name, for example no-port, to use for this specific purpose, so a board can define a rule like:

I suspect that in most cases, such boards could be defined to allow proper port selection (provided the upload tool allows identifiying the target in some way), but this might require custom discovery tools to provide the right port listing. I also suspect that in a lot of cases, going through the hoops to make port selection work for those boards is not worth it, so having an explicit no-port protocol seems like a great idea (even if just for interim while a proper discovery setup is being built for a particular board/upload method).

I wonder if no-port is the right name here, though. It covers what you want to express, but it's a bit a weird in the list of "protocols". I don't have a better suggestion right away, though. In any case, maybe it should be no_port, since other entries also use _ rather than -?

This would be nice to have, but is out of the scope of this RFC and, IMHO, the benefit to implement this particular feature would not be worth the effort.

Agreed on both points.

PaulStoffregen commented 3 years ago

Regarding the variables which may be used in upload recipes, currently we have this:

The selected port address will be provided in the variable {upload.address}. Other metadata provided by the discover in the prefs section will be provided in the {upload.port.*} variables.

This needs a couple updates. "prefs" was renamed "properties". While this text needs the new name, the sample JSON has "properties". But it also has "identificationPrefs". Wasn't that removed?

Strictly according to this spec, the only variables would be "{upload.address}" and several "{upload.port.*}" from the properties list. But the examples show "{upload.protocol}". The spec should be updated to say "{upload.protocol}" is defined for recipes to use.

I want to request "{upload.label}". The use case is for the upload program to be able to show status with the same name as seen in the Ports menu. Status is usually given by stdout/stderr which the Arduino IDE prints to the console panel. An upload tool might also implement a GUI or logging or other ways to report activity to the user. Especially when troubleshooting connectivity problems, even a slight discrepancy in the port name shown to the end user can lead to confusion and misdirected effort.

Please also add "{serial.port.label}" and "{serial.port.protocol}" in the backwards compatibility. Teensy's platform.txt upload recipe uses these from Arduino IDE 1.8.13.

PaulStoffregen commented 3 years ago

Also on backwards compatibility, it would be nice if the spec allowed discovery utilities to transmit the old "prefs" and "identificationPrefs". Perhaps it should require that future IDEs are to silently ignore those?

PaulStoffregen commented 3 years ago

However, I wonder if upload.label is the right name, might be a bit too generic, since it is the port label, not the upload label.

If another naming is to be used, then please consider my request for the {upload.label} variable usable in recipes to be included in whatever naming scheme is chosen. The use case is what matters - an upload tool wants to have access to the same port name as shown to the end user in the IDE, so it can show status without any confusing discrepancy in the port name.

or maybe someone else has a nicer way to fix this?

An alternative might be to exactly replicate all the JSON files under "upload.port" (or whatever name is chosen).

For example, if the discovery utility transmits this JSON:

{
  "eventType": "add",
  "port": {
    "address": "/dev/ttyACM0",
    "label": "ttyACM0",
    "protocol": "serial",
    "protocolLabel": "Serial Port (USB)",
    "properties": {
      "pid": "0x804e",
      "vid": "0x2341",
      "rev": "0x0104",
      "interface": "1",
      "serialNumber": "EBEABFD6514D32364E202020FF10181E"
    },
    "identificationPrefs": {
      "pid": "0x804e",
      "vid": "0x2341"
    }
  }
}

Everything appearing in the JSON output, even if it is unknown or meaningless to the IDE or CLI (in this case, the old identificationPrefs), could become a variable usable in the upload recipe. For the JSON above:

{upload.port.address} = /dev/ttyACM0
{upload.port.label} = ttyACM0
{upload.port.protocol} = serial
{upload.port.protocolLabel} = Serial Port (USB)
{upload.port.properties.pid} = 0x8043
{upload.port.properties.vid} = 0x2341
{upload.port.properties.rev} = 0x0104
{upload.port.properties.interface} = 1
{upload.port.properties.serialNumber} = EBEABFD6514D32364E202020FF10181E
{upload.port.identificationPrefs.pid} = 0x8043
{upload.port.identificationPrefs.vid} = 0x2341

If this spec requires the IDE to copy all the JSON fields, even those it doesn't expect or understand, it could help provide some level of forward compatibility for an IDE to still work with future board packages which might transmit more JSON fields and need them accessible as variables in upload recipes.

While it's probably still too soon to talk much about Pluggable Serial Monitor (or perhaps Pluggable Generic Monitor - providing a way for the board package to include its own GUI-based tool to visualize data the port transmits), if you're concerned about the names, consider these "upload.port" variables will also need to be usable for the recipes used to run pluggable monitor utilities.

Personally, I'm not very concerned about which words & names are chosen. I don't really mind if it's called "upload" and gets used for serial monitor or other non-upload recipes. But if that is a concern, now would be a good time to think of the word(s) you want to use. Again, I am not very concerned with these names. My main focus is on the end user experience.

PaulStoffregen commented 3 years ago

@cmaglie - At the risk of being redundant, the "Upload with Pluggable discovery" section still mentioned "prefs" which was renamed to "properties".

I am hoping you will consider expanding the metadata requirement so that all of the "port" fields are exactly mirrored into corresponding "upload.port" variables which can be used in boards.txt and platform.txt.

For the example JSON:

{
  "eventType": "add",
  "port": {
    "address": "/dev/ttyACM0",
    "label": "ttyACM0",
    "protocol": "serial",
    "protocolLabel": "Serial Port (USB)",
    "properties": {
      "pid": "0x804e",
      "vid": "0x2341",
      "serialNumber": "EBEABFD6514D32364E202020FF10181E",
      "name": "ttyACM0"
    }
  }
}

the variables usable in upload (and in the future, other recipes like pluggable monitors) would become:

{upload.port.address} = /dev/ttyACM0
{upload.port.label} = ttyACM0
{upload.port.protocol} = serial
{upload.port.protocolLabel} = Serial Port (USB)
{upload.port.properties.pid} = 0x804e
{upload.port.properties.vid} = 0x2341
{upload.port.properties.serialNumber} = EBEABFD6514D32364E202020FF10181E
{upload.port.properties.name} = ttyACM0
{serial.port} = /dev/ttyACM0                # for backward compatibility
{serial.port.file} = ttyACM0                # only because protocol=serial

Might also be worthwhile to add a requirement for which characters discovery utilities are allowed to use in the field names. Using a '.' in any field name should be forbidden.

cmaglie commented 3 years ago

@cmaglie - At the risk of being redundant, the "Upload with Pluggable discovery" section still mentioned "prefs" which was renamed to "properties".

Uhm.. I don't see any prefs anymore

I am hoping you will consider expanding the metadata requirement so that all of the "port" fields are exactly mirrored into corresponding "upload.port" variables which can be used in boards.txt and platform.txt.

yes, I'm writing it down right now to see how it looks, give me 10 mins ;-)

cmaglie commented 3 years ago

Uhm.. I don't see any prefs anymore

oh nevermind I found it, I've changed the specification, now all port metadata are copied in the upload preferences. 👍🏼

cmaglie commented 3 years ago

@matthijskooijman @PaulStoffregen I've pushed a bunch of fixes to the RFC. It took quite a while but it turned out very well, I hope I've addressed all the concerns with this revision.

Even if the specification is not 100% complete, I see that it's starting to settle down. I would like to start implementing it from next week, maybe beginning from the protocol implementation that is the part of the RFC that seems more "stable" and leaving the parts that are most discussed, like installation and upload, as last so we can iterate a bit more.

PaulStoffregen commented 3 years ago

Yes, agreed, time for implementation.

Is it still too soon to talk of pluggable serial monitor? Or pluggable visualization monitor?

cmaglie commented 3 years ago

I'm closing this RFC as-is for now, if we found that we need to change something we will release an updated version later after the base implementation is done.

cmaglie commented 3 years ago

Is it still too soon to talk of pluggable serial monitor? Or pluggable visualization monitor?

No, it's not too soon, I'd like to write down something in the coming days. @PaulStoffregen I know that you already implemented a pluggable monitor for the Arduino IDE, I think that you already faced some critical issues. If you have insights or suggestions, may you open a new issue in this repo and write it down so we can start discussing it?

PaulStoffregen commented 3 years ago

If you have insights or suggestions, may you open a new issue in this repo and write it down so we can start discussing it?

Ok, here it is. https://github.com/arduino/tooling-rfcs/issues/4

PaulStoffregen commented 2 years ago

@cmaglie - How would you feel about extending Board Identification to allow JSON properties to match the menu options, so arduino-cli could know more of the FQBN?

VENDOR:ARCHITECTURE:BOARD_ID[:MENU_ID=OPTION_ID[,MENU2_ID=OPTION_ID ...]]

As I understand Board Identification, today arduino-cli can at best report the "VENDOR:ARCHITECTURE:BOARD_ID" portion of FQBN, because the spec only defines board.txt entries for matching the main part of FQBN.

If the Board Identification part of spec were extended slightly to specify boards.txt entries such as

{boardname}.menu.{menuname}.{menuoption}.upload_port.{identifier}=Value {boardname}.menu.{menuname}.{menuoption}.upload_port.#.{identifier}=Value

then a discovery tools capable of detecting which menu options where used could report properties to match those lines. Future arduino-cli could use this property matching to give a more complete FQBN for the detected hardware. Then in an even farther future, the IDE could utilize the more specific FQBN to initialize those menu options when a user clicks the port+board from the toolbar's drop-down list.

cmaglie commented 2 years ago

@cmaglie - How would you feel about extending Board Identification to allow JSON properties to match the menu options, so arduino-cli could know more of the FQBN?

Looks reasonable, I'll add this proposal to my backlog.