Hardware flow control for serial interfaces

rtvd commented 12 months ago

Here is my attempt at adding support for hardware flow control in serial interfaces. CTS and RTS can be enabled separately.

maximeborges commented 12 months ago

Sounds like a nice first step. I'm not sure if I have any device ready to test that, did you test it on your side? Ideally we should also move the pins used for CTS and RTS into the UART, like for RX and TX. but I think we might be missing some foundations for doing that properly, I was looking for examples in the other STM HAL, and astonishingly enough, only the L1 HAL supports CTS/RTS. One could take some inspiration from it or from HAL of other manufacturers to implement some type shenanigans for that, but I think this could come as a second PR later on.

rtvd commented 12 months ago

Hi @maximeborges

I have a device (that's why I am interested in adding this feature) but I have a bit of a problem trying the new version of stm32f7xx-hal with it. Strangely, I can build my local version of stm32f7xx-hal for that device but when I build my own project referencing the locally checked out stm32f7xx-hal I get a mysterious error:

error: assigning to `&T` is undefined behavior, consider using an `UnsafeCell`
   --> /home/rtvd/src/ALIEN/stm32f7xx-hal/src/serial.rs:428:22
    |
428 |             unsafe { ptr::write_volatile(&(*USART::ptr()).tdr as *const _ as *mut _, byte) }
    |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: for more information, visit <https://doc.rust-lang.org/book/ch15-05-interior-mutability.html>
    = note: `#[deny(invalid_reference_casting)]` on by default

error: assigning to `&T` is undefined behavior, consider using an `UnsafeCell`
   --> /XXX/stm32f7xx-hal/src/spi.rs:426:29
    |
426 | /                             ptr::write_volatile(
427 | |                                 &self.dr as *const _ as *mut _,
428 | |                                 word,
429 | |                             );
    | |_____________________________^
...
458 | / impl_instance!(
459 | |     pac::SPI1 {
460 | |         regs: (apb2, spi1rst, spi1en),
461 | |         pins: {
...   |
554 | |     }
555 | | );
    | |_- in this macro invocation
    |
    = note: for more information, visit <https://doc.rust-lang.org/book/ch15-05-interior-mutability.html>
    = note: this error originates in the macro `impl_instance` (in Nightly builds, run with -Z macro-backtrace for more info)

I am new to rust and this one looks bewildering. It appears to suggest that I need an unsafe { } block but that code already is in that block:

            // NOTE(unsafe) atomic write to stateless register
            // NOTE(write_volatile) 8-bit write that's not possible through the svd2rust API
            unsafe { ptr::write_volatile(&(*USART::ptr()).tdr as *const _ as *mut _, byte) }
            Ok(())

rtvd commented 12 months ago

I think I understand your point about needing RTS and CTS pins to be declared. After all, there may be another device configured that would also use those pins, which is not OK. So would it be fair to say that RTS and CTS pins should be declared as type parameters but should be optional when passed to the serial interface? If the relevant pin is passed - RTS / CTS is on, otherwise - there is no flow control?

I would also need to add lines similar to these:

impl PinTx<USART1> for gpio::PA9<Alternate<7>> {}
impl PinTx<USART1> for gpio::PB6<Alternate<7>> {}
impl PinTx<USART2> for gpio::PA2<Alternate<7>> {}
impl PinTx<USART2> for gpio::PD5<Alternate<7>> {}

I suppose I can look these up from stm32cubeide. However, I do not quite understand the meaning of numbers in Alternate<xxx>. The source says:

/// Some alternate mode (type state)
pub struct Alternate<const A: u8, Otype = PushPull>(PhantomData<Otype>);

But the meaning of A is not clear at all. Do you know what it is?

maximeborges commented 12 months ago

I have a device (that's why I am interested in adding this feature) but I have a bit of a problem trying the new version of stm32f7xx-hal with it. Strangely, I can build my local version of stm32f7xx-hal for that device but when I build my own project referencing the locally checked out stm32f7xx-hal I get a mysterious error:

Could you make a simple repro case? Also which chip are you using?

maximeborges commented 12 months ago

For reference, this builds fine on my side: https://github.com/maximeborges/stm32f7-demo-project I am using Rust 1.72

maximeborges commented 12 months ago

So would it be fair to say that RTS and CTS pins should be declared as type parameters but should be optional when passed to the serial interface? If the relevant pin is passed - RTS / CTS is on, otherwise - there is no flow control?

Usually people are using some macro to define all possible combinations of pins- with or without CTS/RTS for example in this case. This is a bit tricky to implement if you don't have much experience, but could be a great way to learn macros also (that's basically how I started a few years ago). Some other implementations are using Option but it brings some annoying implementation details, like having to specify a None::<SomeExistingPin> or some other non-sense sometimes (looking at you esp32-rs). I haven't followed much what HAL devs have been doing lately so maybe there is some better/easier solutions. If you have some time I would encourage you to look at how other HAL are doing such implementations; if you don't, we could go with a simple one right now. Since this lib dev hasn't been very active lately, I would say that anything that goes in the right direction is up for grabs.

I suppose I can look these up from stm32cubeide.

What I do is look at the datasheet of the chips in the whole family. Usually they are mostly the same (emphasis on mostly), so I first implement it for the one I'm using in my project, do some basic tests there, then quickly check the other chips if there is some variations. An example of the Alternate Functions definition from STM32F76x datasheet:

However, I do not quite understand the meaning of numbers in Alternate<xxx>. [...] But the meaning of A is not clear at all. Do you know what it is?

A corresponds to the Alter Functions number. In the screenshot above, you can see that USART2_CTS is defined on pin PA0 when it's set to Alternate Function 7.

BryanKadzban commented 12 months ago

The "assigning to &T is undefined behavior" error is new in nightly rust. The problem is that the code is trying to cast from a non-mutable reference to a mutable pointer. If USART::ptr() can return a mutable struct, such that its .tdr is also mutable, and so &mut (USART::ptr()).tdr is valid, that might be one way around this.

(I see this error in the continuous integration test on the PR that I just submitted, as well -- but only in the nightly run.)

BryanKadzban commented 12 months ago

I've figured out more about the new error.

The code (at least in one place) is trying to write a byte to the transmit data register on the usart peripheral. It's trying to get the compiler to emit a byte sized write operation because that's atomic, and the rest of the register needs to be preserved (the bits are all defined as reserved in the reference manual). But the svd2rust generated code only allows 32-but-wide read/modify/write sequences.

And the pointer to the RegisterBlock is const, so doing an offset and making it mutable is still undefined behavior.

It may work to use inline assembly here, or add an API to svd2rust, or find another way to generate the address of the tdr register. I don't think an UnsafeCell / RefCell will help (like the error message says) because the different calls to the ptr() associated fn will get different ref cells, and won't share the ref count.

It would be nice if the bits() method from svd2rust could be made safe for a case like this where the bit field covers 8 or 16 bits, and have the generated code use an 8 or 16 bit write to memory...

rtvd commented 11 months ago

I see how to add CTS and RTS pins and was about to wire them in somehow but after a bit of thinking about it I am not sure which approach is nice.

An USART/UART, depending on its capabilities, can run in one of several modes. Some of those modes do not use anything other than TX and RX. Some use CTS or RTS or both. There is also an RS485 mode which uses RTS pin but calls it "DE". Some UARTs can do IrDA or SmartCard protocol.

The first thought was to add an enum UARTMode which would select between the modes, but not every mode is supported by every UART.

The second though was that each mode can be a trait and it may be possible to assign those traits to different UARTs. That would ensure that the correct modes of operation with the correct pins correspond to appropriate UARTs. However, at some point the mode needs to be used to configure the UART and that has to be generic. Traits are not classes, however. So I cannot declare a generic UARTMode with a method "enable" and have traits which implement it (with their specific implementations of "enable"). Can I?

Or perhaps I can have an enum UARTMode parametrised by UART instance and somehow link mode traits to its values so that those values could exist only for some of UARTs?

maximeborges commented 11 months ago

I think the way the L4 HAL is doing it might be the way to go. This means that one would have to re-configure the UART driver with the right pins to change the mode, but I think that would be the only "safe" way of doing it without messing with the borrow checker. Since it'd be nice for this lib to catch up with the others on so many things already, and particularly with the v1.0 of embedded-hal, I think we can postpone implementing the PINS generic like they did and just go with this PR as is, if that works for you. I'm getting a H7 devboard in the next few days, so I'll focus a bit on it for a few weeks, but I would hopefully learn some nice things from the H7 HAL to improve the F7 one.

maximeborges commented 11 months ago

Oh also, the implementation from the esp32-hal could be a nice source of inspiration: https://github.com/esp-rs/esp-hal/blob/main/esp-hal-common/src/uart.rs

rtvd commented 11 months ago

I think it would be great to get it done properly. The current version of the change is rather limiting.

There is an idea to introduce traits which correspond to different capabilities of UARTs and declare which UARTs implement which traits. The traits would then provide customised constructors as a replacement for the current and generic "new".

Meanwhile, there is something I do not quite understand. The specification for stm32f7xx devices says that the same UART can be used with multiple TX and RX pins. The code matches that. For example, UART4 can use PA0 and PC10 pins as TX, and PA1 and PC11 pins as RX. When I look in STM32 IDE, I can configure only PA1 for RX and PA0 for TX. There is no PC10 and PC11 in sight. So frankly, I do not understand how this is supposed to work. But I do suspect that you cannot, for example, configure UART4 with PA1 for RX and PC10 for TX. It can probably work only for specific pairs: PA0+PA1 and PC10+PC11. The current code, the way I am reading it, would allow to construct UART4 with any combinations of RX and TX pins.

rtvd commented 11 months ago

I've tried a few things but so far it does not work the way I wanted it to.

The general idea was the following:

The modes of operation can be described by an enumeration: Async, Sync, IrDA, etc. and the asynchronous mode can be parameterised by yet another enumeration which defines flow control.
Not all modes of operation are available for every single U(s)ART. So the idea was to hide the constructor of Serial and declare two traits: UART and USART. UART is to be implemented by all Serial interfaces, while USART would be implemented only by USART* ones. The traits would then provide special constructors for different modes of operation.

However, type inference does not work well (no idea why) and the API becomes slightly confusing. An example of it is my attempt at modifying serial_echo.rs. There, an additional import of UART trait is needed to make Serial::new_async_uart_no_flwctl be visible. That is not intuitive. Also, the code actually fails to compile:

error[E0282]: type annotations needed
  --> examples/serial_echo.rs:34:18
   |
34 |     let serial = Serial::new_async_uart_no_flwctl(
   |                  ^^^^^^ cannot infer type for type parameter `PINS` declared on the struct `Serial`

This is not clear to me. Surely it should be able to infer the type as the call passes the pins into the method:

let serial = Serial::new_async_uart_no_flwctl(
        p.USART1,
        (tx, rx),
        &clocks,
        serial::Config {
            // Default to 115_200 bauds
            ..Default::default()
        },
    );

Also, it is not obvious how to properly borrow CTS and RTS pins in case they were provided. The pins won't be "used" for anything but, ideally, the borrow mechanics must be present.

Finally, the idea that any combinations of RX, TX, CTS, and RTS pins are valid bothers me a bit. It feels like only explicitly vetted combinations should be allowed.

rtvd commented 11 months ago

I have added a bit more code. For example, there it a way to configure an UART in IrDA mode.

Please let me know what you think of the approach. Is it horrible or not?

For the time being, I have not updated most of the examples. The only one I touched was serial_echo.

maximeborges commented 11 months ago

The specification for stm32f7xx devices says that the same UART can be used with multiple TX and RX pins. The code matches that. For example, UART4 can use PA0 and PC10 pins as TX, and PA1 and PC11 pins as RX. When I look in STM32 IDE, I can configure only PA1 for RX and PA0 for TX. There is no PC10 and PC11 in sight.

Which MCU were you using when looking for the pins in stm32ide? The datasheet for the stm32f767 seems to say that both pins are available for the peripheral, and from experience, it should just work as expected.

We could expect one or two reference of the family to NOT match that, but I didn't dive into that yet.

But I do suspect that you cannot, for example, configure UART4 with PA1 for RX and PC10 for TX. It can probably work only for specific pairs: PA0+PA1 and PC10+PC11. The current code, the way I am reading it, would allow to construct UART4 with any combinations of RX and TX pins.

As far as I remember, pins are working pretty independent of each others when there is multiple combinations possible like here. If you suspect this to still not be the case, I can try to bring out a devboard and test it.

[...] an additional import of UART trait is needed to make Serial::new_async_uart_no_flwctl be visible. That is not intuitive.

The usual way of dealing with that is to add the traits to prelude.rs. There you can see multiple External traits publicly imported, and if you look into some of the examples, you can see that it is importing all the traits like that:

use stm32f7xx_hal::prelude::*;

maximeborges commented 11 months ago

Please let me know what you think of the approach.

Did you have a look at the STM32L1 implementation? In particular the bit here: https://github.com/stm32-rs/stm32l4xx-hal/blob/master/src/serial.rs#L1154-L1177, and how you can use it during construction here: https://github.com/stm32-rs/stm32l4xx-hal/blob/master/src/serial.rs#L298C22-L308 They are basically defining the capabilities of the peripherals based on the provided PINS during construction, and with the defined const attached to it you can check in the constructor what to do. Also since the provided PINS tuple moves the pins in the peripheral struct, you can make sure that they are still owned by the peripheral. If you want to retrieve them you could call release(). With your current implementation, you are basically dropping the pins in the constructor, and to make use of them elsewhere, you would need to steal() them from the PAC.

rtvd commented 11 months ago

I have stm32f722ze.

Finally I see how it works. In STM32 CubeIDE when I enable UART4 the pins are shown as PA0+PA1. There is no obvious way to change it there. However, if I click on PC10 pin then I can assign it to be UART4_TX. In that case I end up with UART4 configured as PC10+PA1.

rtvd commented 11 months ago

The implementation in STM32L1 looks interesting. I think I understand most of it but I am not sure the solution is optimal either.

Here are my concerns:

Intuitively, the mode of operation of a UART as well as its configuration should determine which pins are required. Here, the tuple of pins determines the mode and part of its configuration.
FLOWCTL, DEM, and HALF_DUPLEX in Pin determine the state but not all combinations are valid. And not all combinations are covered. For example, what if it is RS232 flow control but only CTS pin has to be used?
Suppose we want to support synchronous mode. This would mean that TX, RX, and CK (needs to be added) pins are to be used. But the mode is available only for USARTs, not for UARTs. How would that be done?
With Synchronous mode at least one coud argue that USARTs would have a CK pin while UARTs won't. But what about "SmartCard" and "SmartCard with Card Lock"? They are available only in USARTs and they use only TX pin.

No solution that comes to my mind feels satisfactory so far. The existing one is flexible and provides a bit of rigor, but fails to do what's right with borrowing. Also, the amount of code is non-trivial.

rtvd commented 11 months ago

By the way, I have added this as in-code comments, but should say it in here too: it seems UART8 was missing from the original code. Shall I add it?

maximeborges commented 11 months ago

By the way, I have added this as in-code comments, but should say it in here too: it seems UART8 was missing from the original code. Shall I add it?

Sure. Can you put it in another PR?

rtvd commented 11 months ago

Also, in my already rather large PR there is a controversial change: I have renamed template parameters USART to just U in several places. This was to free up USART for the name of a trait but also to avoid a perception that it has to be specifically USART rather than some kind of UART.

Are you OK with that or not really?

maximeborges commented 11 months ago

Also, in my already rather large PR there is a controversial change: I have renamed template parameters USART to just U in several places. This was to free up USART for the name of a trait but also to avoid a perception that it has to be specifically USART rather than some kind of UART.

Are you OK with that or not really?

Yes all good for me. It's just a generic argument name so nothing critical or API-breaking.

maximeborges commented 11 months ago

The implementation in STM32L1 looks interesting. I think I understand most of it but I am not sure the solution is optimal either.

I agree that it's not optimal; also F7 seems to support more modes. I guess that's also why nobody really tried to implement that properly before, there is a lot of stuff to be implemented ^^'

Intuitively, the mode of operation of a UART as well as its configuration should determine which pins are required. Here, the tuple of pins determines the mode and part of its configuration.

I kinda like the approach from the esp32-hal here, by requiring to provide a tuple of Option<PIN>. This would allow to properly take care of the lifetimes of the pins, and allow more flexible configuration in the different modes. In any case I think we should have specific configuration for each mode, but with the PINS tuple defining part of it already. Basically forcing the peripheral setup to be valid by design at creation, and not being able to fail because of conflicting configuration.

FLOWCTL, DEM, and HALF_DUPLEX in Pin determine the state but not all combinations are valid. And not all combinations are covered. For example, what if it is RS232 flow control but only CTS pin has to be used?

About this approach, we don't have to be limited that by just 3 consts also; we could have RTS and CTS be separated, if it makes sense. I'm all up for flexibility, but the amount of work for implementing it increases with it, so it depends how much time we want to spend for a first implementation.

Suppose we want to support synchronous mode. This would mean that TX, RX, and CK (needs to be added) pins are to be used. But the mode is available only for USARTs, not for UARTs. How would that be done?

I think we could split the UART trait into UART and USART, so we could implement them only on relevant peripherals, i.e. :

pub trait UART<U: Instance, PINS: Pins<U>>;

impl <PINS: Pins<UART4>> UART<UART4, PINS> for Serial<UART4, PINS> {}
impl <PINS: Pins<UART5>> UART<UART5, PINS> for Serial<UART5, PINS> {}
impl <PINS: Pins<UART7>> UART<UART7, PINS> for Serial<UART7, PINS> {}

impl <PINS: Pins<USART1>> UART<USART1, PINS> for Serial<USART1, PINS> {}
impl <PINS: Pins<USART2>> UART<USART2, PINS> for Serial<USART2, PINS> {}
impl <PINS: Pins<USART3>> UART<USART3, PINS> for Serial<USART3, PINS> {}
impl <PINS: Pins<USART6>> UART<USART6, PINS> for Serial<USART6, PINS> {}

becomes:

pub trait UART<U: Instance, PINS: Pins<U>>;
pub trait USART<U: Instance, PINS: Pins<U>>: UART<U, PINS>;

impl <PINS: Pins<UART4>> UART<UART4, PINS> for Serial<UART4, PINS> {}
impl <PINS: Pins<UART5>> UART<UART5, PINS> for Serial<UART5, PINS> {}
impl <PINS: Pins<UART7>> UART<UART7, PINS> for Serial<UART7, PINS> {}
impl <PINS: Pins<USART1>> UART<USART1, PINS> for Serial<USART1, PINS> {}
impl <PINS: Pins<USART2>> UART<USART2, PINS> for Serial<USART2, PINS> {}
impl <PINS: Pins<USART3>> UART<USART3, PINS> for Serial<USART3, PINS> {}
impl <PINS: Pins<USART6>> UART<USART6, PINS> for Serial<USART6, PINS> {}

impl <PINS: Pins<USART1>> USART<USART1, PINS> for Serial<USART1, PINS> {}
impl <PINS: Pins<USART2>> USART<USART2, PINS> for Serial<USART2, PINS> {}
impl <PINS: Pins<USART3>> USART<USART3, PINS> for Serial<USART3, PINS> {}
impl <PINS: Pins<USART6>> USART<USART6, PINS> for Serial<USART6, PINS> {}

(not sure if the supertrait would actually be useful in trait USART<U: Instance, PINS: Pins<U>>: UART<U, PINS>, maybe that would to share some basic setup functions between them or something)

With Synchronous mode at least one coud argue that USARTs would have a CK pin while UARTs won't. But what about "SmartCard" and "SmartCard with Card Lock"? They are available only in USARTs and they use only TX pin.

Since this peripheral is more complex than most of the other UART implementation I've seen, it's a bit hard to imagine what would be the ideal way of settings things up. I don't have much experience with anything other than basic RX/TX and some RTC/CTS. Coming back to this point, maybe the best/easiest way to provide pins would be to have a struct of`Options so one could None the ones they don't need in their specific setup, but I'm not convinced by how robust that would be (i.e. how easy would it be to describe invalid config). Another approach I've seen (not sure for which peripheral nor which HAL that was), was to have a big macro that would define all the pins available for each peripheral, and that would generate a list of all the valid tuples for each mode. Not sure about the details, it's been a few years.

rtvd commented 11 months ago

I'll read through the code of esp32-hal. In the meantime, the more trivial changes are in PRs #205 and #206.

I think they should be followed by yet another trivial change which would bring in CTS, RTS, and CK pin definitions. Although, let's do it one step at a time.

stm32-rs / stm32f7xx-hal

Hardware flow control for serial interfaces #202