joeycastillo / Sensor-Watch

A board replacement for the classic Casio F-91W wristwatch
Other
1.02k stars 210 forks source link

USB Improvements #344

Closed Hylian closed 3 months ago

Hylian commented 6 months ago

Testing:

Hylian commented 6 months ago

Sorry for the huge patch! USB serial was consistently hanging for me, and one thing led to another while digging into the issue. Since we're not running on an RTOS, TinyUSB seems a bit finicky about how we service it. But this setup seems fairly stable for me. Only tested against a Linux host, not macOS.

wryun commented 6 months ago

As a data point, when I did the 'what happens if we have no buffers' experiment here it mostly seemed to work: https://github.com/joeycastillo/Sensor-Watch/pull/212

The issue was I made it a blocking write (obviously wrong), and I believe this failed if there was something that needed to be done first (e.g. if you pasted a lot into the buffer, it wouldn't let you write anything because something was still sitting in read, so it would block forever).

This makes me wonder if it would be possible to do something like a blocking write with a 'now handle any reads' inside the write. i.e. basically make sure any read-type usb operation would work. I know this seems a bit wrong from the perspective of actually running the watch code properly, but USB mode is quite different... (and doesn't have the display attached).

EDIT: actually, getting my head back into this, https://github.com/micropython/micropython/blob/d68e3b03b1053a6de0c7eb28f5989132c138364b/ports/samd/mphalport.c does not have a separate write buffer and instead uses a timeout.

Hylian commented 6 months ago

Went back in and tested various ways of calling tud read directly inside _read(). They all resulted in dropped inputs, which was the original behavior I was seeing. Not sure on the root cause. It seems happy without the extra write buffering though, so I'm inclined to remove the write buffer.

Edit: IIRC, I messed around with with adding critical sections to prevent TC0/USB interrupt servicing, as well. And then also calling some of the internal tud servicing functions in a loop to keep kicking it. I don't think that went anywhere.

// Test: Using the read API as advertised
// Result: Drops inputs, usually takes sending multiple characters for a 
//         single input to go through.
int _read(int file, char *ptr, int len) {
    (void) file;

    if (ptr == NULL || len <= 0) {
        return -1;
    }

    if (tud_cdc_available() == 0) {
        return -1;
    }
    return tud_cdc_read((void *) ptr, len);
}

// Test: Try only reading a single char in _read()
// Result: Same as above
int _read(int file, char *ptr, int len) {
    (void) file;

    if (ptr == NULL || len <= 0) {
        return -1;
    }

    if (tud_cdc_available() == 0) {
        return -1;
    }

    char ret = tud_cdc_read_char();
    //tud_cdc_read_flush(); // Tested with and without flush.
    if (ret < 0) {
        return -1;
    }
    *ptr = ret;
    return 1;
}

// Test: Read a char at a time in a loop while FIFO has data
// Result: Same as above, maybe drops less frequently?
int _read(int file, char *ptr, int len) {
    (void) file;

    if (ptr == NULL || len <= 0) {
        return -1;
    }

    while (tud_cdc_available()) {
        char ret = tud_cdc_read_char();
        tud_cdc_read_flush();
        if (ret < 0) {
            return -1;
        }
        *ptr = ret;
        return 1;
    }
    return -1;
}

// Test: Circular buffer in cdc_task() as in original PR.
// Result: Works as expected.
Hylian commented 6 months ago

Changes:

Hylian commented 6 months ago

With the new stress command, I evaluated _write() across implementations and parameters:

Results: https://gist.github.com/Hylian/71d6e1c30b70c1a6d23671729d0b2865

Analysis:

With direct tud writes, characters will start dropping after a certain amount of writes. Increasing CFG_TUD_CDC_TX_BUFSIZE or increasing the delay between writes will make it go further before starting to drop characters. Given that the printf() is a blocking call, TinyUSB is only being serviced by the TC0 ISR, and the flush() calls within _write().

With the buffered approach (+ chunked 32 byte writes), writes are extremely consistent, with no dropped characters. However, the number of bytes that you can write in one app loop is clamped by CDC_WRITE_BUF_SZ.

Given this, I think keeping the write buffer is a reasonable option, as it has consistent behavior independent of the user app loop. The tradeoff being the limit on maximum characters per loop. I've adjusted the PR to bump CDC_WRITE_BUF_SZ to 1024, and bumped the TUD buffers back down to 64, as 128 isn't necessary.

EDIT: Alternative approaches would be to rate limit writes, come up with a smarter method of servicing tusb, or figure out if there's a tusb backpressure mechanism we can use (fifo size?). But I don't have a ton of time to dig into those right now, so I'd like to get this merged first.

wryun commented 6 months ago

With the buffered approach (+ chunked 32 byte writes), writes are extremely consistent, with no dropped characters

The way I read the results is that the double-buffered approach drops more characters for the same total buffer size in all situations ;)

True, which characters it drops within a loop are more predictable, but you could get that effect without the double buffering by introducing a counter (i.e. track how many chars you've written in that loop, stop writing at a certain size, then reset this counter at the next loop). I'm not seeing that the buffering adds anything. My instinct would be to prefer the micropython approach (i.e. write directly, but add a bit of a wait so that we can handle it when the only issue is that the code is pushing faster than the baud as opposed to something else blocking our writes).

wryun commented 6 months ago

(very much appreciate you did the tests, though, despite my whinging!)

Hylian commented 6 months ago

Haha, fair enough, I'll stop being lazy and see if I can get some sort of backpressure working.

Hylian commented 6 months ago

Update: I did get writes working fairly well using an exponential backoff delay in _write(), but I then ran into the other issue I was seeing- if you type into the console while it's writing, it can cause the whole stack to crash. I think that's why serializing reads and writes into cdc_task() helped prevent crashes.

wryun commented 6 months ago

Are you blocking indefinitely on the write now? I would backoff for a bit then give up, similar to the micropython approach, which should prevent any freeze-ups.

Of course, if you're seeing some other kind of crash...

(my assumption is that not ever servicing the reads - or some other usb interaction - can lead to the inability to write at all after the buffer is filled... i.e. if we wanted a truly blocking write we'd have to service any usb 'thing' inside the blocking write)

Hylian commented 6 months ago

The issue isn't the write per-se, but the moment a read comes in. I'm not really confident about figuring it out without attaching a debugger at this point, since I can't exactly printf debug over USB. :)

Looking at micropython's implementation for nrf, they actually do something similar: https://github.com/micropython/micropython/blob/9feb0689eeaca5ce88aedcc680f997a3b4d0221c/ports/nrf/drivers/usb/usb_cdc.c#L124

They serialize reads + writes in cdc_task(), and use a circular ring buffer for tx.

Hylian commented 6 months ago

Ok, I think I've come up with a decent solution to handle reads and writes outside the app loop. I set up TC1 and ran cdc_task() there at a lower prio than TC0. Main issue was if you spammed input during large prints, and the read fifo was not read from quickly enough, it would crash. After making tx defer to rx in cdc_task() and adding a delay to _write(), it doesn't crash even in my worst case stress tests.

wryun commented 6 months ago

Nice!

(Re micropython, you linked to the NRF port. If you go to the SAMD one I linked earlier, it appears they don't have an intermediate buffer: https://github.com/micropython/micropython/blob/d68e3b03b1053a6de0c7eb28f5989132c138364b/ports/samd/mphalport.c#L190 )

Hylian commented 6 months ago

Ready for review! After inspecting the micropython code, I decided to make both reads and writes use a circular buffer. I then serviced the buffer in the TC1 timer ISR, at a lower priority than TC0. Read/write flushing is now independent from the app loop. Reads are also handled in the middle of large writes, to not crash the stack. Tuned by calling stress and spamming keyboard inputs, until it stopped crashing.

WesleyAC commented 5 months ago

I don't have a lot of cycles to review this, but given that USB is AFAICT totally broken right now, and this code doesn't really touch anything outside of USB-land, I'm fairly inclined to lean towards just merging this. If anyone has concerns about that, speak up now! (cc @joeycastillo?)

@Hylian I am curious about the \r\n line endings in the output, is there a particular reason for that? If it's just preference, I'd personally prefer that we stay unixy :)

Hylian commented 5 months ago

@WesleyAC Thanks for taking a look!

/r/n is required/idiomatic, as most serial terminals (such as minicom) will not automatically insert a carriage return upon newline. In these clients, without a CR, the cursor will just move down a row without resetting the column position, and text will keep scrolling to the right.

Hylian commented 4 months ago

Ping :)

Is this good to be merged? If there are any specific tests you'd like to see, I can try and run them.

matheusmoreira commented 4 months ago

@WesleyAC

I am curious about the \r\n line endings in the output, is there a particular reason for that? If it's just preference, I'd personally prefer that we stay unixy :)

\r\n is actually the correct escape sequence as per teletypewriter semantics. Basically computing dates back to telegraphs, teleprinters and teletypewriters and everything thinks that they're talking to those old things literally to this day. Imagine an actual typewriter: \r is just like when you push the literal carriage back to make it return to the beginning of the paper margin while \n is like scrolling the paper down while maintaining the carriage position.

Linux kernel's terminal subsystem has options for automatically translating between \r\n and \n for the benefit of applications -- the so called cooked mode. This can be turned off, just like character echoing.

For more details:

TTY demystified

Hylian commented 4 months ago

@matheusmoreira Go for it! The more test time we can get on the patch, the better. :) In particular, I haven't really used it on my wrist much. None of the code should really run when not plugged into USB, but always good to check.

matheusmoreira commented 4 months ago

@Hylian Great, I'll go ahead and merge this into my branch then. I plan to merge as many pull requests as possible before flashing them into my watch for daily use.

This also needs tested-on-hw tag since you have already tested it:

Shell validated on Sensor Watch Blue w/ Linux host

matheusmoreira commented 4 months ago

Am now running this code on the watch. No issues so far.