dmitrystu / libusb_stm32

Lightweight USB device Stack for STM32 microcontrollers
Apache License 2.0
696 stars 160 forks source link

STM32F411 port can't send more than 64 bytes #135

Closed Eqqm4n closed 2 months ago

Eqqm4n commented 3 months ago

Hello all- I have successfully ported this project over to both the STM32L062 and the STM32F411. However, on the STM32F411 I cannot send data back out to the host if it exceeds CDC_DATA_SZ. I have a loop that iterates over the outgoing data buffer to send CDC_DATA_SZ chunks out, however, after the first successful transmission of CDC_DATA_SZ bytes then line 349 in usbd_stm32f429_otgfs.c always exits with this being true-

    /* no enough space in TX fifo */
    if (len > epi->DTXFSTS) return -1;

So for some reason the TX fifo doesn't get cleared out on successful transmissions. My edit of the loopback code is as such:

static void cdc_loopback(usbd_device *dev, uint8_t event, uint8_t ep) {
    int _t, _i;
    uint32_t tx_errors = 0x00;

    if (fpos <= (sizeof(fifo) - CDC_DATA_SZ)) {
        // WARNING : This code breaks if a packet larger than CDC_DATA_SZ
        //  arrives since we won't correctly account for having to deal 
        //  with split data
        _t = usbd_ep_read(dev, CDC_RXD_EP, &fifo[fpos], CDC_DATA_SZ);
        if (_t > 0) {
            memmove((void *)p_msg_in, &fifo[fpos], _t);
            fpos += _t;
            memmove(&fifo[0], &fifo[_t], fpos - _t);
            fpos -= _t;
            OrcaUSBCallback();
            if (CDC_TX_RDY != 0x00) {
                    _i = 0x00;
                    while (CDC_TX_RDY != 0x00) {
                        _t = usbd_ep_write(dev, CDC_TXD_EP, &p_msg_out[_i], 
                            (CDC_TX_RDY < CDC_DATA_SZ) ? CDC_TX_RDY : CDC_DATA_SZ);
                        if (_t > 0x00) {
                            CDC_TX_RDY -= _t;
                            _i += _t;
                        } // end if
                        else {
                            ++tx_errors;
                            if (tx_errors > 0x10) {
                                CDC_TX_RDY = 0x00;
                                //break;
                            } // end if
                        } // end else
                    } // end while
            } // end if
        }
    }
}

For comparison, here is the similar code I am using for a different project on the L062. On this processor the loop correctly sends out the data regardless of size:

/* CDC loop callback. Both for the Data IN and Data OUT endpoint */
static void cdc_loopback(usbd_device *dev, uint8_t event, uint8_t ep) {
    int _t, _i;
    int32_t msg_size;

    (void)event;
    (void)ep;
    if (fpos <= (sizeof(fifo) - CDC_DATA_SZ)) {
        _t = usbd_ep_read(dev, CDC_RXD_EP, &fifo[fpos], CDC_DATA_SZ);
        if (_t > 0) {
            fpos += _t;
            CDC_Receive_Callback(fifo, _t);
        }
    }
    if (fpos > 0) {
        if (_t > 0) {
            memmove(&fifo[0], &fifo[_t], fpos - _t);
            fpos -= _t;
        }
    }
    if (CDC_TX_RDY != 0x00) {
        CDC_TX_RDY = 0x00;
        msg_size = pull_buffered_data();
        if (msg_size != 0x00) {
            _i = 0x00;
            while (msg_size != 0x00) {
                _t = usbd_ep_write(dev, CDC_TXD_EP, &pBuf[_i], 
                    (msg_size < CDC_DATA_SZ) ? msg_size : CDC_DATA_SZ);
                if (_t > 0x00) {
                    msg_size -= _t;
                    _i += _t;
                } // end if
            } // end while
        } // end if
        else {
            usbd_ep_write(dev, CDC_TXD_EP, &pBuf[0x00], pBuf[0x00]);
        } // end else  
        peripheral_tx_callback();
    } // end if
}

Getting this to work on the L062 I don't recall having to edit anything in the files beyond the loopback code in cdc_loop.c. It could be that I'm simply using the wrong otg file but the git project page indicates that usbd_stm32f429_otgfs.c is correct for the F411. Maybe there is some configuration required for the F411 that was absent in the L062 but either way I'm at a loss as to how to continue to troubleshoot this issue. I would hope that I don't have to manually force the endpoint TX buffer to clear itself out as presumably this happens on its own.

Eqqm4n commented 3 months ago

As a side note, if I do a search on "DTXFSTS" there are no other references to it found in the project other than that line checking it. So presumably this means the hardware itself is supposed to be clearing that out after a successful transmission, but it doesn't appear to do so even though the data was correctly received by the host.

dmitrystu commented 3 months ago

Have a problen with demo code on STM32F411DISCO. Need a time to investigate.

Eqqm4n commented 3 months ago

Have a problem with demo code on STM32F411DISCO. Need a time to investigate.

Hi dmitrystu! Glad to see you are ok, as I had thought you were living in Ukraine. The code was altered to match more of what was working under the L062 chip as follows:

/* CDC loop callback. Both for the Data IN and Data OUT endpoint */
static void cdc_loopback(usbd_device *dev, uint8_t event, uint8_t ep) {
    int _t, _i;

    (void)event;
    (void)ep;
    if (fpos <= (sizeof(fifo) - CDC_DATA_SZ)) {
        _t = usbd_ep_read(dev, CDC_RXD_EP, &fifo[fpos], CDC_DATA_SZ);
        if (_t > 0) {
            memmove((void *)&p_msg_in[0x00], &fifo[fpos], _t);
            OrcaUSBCallback();
            fpos += _t;
        }
    }
    if (fpos > 0) {
        if (_t > 0) {
            memmove(&fifo[0], &fifo[_t], fpos - _t);
            fpos -= _t;
        }
    }
    if (CDC_TX_RDY != 0x00) {
        _i = 0x00;
        while (CDC_TX_RDY != 0x00) {
            _t = usbd_ep_write(dev, CDC_TXD_EP, &p_msg_out[_i], 
                (CDC_TX_RDY < CDC_DATA_SZ) ? CDC_TX_RDY : CDC_DATA_SZ);
            if (_t > 0x00) {
                CDC_TX_RDY -= _t;
                _i += _t;
            } // end if
        } // end while
    } // end if
} // end cdc_loopback

We discovered that the proximate cause is that if the host sends less than 64 bytes, then we run into the issues described above, even though this does not happen running this code on the L062. I'm at a loss as to why this matters since the code is designed to allow usbd_ep_read to return less than CDC_DATA_SZ in size. If the host sends less than 64 bytes with the above code arrangement the F411 consistently gets into a state where its internal TX fifo never clears so ep_write always returns -1.

dmitrystu commented 3 months ago

Now it runs smoothly for me

$ ./linux-serial-test -p /dev/ttyACM0 -s
Linux serial test app
/dev/ttyACM0: count for this session: elapsed=10 sec, rx=5312448, tx=5324800, rx err=0
/dev/ttyACM0: transfer rate rx=531244 cps, 4249952 bps, tx=532480 cps, 4259840 bps.
/dev/ttyACM0: TIOCGICOUNT: ret=0, rx=0, tx=0, frame = 0, overrun = 0, parity = 0, brk = 0, buf_overrun = 0
/dev/ttyACM0: count for this session: elapsed=20 sec, rx=11207616, tx=11223040, rx err=0
...........
/dev/ttyACM0: transfer rate rx=558418 cps, 4467344 bps, tx=558464 cps, 4467712 bps.
/dev/ttyACM0: TIOCGICOUNT: ret=0, rx=0, tx=0, frame = 0, overrun = 0, parity = 0, brk = 0, buf_overrun = 0
/dev/ttyACM0: count for this session: elapsed=300 sec, rx=167568576, tx=167578624, rx err=0

Before, I had something like this.

$ ./linux-serial-test -p /dev/ttyACM0 -s
Linux serial test app
No data received for 2.0s. No data transmitted for 2.0s.
No data received for 3.0s. No data transmitted for 3.0s.
No data received for 4.0s. No data transmitted for 4.0s.
No data received for 5.0s. No data transmitted for 5.0s.
No data received for 6.0s. No data transmitted for 6.0s.
No data received for 7.0s. No data transmitted for 7.0s.
No data received for 8.0s. No data transmitted for 8.0s.
No data received for 9.0s. No data transmitted for 9.0s.
/dev/ttyACM0: count for this session: elapsed=10 sec, rx=524288, tx=540672, rx err=0
/dev/ttyACM0: transfer rate rx=52428 cps, 419424 bps, tx=54067 cps, 432536 bps.
/dev/ttyACM0: TIOCGICOUNT: ret=0, rx=0, tx=0, frame = 0, overrun = 0, parity = 0, brk = 0, buf_overrun = 0
No data received for 10.0s. No data transmitted for 10.0s.
No data received for 11.0s. No data transmitted for 11.0s.
No data received for 12.0s. No data transmitted for 12.0s.
No data received for 13.0s. No data transmitted for 13.0s.
No data received for 14.0s. No data transmitted for 14.0s.
No data received for 15.0s. No data transmitted for 15.0s.
No data received for 16.0s. No data transmitted for 16.0s.
No data received for 17.0s. No data transmitted for 17.0s.
No data received for 18.0s. No data transmitted for 18.0s.
No data received for 19.0s. No data transmitted for 19.0s.
/dev/ttyACM0: count for this session: elapsed=20 sec, rx=524288, tx=540672, rx err=0
/dev/ttyACM0: transfer rate rx=26214 cps, 209712 bps, tx=27033 cps, 216264 bps.
/dev/ttyACM0: TIOCGICOUNT: ret=0, rx=0, tx=0, frame = 0, overrun = 0, parity = 0, brk = 0, buf_overrun = 0
No data received for 20.0s. No data transmitted for 20.0s.
No data received for 21.0s. No data transmitted for 21.0s.
dmitrystu commented 3 months ago

Well, maybe I'm misunderstanding your code a bit. On every TXC or RXC event:

    if (fpos <= (sizeof(fifo) - CDC_DATA_SZ)) {
        _t = usbd_ep_read(dev, CDC_RXD_EP, &fifo[fpos], CDC_DATA_SZ);
        if (_t > 0) {
            memmove((void *)&p_msg_in[0x00], &fifo[fpos], _t);
            OrcaUSBCallback();
            fpos += _t;
        }
    }

Well, if you have enough space in the FIFO, you will read an endpoint into fifo[] at fpos, then move to the output buffer and make a call if successful. Why memove? I It's possible to read data from ep directly to your buffer. What if there is no space? In that case, _t will be uninitialized.

    if (fpos > 0) {
        if (_t > 0) {
            memmove(&fifo[0], &fifo[_t], fpos - _t);
            fpos -= _t;
        }
    }

You are shifting the remaining FIFO data from the last written position to the start. For what purpose? Remember, _t may be uninitialized. This may cause a buffer overflow.

    if (CDC_TX_RDY != 0x00) {
        _i = 0x00;
        while (CDC_TX_RDY != 0x00) {
            _t = usbd_ep_write(dev, CDC_TXD_EP, &p_msg_out[_i], 
                (CDC_TX_RDY < CDC_DATA_SZ) ? CDC_TX_RDY : CDC_DATA_SZ);
            if (_t > 0x00) {
                CDC_TX_RDY -= _t;
                _i += _t;
            } // end if
        } // end while
    } // end if

You start pushing data into the TX endpoint, but only one hardware FIFO buffer of size CDC_DATA_SZ is allocated for the TX endpoint by default. You can set up the TX endpoint with the USB_EPTYPE_DBLBUF option to have a double FIFO. By the way, DIEPINT_XFRC must be cleared on TX completion. This is done inside event_poll.

Eqqm4n commented 3 months ago

Hello- thank you for looking into this. Regarding the first point, we did originally overwrite the stack's buffer with our outgoing message but this seems like a bad idea if we ever want to support incoming streams larger than CDC_DATA_SZ, so it seemed safer to keep everything distinct. Regarding the second point, as a reminder this is the original cdc loopback code from the demo :

/* CDC loop callback. Both for the Data IN and Data OUT endpoint */
static void cdc_loopback(usbd_device *dev, uint8_t event, uint8_t ep) {
    int _t;
    if (fpos <= (sizeof(fifo) - CDC_DATA_SZ)) {
        _t = usbd_ep_read(dev, CDC_RXD_EP, &fifo[fpos], CDC_DATA_SZ);
        if (_t > 0) {
            fpos += _t;
        }
    }
    if (fpos > 0) {
        _t = usbd_ep_write(dev, CDC_TXD_EP, &fifo[0], (fpos < CDC_DATA_SZ) ? fpos : CDC_DATA_SZ);
        if (_t > 0) {
            memmove(&fifo[0], &fifo[_t], fpos - _t);
            fpos -= _t;
        }
    }
}

So we left in the memmove from the original source since the intent appears to be that it is there to support incoming streams larger than CDC_DATA_SZ. If we're never allowing this to happen in our system then it seems logical to simply reset fpos back to zero every time we read data back from the endpoint.

Regarding the third point, so far double-buffering has not been needed for what we are doing. Unless I misunderstood the nature of your comment it sounds like no edits are recommended for that section. Thank you again for making an update, we look forward to checking it out.

dmitrystu commented 2 months ago

Add some changes to the demo for testing purposes to achieve various lengths for bulk IN transfer with sequential data. Tested with WSL and USBIP using linux_serial_test, and no problems were observed. You can look at the packets here.

Eqqm4n commented 2 months ago

We tested the changes from commit a79c8b5 and commit 68d7e8c and our problem with sub-CDC_DATA_SZ sized packets from the host went away, thanks again for responding to this so quickly!