leptun commented 2 years ago

Operating System

Windows 10

Board

stm32g474nucleo

Firmware

I was using the uac2_headset example code. I tried using it on a stm32g474nucleo board, but any mcu which uses stm32_fsdev should have this issue. commit cadfcd153e8fce804f38f0e19997c13bf6931a49

What happened ?

An assert is hit at runtime:

dcd_edpt_open 760: ASSERT FAILED

Which basically says that isochronous mode is not implemented in the stm32_fsdev driver.

Is there any interest in implementing the isochronous mode in this driver? Right now the audio class can't work on a large portion of the stm32 devices (basically the cheaper ones, where OTG is not present). I imagine there might be more classes that use this endpoint mode, but this was the only class I needed at the moment. I wish I was able to switch to an f4 mcu (dwc2 driver), but with the current chip shortage, that is not possible. If there is no plan to make isochronous mode work, can there be a note added to these devices so that examples with unimplemented features are not built?

How to reproduce ?

Just run any of the examples which use isochronous transfers on the stm32_fsdev driver. Easiest board to test with might be a blue pill or similar. nucleo g4 boards with a USB shield also work. In my setup I'm using a X-NUCLEO-USBPDM1 TypeC power delivery shield which also connects the data lines to the usb port.

Debug Log

log_iso.txt

Screenshots

No response

hathach commented 2 years ago

Is there any interest in implementing the isochronous mode in this driver?

Yeah, there is the interest but just not enough time and people to implement it.

pigrew commented 2 years ago

In order to implement this, we'd also want to implement DMA in order to support the higher data rates. To perform DMA, we'd want some sort of abstraction layer in the case that DMA is used in other parts of the application code. Also, DMA will be slightly different on the various variants of the STM32, so I'd expect adding DMA support would be a large undertaking.

If you don't think that DMA is necessary, this could be done without too much effort, though I personally don't need ISO endpoints, so I don't plan to write it myself.

HiFiPhile commented 2 years ago

To perform DMA, we'd want some sort of abstraction layer in the case that DMA is used in other parts of the application code.

That's something I'm also want to be implemented.

But I don't think stm32_fsdev has DMA support ?

PanRe commented 2 years ago

For DMA support, we would need the dcd_edpt_xfer_fifo() to get working. I once pushed some untested code implementing that feature. Curiously, right now all of that code is commented out but present in the main branch. All parts are marked with _// TODO support dcd_edpt_xferfifo API. I was not able to test the code because i don't have any hardware. Directly copying data into the USB hardware buffer by use of DMA is not possible also not in the synopsis case. The trick is to copy the data into the FIFO by use of two DMAs (ringbuffer -> linear and wrapped part) and than schedule dcd_edpt_xfer_fifo(). The process of copying the data from the fifo into the hardware buffer is taken care of by my untested code. This way, the DMA stuff is left to the user. Works the same way as for synopsis.

Iktek commented 2 years ago

Hi guys, I Also stumbled over missing isoc support on fsdev. For my current "fist test" I would be fine if there's no dma-support for now, could we split this issue or is it mandatory to support dma in order to get isoc transfers running? I'm willing to help with implementation and test but maybe I'd need someone to discuss my changes as I'm not that deep into fsdev driver and tinyusb stack up to now. I'll first try to comment in that untested stuff and to adopt the changed routines. Is there somebody willing to discuss the "missing bits" in order to get that working?

I try to implement that here: https://github.com/Iktek/tinyusb/pull/new/1249-feature-stm32_fsdev_isocronous_support

With first tests I got my linux kernel acceping some isoc-transfers but for some reason the transfer stops after a while not kowing what is wrong from within the device:

wireshark_stm32_fsdev_iso_out.txt

HiFiPhile commented 2 years ago

Hi @Iktek , I believe DMA is not supported on fsdev after reading reference manual and ST driver, I could be wrong.

leptun commented 2 years ago

@HiFiPhile I think that is correct. The stm32f103rc for example doesn't have any way to connect the DMA to the USB. It also doesn't have an internal dma controller. I also checked the stm32g474re and it's the same in this sense. That one has a much more sophisticated DMA controller, but even that one can't connect to the USB (although I noticed in one of the diagrams there is some extra FIFO, but I didn't read more into that).

As far as I know, there is no stm32 that actually can talk to the DMA1 or DMA2. In all cases where a DMA is involved, the usb block incorporates an extra DMA just for the usb. Also, this internal DMA always appears for USB_OTG_HS (dwc2) interfaces. It rarely appears on USB_OTG_FS (still dwc2, take the stm32h743zi as a counterexample for this). The USB interface is the one that needs the fsdev driver, but that one seems to never incorporate an internal DMA.

Here is an example from the stm32f439zi document where only the USB_OTG_HS gets the DMA:

And here it is for the stm32f103rc, no dma:

And here for stm32g474re, no dma:

Iktek commented 2 years ago

@HiFiPhile @leptun : Mhh. You may be right, but it could also be that m2m dma is still possible as the pma memory can be accessed from within the normal adressing-space. So for my understanding there only needs to be a dma-connection if requests of an periperal need to be made which would not be the case for pma. Nevertheless there should also be a way to implement isocronous support without dma. I'll first try to also add the ff-api support to the driver as already proposed by the "commented-out" code. However maybe one could have a look in my wireshark-dissections if one has an idea why I see that -EXDEV erros and why the linux-host stops the transfer after a short while. Also with dynamic_debug enabled in the linux-kernel I don't see a cause. I'm also wondering why there are 2 iso-descriptors per out-transfer as one 64byte packet would be sufficient for the configured audio-stream (16khz 16bit stereo). Thanks in advance Pascal

Iktek commented 2 years ago

One step back: there seems to be an issue with my usb-host driver -> with another notebook the stream stays alive.

So what I can now see is incoming audio data. Unfortunately what I realize is that only every 2nd tud_audio_read returns 64 byte of data. I can also see in wireshark, that it seems that the host sends out-packets on EP1 only every 2 ms with 2 iso-out descriptors in it which has 64 bytes each. As the frequency of the received audio data is correct It seems that half of the audio data is missing for some reason. The same upper-layer architecture (descriptors, audio-read scheduling, etc...) work on an stm32L4 with dwc2. So it seems that there is still an issue in the dcd_stm32_fsdev. Does s.o. have an idea here? Could that be related to missing double-buffering support or sth.else?

HiFiPhile commented 2 years ago

It seems that half of the audio data is missing for some reason.

Interesting. If you search there is a similar issue for RP2040.

I think you are the first one trying UAC with fsdev. Could you post your descriptors ?

Iktek commented 2 years ago

Sure:

define AUDIO_CTRL_ID_SPK_INPUT_STREAM 0x01

define AUDIO_CTRL_ID_SPK_FUNIT 0x02

define AUDIO_CTRL_ID_SPK_OUTPUT 0x03

define AUDIO_CTRL_ID_MIC_INPUT 0x04

define AUDIO_CTRL_ID_MIC_FUNIT 0x05

define AUDIO_CTRL_ID_MIC_OUTPUT_STREAM 0x06

define AUDIO_CTRL_ID_CLOCK 0x41

define AUDIO_CTRL_ID_CLOCKSEL 0x40

define AUDIO_NUM_INTERFACES 0x03

define AUDIO_NUM_INCHANNELS 0x02

define AUDIO_NUM_OUTCHANNELS 0x02

define TUD_AUDIO_CTRL_TOTAL_LEN (\

        TUD_AUDIO_DESC_CLK_SRC_LEN+\
        TUD_AUDIO_DESC_INPUT_TERM_LEN+\
        TUD_AUDIO_DESC_OUTPUT_TERM_LEN+\
        TUD_AUDIO_DESC_INPUT_TERM_LEN+\
        TUD_AUDIO_DESC_OUTPUT_TERM_LEN)

define TUD_AUDIO_IO_DESC_LEN (\

TUD_AUDIO_DESC_IAD_LEN\

TUD_AUDIO_DESC_STD_AC_LEN\
TUD_AUDIO_DESC_CS_AC_LEN\
TUD_AUDIO_CTRL_TOTAL_LEN\ / Speaker Interface /\
TUD_AUDIO_DESC_STD_AS_INT_LEN\
TUD_AUDIO_DESC_STD_AS_INT_LEN\
TUD_AUDIO_DESC_CS_AS_INT_LEN\
TUD_AUDIO_DESC_TYPE_I_FORMAT_LEN\
TUD_AUDIO_DESC_STD_AS_ISO_EP_LEN\
TUD_AUDIO_DESC_CS_AS_ISO_EP_LEN\
TUD_AUDIO_DESC_STD_AS_ISO_EP_LEN\ / Microphone Interface / \
TUD_AUDIO_DESC_STD_AS_INT_LEN\
TUD_AUDIO_DESC_STD_AS_INT_LEN\
TUD_AUDIO_DESC_CS_AS_INT_LEN\
TUD_AUDIO_DESC_TYPE_I_FORMAT_LEN\
TUD_AUDIO_DESC_STD_AS_ISO_EP_LEN\
TUD_AUDIO_DESC_CS_AS_ISO_EP_LEN)

define TUD_AUDIO_MIC_ONE_CH_DESC_N_AS_INT 1 // Number of AS interfaces

define TUD_AUDIO_IO_DESCRIPTOR(_itfnum, _stridx, _nBytesPerSample, _nBitsUsedPerSample, _epin, _epinsize, _epout, _epoutsize, _epfb) \

/ Standard Interface Association Descriptor (IAD) /\ TUD_AUDIO_DESC_IAD(_itfnum, AUDIO_NUM_INTERFACES, /_stridx/ 0x00),\ / Audio Control Interface / \ / Standard AC Interface Descriptor(4.7.1) /\ TUD_AUDIO_DESC_STD_AC(/_itfnum/ _itfnum, /_nEPs/ 0x00, /_stridx/ _stridx),\ / Class-Specific AC Interface Header Descriptor(4.7.2) /\ TUD_AUDIO_DESC_CS_AC(/_bcdADC/ 0x0200, AUDIO_FUNC_CONVERTER, TUD_AUDIO_CTRL_TOTAL_LEN, /_ctrl/ AUDIO_CS_AS_INTERFACE_CTRL_LATENCY_POS),\ TUD_AUDIO_DESC_CLK_SRC(AUDIO_CTRL_ID_CLOCK, AUDIO_CLOCK_SOURCE_ATT_INT_FIX_CLK, (AUDIO_CTRL_R << AUDIO_CLOCK_SOURCE_CTRL_CLK_FRQ_POS), /_assocTerm/ 0x01, /_stridx/ 0x00),\ / Speaker Terminals / \ TUD_AUDIO_DESC_INPUT_TERM(AUDIO_CTRL_ID_SPK_INPUT_STREAM, AUDIO_TERM_TYPE_USB_STREAMING, AUDIO_CTRL_ID_MIC_OUTPUT_STREAM, AUDIO_CTRL_ID_CLOCK, AUDIO_NUM_INCHANNELS, AUDIO_CHANNEL_CONFIG_FRONT_LEFT | AUDIO_CHANNEL_CONFIG_FRONT_RIGHT, /_idxchannelnames/ 0x00, /_ctrl/ AUDIO_CTRL_R << AUDIO_IN_TERM_CTRL_CONNECTOR_POS, /_stridx/ 0x00),\ TUD_AUDIO_DESC_OUTPUT_TERM(AUDIO_CTRL_ID_SPK_OUTPUT, AUDIO_TERM_TYPE_OUT_GENERIC_SPEAKER, AUDIO_CTRL_ID_SPK_INPUT_STREAM, AUDIO_CTRL_ID_SPK_INPUT_STREAM, AUDIO_CTRL_ID_CLOCK, /_ctrl/ 0x0000, /_stridx/ 0x00),\ / Microphone Terminals / \ TUD_AUDIO_DESC_INPUT_TERM(AUDIO_CTRL_ID_MIC_INPUT, AUDIO_TERM_TYPE_IN_GENERIC_MIC, AUDIO_CTRL_ID_MIC_OUTPUT_STREAM, AUDIO_CTRL_ID_CLOCK, AUDIO_NUM_INCHANNELS, AUDIO_CHANNEL_CONFIG_FRONT_LEFT | AUDIO_CHANNEL_CONFIG_FRONT_RIGHT, /_idxchannelnames/ 0x00, /_ctrl/ AUDIO_CTRL_R << AUDIO_IN_TERM_CTRL_CONNECTOR_POS, /_stridx/ 0x00),\ TUD_AUDIO_DESC_OUTPUT_TERM(AUDIO_CTRL_ID_MIC_OUTPUT_STREAM, AUDIO_TERM_TYPE_USB_STREAMING, AUDIO_CTRL_ID_MIC_INPUT, AUDIO_CTRL_ID_MIC_INPUT, AUDIO_CTRL_ID_CLOCK, /_ctrl/ 0x0000, /_stridx/ 0x00),\ \ / Speaker Interface / \ / Interface 1, Alternate 0 - default alternate setting with 0 bandwidth /\ TUD_AUDIO_DESC_STD_AS_INT(/_itfnum/ (uint8_t)((_itfnum) + 1), /_altset/ 0x00, /_nEPs/ 0x00, /_stridx/ 0x00),\ / Interface 1, Alternate 1 - alternate interface for data streaming /\ TUD_AUDIO_DESC_STD_AS_INT(/_itfnum/ (uint8_t)((_itfnum) + 1), /_altset/ 0x01, /_nEPs/ 0x02, /_stridx/ 0x00),\ / Class-Specific AS Interface Descriptor(4.9.2) /\ TUD_AUDIO_DESC_CS_AS_INT(AUDIO_CTRL_ID_SPK_INPUT_STREAM, AUDIO_CTRL_NONE, AUDIO_FORMAT_TYPE_I, AUDIO_DATA_FORMAT_TYPE_I_PCM, AUDIO_NUM_OUTCHANNELS, AUDIO_CHANNEL_CONFIG_FRONT_LEFT | AUDIO_CHANNEL_CONFIG_FRONT_RIGHT, /_stridx/ 0x00),\ / Type I Format Type Descriptor(2.3.1.6 - Audio Formats) /\ TUD_AUDIO_DESC_TYPE_I_FORMAT(_nBytesPerSample, _nBitsUsedPerSample),\ / Standard AS Isochronous Audio Data Endpoint Descriptor(4.10.1.1) /\ TUD_AUDIO_DESC_STD_AS_ISO_EP(_epout,(TUSB_XFER_ISOCHRONOUS | TUSB_ISO_EP_ATT_ASYNCHRONOUS | TUSB_ISO_EP_ATT_DATA), _epoutsize, TUD_OPT_HIGH_SPEED ? 0x04 : 0x01),\ / Class-Specific AS Isochronous Audio Data Endpoint Descriptor(4.10.1.2) /\ TUD_AUDIO_DESC_CS_AS_ISO_EP(AUDIO_CS_AS_ISO_DATA_EP_ATT_NON_MAX_PACKETS_OK, AUDIO_CTRL_NONE, AUDIO_CS_AS_ISO_DATA_EP_LOCK_DELAY_UNIT_UNDEFINED, 0x0000),\ / Standard AS Isochronous Feedback Endpoint Descriptor(4.10.2.1) /\ TUD_AUDIO_DESC_STD_AS_ISO_FB_EP(_epfb, 1),\ \ / Microphone Interface / \ / Standard AS Interface Descriptor(4.9.1) /\ / Interface 1, Alternate 0 - default alternate setting with 0 bandwidth /\ TUD_AUDIO_DESC_STD_AS_INT((uint8_t)((_itfnum)+2), /_altset/ 0x00, /_nEPs/ 0x00, /_stridx/ 0x00),\ / Standard AS Interface Descriptor(4.9.1) /\ / Interface 1, Alternate 1 - alternate interface for data streaming /\ TUD_AUDIO_DESC_STD_AS_INT(/_itfnum/ (uint8_t)((_itfnum)+2), /_altset/ 0x01, /_nEPs/ 0x01, /_stridx/ 0x00),\ / Class-Specific AS Interface Descriptor(4.9.2) /\ TUD_AUDIO_DESC_CS_AS_INT(AUDIO_CTRL_ID_MIC_OUTPUT_STREAM, AUDIO_CTRL_NONE, AUDIO_FORMAT_TYPE_I, AUDIO_DATA_FORMAT_TYPE_I_PCM, AUDIO_NUM_INCHANNELS, AUDIO_CHANNEL_CONFIG_FRONT_LEFT | AUDIO_CHANNEL_CONFIG_FRONT_RIGHT, /_stridx/ 0x00),\ / Type I Format Type Descriptor(2.3.1.6 - Audio Formats) /\ TUD_AUDIO_DESC_TYPE_I_FORMAT(_nBytesPerSample, _nBitsUsedPerSample),\ / Standard AS Isochronous Audio Data Endpoint Descriptor(4.10.1.1) /\ TUD_AUDIO_DESC_STD_AS_ISO_EP(_epin, (TUSB_XFER_ISOCHRONOUS | TUSB_ISO_EP_ATT_ASYNCHRONOUS | TUSB_ISO_EP_ATT_DATA), _epinsize, TUD_OPT_HIGH_SPEED ? 0x04 : 0x01),\ / Class-Specific AS Isochronous Audio Data Endpoint Descriptor(4.10.1.2) /\ TUD_AUDIO_DESC_CS_AS_ISO_EP(AUDIO_CS_AS_ISO_DATA_EP_ATT_NON_MAX_PACKETS_OK, AUDIO_CTRL_NONE, AUDIO_CS_AS_ISO_DATA_EP_LOCK_DELAY_UNIT_UNDEFINED, 0x0000)

Iktek commented 2 years ago

Same descriptor is working properly with stm32l4 + dwc2 I'm wondering if the "doublebuffering" is needed in order to support isoc-audio as it seems that maybe every 2nd pice of data is not handled. I'm also wondering if it's supported by the fsdev that there are 2 isoc-descriptors with 64 byte in each urb. Maybe one of them is not handled properly. As already stated you'll find my actual state in adopting the fsdev driver in: https://github.com/Iktek/tinyusb/pull/new/1249-feature-stm32_fsdev_isocronous_support I just pushed the current state.

HiFiPhile commented 2 years ago

I'm wondering if the "doublebuffering" is needed in order to support isoc-audio as it seems that maybe every 2nd pice of data is not handled.

I don't think it's the cause. As there is no ack or retry for ISO transfer, host will continue to send even device can't handle it properly. I remember Windows continued to send even if I paused the execution of LPC4337. In your case host is sending only half of the data.

Seems like feature unit is missing in descriptors.

Iktek commented 2 years ago

Mhh.. where do you see that the host is missing to send the data? In the dissection I see that the host is sending about 128bytes every 2 ms which is the correct data amount. I would suggest that the device can't handle every 2nd packet (or e.g the 2nd descriptor in the iso-out-urb) As the same descriptors are working properly with dwc2 I would not search the bug in the descriptor or feature-unit.

HiFiPhile commented 2 years ago

Mhh.. where do you see that the host is missing to send the data?

I'm sorry I misread your Wireshark log 😅 I'll take a deeper look after vacation.

pigrew commented 2 years ago

Hi @Iktek , I believe DMA is not supported on fsdev after reading reference manual and ST driver, I could be wrong.

The peripheral cannot make DMA requests, but the operating system/tinyusb can periodically initiate "MEM2MEM" transfers as needed (preferably using the double-buffer mode of the USB peripheral). Handling DMA complete interrupts manually shouldn't be so hard to do. Error handling (i.e., when USB disconnects happen) could be complicated if a DMA transfer is in progress in the background.

Currently, the stm32_fsdev driver copies data into the peripheral buffer during interrupts. Perhaps a simple optimization to speed things up a bit would be to synchronously perform a DMA MEM2MEM transfer (i.e., start the transfer and busy-loop until the transfer finishes) instead of the memcpy. We maybe would want to define an OS API to request/reserve a DMA channel for the exclusive use of tinyusb and to enable/disable the DMA controller. Do we have any sort of OS DMA API that's used for other architectures?

Having the operating system manage a queue of DMA requests in the background seems doable and probably would give the best speed, but I'm not sure if that's needed? Is the current (slow) implementation limiting performance at the moment?

pigrew commented 2 years ago

Same descriptor is working properly with stm32l4 + dwc2 I'm wondering if the "doublebuffering" is needed in order to support isoc-audio as it seems that maybe every 2nd pice of data is not handled.

I think stm32_fsdev must use double-buffering, based on the reference manual section 23.4.4 isochronous transfers:

Isochronous endpoints implement double-buffering to ease application software development, using both ‘transmission’ and ‘reception’ packet memory areas to manage buffer swapping on each successful transaction in order to have always a complete buffer to be used by the application, while the USB peripheral fills the other.

Based on the EP kind register table (table 175), EP_KIND is ignored for ISO endpoints... it's always in double-buffered mode.

Iktek commented 2 years ago

The peripheral cannot make DMA requests, but the operating system/tinyusb can periodically initiate "MEM2MEM" transfers as needed I agree.

I think a busy-loop will not be a solution which would speed up much as irq is still blocked during memcpy. I don't expect the dma-transfer being much faster then the manual memcpy. I'ts only worth being used if we could make it asyncronous.

I'll have a look these days if double-buffering is doable easy. I think dma is not mandatory for now but would maybe ease up things. If DMA would not help in Design, I would postpone it and try to implement double-buffering with copying in user-space first as this seems to be the crux.

As I think I have to dig in more I'll come up with a proposal here these days.

Iktek commented 2 years ago

Ok, i digged in a bit and things seem even more complicated as thought before: Double-buffered isoch-endpoints can't be bidirectional. So one would have to use 2 endpoint indices here. As the driver states endpoint id and ix are mixed up all-over-the-world, so it would be hard to get them divided again. Unfortunately UAC2 needs the sync endpoint being on the same address ( 0x01 <->0x81 ) as the corresponding data-endpoint.. so we have a no-go here.

In addition my customer phone me he would want to pause the current STM32G4 project (as the lead time for this controller is currently to high), which means that I currently can't spent hours any more on pushing this further :-(, I would have liked to but getting all that stuff in would be way to much for now. Hopefully I can resume that project on a given point in time and hopefully I can remember what needs to be done. If someone else would want to push this further here are some steps I think which need to be done to get isoch-transfers in:

separate endpoint-ix and endpoint-id in order to allow bidirectional, double-buffered endpoints
Introduce some API or automated way to allocate endpoint-ix without binding to endpoint-addr (e.g. use open_ep_count as ix on alloc ??)
allocate a twice-the-maxPktSize pma buffer for double-buffered endpoints and set the addresses accordingly ( e.G. TX, which corresponds to RX0 or TX0 to &buffer, RX, which corresponds to RX1 or TX1 to &buffer[maxPktSize] )
implement the dtog-bit-handling
move the data-copying and buffer-swapping to thread-context

So if nobody else has time to work on this I'll hopefully continue soon and now have a list to remember what to do.. I'm also happy about every input / ideas which would make things easier. Maybe I also got something wrong...

Regards Pascal

rolandvs commented 2 years ago

@Iktek Is there any change you or perhaps others @hathatch are/have been working on getting this to work or made any progress since the last post?

pigrew commented 2 years ago

@rolandvs, It's not really on my list of things to do, since I'm not that interested in media streaming. I don't think anyone else has worked on it.

I have, however, been thinking about the implementation of double-buffering, and using FIFO for data transfer, and ISO should be simple after double-buffering is implemented. I'd like @hathach's input on how endpoint configuration could be implemented: How should an application request that an endpoint be double-buffered or not, and also allow an application to specify a buffer's location in the PMA. Do we need to add a application callback to get parameters? This would vary across architectures, so could make the code difficult.... Should we default to single-buffered, except for ISO which would be double-buffered?

I'm imagining perhaps:

struct stm32_fsdev_ep_cfg {
bool double_buffered;
uint32_t pma_start; // -1 for automatic allocation. 
};

int tud_get_ep_cfg(int rhport, int config, int interface, int ep, struct stm32_fsdev_ep_cfg *cfg);

Other architectures would have different parameters in the configuration struct.

rolandvs commented 2 years ago

Thanks for your quick reply! Ok, we have a need to get this running on an G491. Is it OK to drop a few questions as you have pursued this subject before?

pigrew commented 2 years ago

Thanks for your quick reply! Ok, we have a need to get this running on an G491. Is it OK to drop a few questions as you have pursued this subject before?

Sure (I wrote the stm32fsdev driver here...), as long as they are related to ISO... otherwise, we could open a separate discussion.

Iktek commented 2 years ago

@Iktek Is there any change you or perhaps others @hathatch are/have been working on getting this to work or made any progress since the last post?

Hi @rolandvs , there have been some "experiments" in https://github.com/Iktek/tinyusb/tree/1249-feature-stm32_fsdev_isocronous_support where I enabled ISO without doublebuffering, which did not work out, so: nothing worthy I think. Unfortunately project priorities shifted and other things got more urgent for now (thanks to the chip-crisis).. So basically I'm also still interested getting things running (now escpecially on STM32L151), but unfortunately I have no chance to invest much time here now. Nevertheless I'm willing to do some testing on other chips / situations / descriptors or give some discussion-input / code-reviewing if this could be helpful.

@pigrew : As there seem to be (only) a few people which seem to need this feature I would vote that we do not invest tooo much time in API-questions for now, but first rework the driver-structure in a way to allow doublebuffering by design (e.g. separate endpoint-ix and endpoint-id a.s.o.) and get it working (maybe just for iso-xfers for now as they do generally not work without). We should continue discussing how to allow the user to enable/disable DB for non-iso-transfers but this is IMHO not the root problem for now.

On the other Hand having a PMA-location would be needed. In other drivers (like dcw2) the allocation of the fifos is done "automatically" on endpoint-creation/deletion with a few limitations (e.g. closed endpoint's fifos are not freed, but re-used on re-open). Maybe we could also follow this example to create PMA-allocations.

As you already said that streaming is not your on your list to do, maybe you could just give some hint's how you would get that running / do organization of Id's etc. Are you on the same page then I that the mentioned reworking is needed or do you have other ideas here? Hopefully s.o. will get time to do the implementation then ( If nobody is found, I'll hopefully get more time at the end of the year).

Expecting your inputs Best Regards Pascal

pigrew commented 2 years ago

The endpoint-ix vs endpoint-ix separation isn't necessarily needed if the application doesn't have a second endpoint with that particular ID (e.g., if the ISO EP is 3OUT, then there couldn't be a 3IN endpoint).

The driver already has an allocation method for the PMA. It's correct for most cases (where PMA is not also used for CAN).

The minimum changes need:

Modify allocator to support double-buffering.
~Support longer PMA buffers (for packet lengths up to 512 bytes?), including changes to allocator~
Update xfer callbacks/interrupt handler for double-buffering

-- Some other notes:

Data toggle should always be set to DATA0 when transmitting. (peripheral handles this internally)
Peripheral uses the data toggle bit in the control register to specify which buffer to use when operating double-buffered.
ISO endpoint never send a handshake response (no NAK or ACK response, ever).

There are a few things that I don't know how to handle:

Should lost packets be detected? What's the mechanism for error handling? The FSDEV peripheral has a USBFNR.FN frame number register which could be used to detect if a packet was skipped.
During double-buffering, should the xfer_complete callback be called when the data has been transmitted, or when there is at least one free buffer?
Does the FIFO interface need to be implemented? For the FIFO, how does one transmit a short or zero-length packet? (needed if FIFO were used for other protocols.). I'm more familiar with the TI USBD API, where an endpoint can be setup as a FIFO or using block buffers, but one cannot switch between them after the endpoint is opened.

rolandvs commented 2 years ago

Thanks for all your comments. I'll see if I can make an effort, but unfortunately it will have to wait as I'm currently working on some issues with higher priority.

pigrew commented 2 years ago

Thanks for all your comments. I'll see if I can make an effort, but unfortunately it will have to wait as I'm currently working on some issues with higher priority.

I started looking at the driver, too, and have found at least one bug (#1583), and got confused about error handling while trying to fix it (#1584 ). I'm reworking the code with a goal of handling double-buffering and searching for race conditions. I'll submit it as a PR once I have it working better than the current code. It's entirely possible that ISO transfers will work with the block-buffer interface once I make these changes.

Iktek commented 2 years ago

The endpoint-ix vs endpoint-ix separation isn't necessarily needed if the application doesn't have a second endpoint with that particular ID (e.g., if the ISO EP is 3OUT, then there couldn't be a 3IN endpoint).

Hi pigrew: Unfortunately the UAC2-Specification (Audio-Streaming) is expecting exactly that. An explicit sync endpoint needs necessarily have the same endpoint-ix as its corresponding stream endpoint. I wonder if there are also other classes which expect this to be possible. So for my opinion the discussed separation is mandatory. Best Regards Pascal

pigrew commented 2 years ago

I've started working on rewriting the fsdev driver, to add the layer of indirection between endpoint addresses and hardware endpoints, plus double-buffering... and finding tons of bugs.

Some bugs include:

Having different EP types with the same address was broken
Receiving odd numbers of bytes potentially causes memory overruns
Driver was broken if one attempts to open a control EP with address != 0 (changing to just error out if attempted)
Add check for CMSIS >=5.1
Broken with EP >=256 bytes

The changes are about 30% complete... lots are left to do. Hopefully the reworks won't bloat code size and memory usage too much.

My code is currently broken, but here it is if your curious:

https://github.com/pigrew/tinyusb/tree/stm32_doublebuf

Iktek commented 2 years ago

Hi @pigrew , thank you for taking the job pushing this further. Is there anything I could do for you? (testing / take over detail implementations) Or is there anything where you document your "missing bits" so I could do some groundwork? Regards Pascal

skuep commented 2 years ago

Yeah, thanks from me as well! I am currently trying to put the UAC2 audio class into a stm32. I have built a lot of systems with USB interfaces and getting the CDC class to work with the internal UART was exceptionally easy which is what left me really impressed by tinyusb. This evening I tried the UAC2 stuff and I was very sad, when I found out, that it does not work at all. I haven't done a lot with STM32's USB stuff, but if there is anything I could do to help, let me know.

skuep commented 2 years ago

Hi guys, I had a look at the work from @Iktek and I could solve the issue with every odd buffer data being 0. The reason for this was as already suspected, that the second buffer in the double-buffer system was never written to.

With these changes I got the audio_text example working without buffer dropouts: https://github.com/hathach/tinyusb/compare/master...skuep:tinyusb:0.14.0-isofix

Note that I currently implemented this for some STM32F3xx series only. If you have another compatible series, we should add the ISOCHRONOUS_DOUBLEBUFFER definition for it in dcd_stm32_fsdev_pvt_st.h.

PanRe commented 2 years ago

Very nice! Thanks for pushing this forward! @skuep I saw that there is a small piece of untested code in your fork. Any chance you can test and finalize it?

skuep commented 2 years ago

Hi PanRe, The untested portion is from @Iktek. I have not yet concerned myself with it, sorry. I am still working on verifying whether the opposite direction (SPEAKER instead of MICROPHONE) works with my code.

Iktek commented 2 years ago

Hi guys, I did not continue with my approach as I was waiting for a good state of @pigrew 's work. Because: my first approach still has the problem with different endpoint directions having the same address and this had to be tidied up first. This is mandatory because of UAC2 sync endpoint needing to have the same address as the corresponding data endpoint. Unfortunately I don't have much time to dig in the current changes of @pigrew, so I need to wait for some "stable" state of the code with commented missing bits to continue work here.. or maybe some of you will bring the bits togehter :-).. Regards Pascal

skuep commented 2 years ago

Hi Iktek, Are you sure that the endpoint address requirement is mandatory? I have had a quick look in Release 1.0 of the UAC specification (here https://www.usb.org/sites/default/files/audio10.pdf) and on page 61 in the "Standard AS Isochronous Audio Data Endpoint Descriptor" Table, there is at Offset 8 a field for bSynchAddressto define the synchronization endpoint address.

Can't we use that? Although that is UAC1 I think, and not UAC2... It seems this was removed in UAC2

Iktek commented 2 years ago

Hi @skuep, yes you're mixing up UAC and UAC2. I also hat tests in implementing UAC class but the UAC-driver on windows does not really allow syncing. So in order to get a valid synced audio you would need to implement UAC2. And UAC2 has removed the bSyncAddress field. From UAC2 on the sync-address needs to sit on the same index as the corresponding data endpoint. So if you don't want to implement an ASRC on your device, you'd either need to have a sync-endpoint or you need to live with frame-drops. (One could also implement an tricky jitterbuffer which adds or removes just one sample) but with a sine wave this would be hearable. Regards Pascal

skuep commented 2 years ago

Hm, very dissatisfying, but thanks for the information. I would have gone on to work on using UAC1, since that is actually enough in my case. No way to get asynchronous clocks to work on a Windows machine? EDIT: Yeah, sadly this pretty much confirms your statement: https://www.mikrocontroller.net/topic/470707#5772972 Simon

skuep commented 2 years ago

Hi again, @Iktek, Can you give me a link to the @pigrew tinyusb fork? Is he currently working on this feature? I just had some spare time and I have implemented a rather crude way of handling this issue. See this commit https://github.com/hathach/tinyusb/commit/af46a53bdfaf925332bbff4446bc2196972b2c69 By default, the tinyusb stack works just as normal, but I added an application callback, that you can use to alter the "mapping" of the endpoint address to the internal endpoint registers. From my application:

/* Endpoints */
#define EPNUM_AUDIO_IN      0x81
#define EPNUM_AUDIO_OUT     0x02
#define EPNUM_AUDIO_FB      0x82
    ....

// We have ISOCHRONOUS endpoints defined that share the same endpoint number, but have opposing directions.
// However with STM32 hardware, ISOCHRONOUS endpoints use both RX and TX structures of the same endpoint register in hardware
// We circumvent a clash by defining our own custom endpoint map for the tiny usb stack
uint8_t tu_stm32_edpt_number_cb(uint8_t addr)
{
    switch (addr) {
    case 0x00:
    case 0x80:
        /* Endpoint Zero */
        return 0x00;

    case EPNUM_AUDIO_IN:
        return 0x01;

    case EPNUM_AUDIO_OUT:
        return 0x02;

    case EPNUM_AUDIO_FB:
        return 0x03;

    default:
        TU_BREAKPOINT();
        return 0x00;
    }
}

From my limited testing of this functionality, this seems to work quite well, I can just shift around the endpoint registers willy-nilly by changing the return values in the tu_stm32_edpt_number_cbcallback. However I have not yet tested the actual speaker feature.

Note: The commit is on a new branch. I made changes to the one given above, regarding the support of ISOCHRONOUS endpoints and the double-buffer feature. I quickly ran out of the 512 byte PMA ram on my STM32F302 and since the double buffers are not actually used in a double-buffer fashion anyway by the tinyusb stack, I just set allocated a single buffer (just as before all these changes) and used the same memory for both double-buffer locations.

pigrew commented 2 years ago

Hi again, @Iktek, Can you give me a link to the @pigrew tinyusb fork? Is he currently working on this feature?

I got sidetracked, and didn't get nearly as far as I had hoped.

The tricky part is handling the allocations of the PMA for double-buffered endpoints (and I found that the existing code had some rather substantial bugs). I was rewriting the PMA allocation code and endpoint housekeeping to handle double-buffering.

The link is https://github.com/pigrew/tinyusb/tree/stm32_doublebuf (and you can see that I've not worked on it in a few months....

skuep commented 2 years ago

@pigrew: Thanks for the heads-up. I will dig through your changes and see what's what :-) What issues with the PMA for double buffered endpoints did you have? Since tinyusb does not seem to implement any double-buffering features (for now?) I basically now just left everything as-is with PMA allocation and just set both (TX and RX) (double-) buffer locations to the same buffer.

EDIT: Had a look at your code. Yeah, you are totally right with the PMA allocation bug. I am going to steal that if you don't mind :-)

pigrew commented 2 years ago

@pigrew: Thanks for the heads-up. I will dig through your changes and see what's what :-) What issues with the PMA for double buffered endpoints did you have? Since tinyusb does not seem to implement any double-buffering features (for now?) I basically now just left everything as-is with PMA allocation and just set both (TX and RX) (double-) buffer locations to the same buffer.

EDIT: Had a look at your code. Yeah, you are totally right with the PMA allocation bug. I am going to steal that if you don't mind :-)

You are correct, no double-buffered in the existing code, but other bugs did exist (I think it was that the TX and RX on a given endpoint were assumed to be of the same type...).

There was a use case where there must be a TX and RX endpoint for a given USB logical endpoint number (required by audio, IIRC? I think it was mentioned above). We need to allocate two double-buffered physical endpoint addresses (one for TX and one for RX), to properly support this, creating an extra layer of indirection.

During my implementation, I kept getting confused about concurrency issues: mostly if we had to set various things as volatile since they were used both inside and outside interrupts, and also if I should be using C atomic types. (or maybe the RTOS abstraction layer).

skuep commented 2 years ago

Yeah, this is exactly the problem we are running into here. I am currently still using the method of custom mapping the endpoint address to the hardware endpoint using a callback function. However I have made some progress in my fork:

I added your fix regarding the buffer size during allocation to be aligned to the allowable hardware sizes
I added what you call "indirection" into the code where it was necessary. I re-used the already present xfer_ctl_tstructure

https://github.com/hathach/tinyusb/compare/master...skuep:0.14.0-sk

So now we could theoretically go ahead and implement a dcd_edpt_alloc function that is to be called instead of tu_edpt_number in dcd_edpt_open for finding the next free endpoint to use. For now I will work on different parts of my project, since I am currently satisfied using the custom mapping using the callback function (as explained above). EDIT: I noticed that there is still a problem in the isochronous receive and transmit functions which results in every second buffer being zero length. I will look into it this evening.

skuep commented 1 year ago

Hi guys, so I got UAC2 speaker and microphone working on an STM32F302 (thanks to @Iktek for the descriptor example above). See my tinyusb branch here: https://github.com/hathach/tinyusb/compare/master...skuep:0.14.0-sk

Here is the device I have implemented. Note that currently, although feedback generation is working, it is lacking the long-term buffer drift prevention Feature required by the USB spec (it is not implemented in tinyusb). So when playing an audio file for half an hour, buffer level creeps dangerously low. https://github.com/skuep/AIOC

HiFiPhile commented 1 year ago

it is lacking the long-term buffer drift prevention Feature required by the USB spec (it is not implemented in tinyusb).

I've implemented a feedback method by keeping fifo at a stable level. https://github.com/hathach/tinyusb/pull/1463

Maybe it can help you.

skuep commented 1 year ago

Cool thanks! I will have a look at it.

I have been reading and thinking about how to do this and (as other people do) I think that a feedback purely based on FIFO count is not according to specs because you basically implement a P controller on your device (i.e. you use some kind of feedback to keep the FIFO count constant). However, only the host is supposed to have a controller, the device only states the number of samples played. The spec then goes on and says that over the long term, the buffer fill level may drift and only in case this is detected, are you allowed to modify the feedback value.

So I am not sure if the dual controller scenario can result in instability issues in case the parameters are not chosen wisely. However that is just my understanding at the moment. If any of you are the wiser, please let me know

HiFiPhile commented 1 year ago

So I am not sure if the dual controller scenario can result in instability issues in case the parameters are not chosen wisely.

In theory it's possible, so I've limited feedback min/max to a small interval.

I did all tests on HS, on which fifo level is easier to be regulated than FS, result is satisfying amount 44.1KHz to 384kHz.

However, only the host is supposed to have a controller, the device only states the number of samples played.

IMHO is the UAC specification who should be blamed. What it want to achieve is a regulated loop, but in real it's not a closed-loop system as the feedback is only an estimation.

skuep commented 1 year ago

IMHO is the UAC specification who should be blamed. What it want to achieve is a regulated loop, but in real it's not a closed-loop system as the feedback is only an estimation.

I agree. The UAC spec is not really the USB consortium finest work if you ask me. Why didn't they just specify that the device regularly send a monotonically increasing counter (integral) of all samples played up to now (and let it wrap around at 2**32-1). There would be no issue whatsoever with buffer level drift, because the host could directly control the amount of samples send using the amount of samples played. 🤔 Instead they settled in sending the derivative of this counter resulting in integration inaccuracies that then need to be compensated with a "hack"... Anyway

vorosj commented 1 year ago

"Here is the device I have implemented." Hi, It means you succesfully implemented the TinyUSB audio class to the STM32F30x family? It's really good news, because the CubeMX generated audio code is not working on the F103 and F303 controllers, but works of the F4 family. I tried the TinyUSB audio test example a few months ago, same, it worked with the STM32F411, but not with F103 or F303. Maybe your solution is working on the SRTM32F10x family as well, Bluepill etc.? A simple example like an USB audio speaker sample code could help a lot for users (like me) not so familiar with the inner structure of TinyUSB. regards, Jozsef

skuep commented 1 year ago

Jep, It is working so far with a STM32F302 which is, as far as I can tell, largely compatible with the STM32F103, therefore it should work. However it needed a lot of tweaking of the STM32 driver in TinyUSB and also still has some crude solutions, like the static mapping of endpoint addresses to hardware endpoints using a callback function. Thus I use my own TinyUSB fork here: https://github.com/skuep/tinyusb/tree/0.14.0-sk

While the project I have linked above is not a USB audio speaker sample code, it is in fact basically only a soundcard with 1 mono microphone channel and 1 mono speaker channel. Btw, I have since implemented the buffer-creep prevention as dictated by the USB spec.

hathach / tinyusb

stm32_fsdev isochronous support #1249

Operating System

Board

Firmware

What happened ?

How to reproduce ?

Debug Log

Screenshots

define AUDIO_CTRL_ID_SPK_INPUT_STREAM 0x01

define AUDIO_CTRL_ID_SPK_FUNIT 0x02

define AUDIO_CTRL_ID_SPK_OUTPUT 0x03

define AUDIO_CTRL_ID_MIC_INPUT 0x04

define AUDIO_CTRL_ID_MIC_FUNIT 0x05

define AUDIO_CTRL_ID_MIC_OUTPUT_STREAM 0x06

define AUDIO_CTRL_ID_CLOCK 0x41

define AUDIO_CTRL_ID_CLOCKSEL 0x40

define AUDIO_NUM_INTERFACES 0x03

define AUDIO_NUM_INCHANNELS 0x02

define AUDIO_NUM_OUTCHANNELS 0x02

define TUD_AUDIO_CTRL_TOTAL_LEN (\

define TUD_AUDIO_IO_DESC_LEN (\

define TUD_AUDIO_MIC_ONE_CH_DESC_N_AS_INT 1 // Number of AS interfaces

define TUD_AUDIO_IO_DESCRIPTOR(_itfnum, _stridx, _nBytesPerSample, _nBitsUsedPerSample, _epin, _epinsize, _epout, _epoutsize, _epfb) \