xmos / lib_i2s

I2S/TDM digital audio interface library

frame based I2S master bug #26

Closed QuinnWang closed 2 years ago

QuinnWang commented 7 years ago

Problem: sometimes there is noise, sometimes the volume goes up, sometimes it goes down, as if some bits were shifted.

How to reproduce:

The issue was originally found by a customer:

ed-xmos commented 7 years ago

Hi Quinn, I helped develop the frame-based I2S. It has very low overhead, so it can deal with quite long back-pressure on the callbacks (the total time of all callbacks must be less than the frame period), and it has been working at 768 kHz, so 48 kHz should be trivial. However, like all of the I2S implementations, it will only hold sync if the callbacks do not delay the I2S loop beyond that limit. If this happens, then LR clock and/or data will be shifted (BCLK is free running). So to progress this bug, we need to be sure that the application does not assert a significant delay that causes I2S to break. Can you insert a timing assertion to test this?

Something like this in the restart case:

    time_old = time_now;
    t :> time_now;
    // 2084 ticks is about one 48 kHz frame period, assuming the 100 MHz reference timer
    if (time_now - time_old > 2084) {
        debug_printf("timing assertion fail in i2s handler: %d ticks\n", time_now - time_old);
        __builtin_trap();
    }
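
The threshold in that check can be derived rather than hard-coded. The following is a minimal sketch in plain C, not code from lib_i2s: it assumes the xCORE reference timer runs at 100 MHz, and the names `TIMER_HZ`, `frame_period_ticks`, `elapsed_ticks` and `frame_deadline_missed` are hypothetical helpers introduced only for illustration. It also shows why the unsigned subtraction in the check stays valid when the 32-bit timer wraps.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the reference timer ticks at 100 MHz. */
#define TIMER_HZ 100000000u

/* Ticks in one sample (frame) period, rounded up. */
static uint32_t frame_period_ticks(uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0u);
    return (TIMER_HZ + sample_rate_hz - 1u) / sample_rate_hz;
}

/* Elapsed ticks between two readings of a free-running 32-bit timer.
   Unsigned subtraction is modulo 2^32, so a single wrap is handled. */
static uint32_t elapsed_ticks(uint32_t time_old, uint32_t time_now)
{
    return (uint32_t)(time_now - time_old);
}

/* Returns 1 if the gap between two handler calls exceeds one frame period. */
static int frame_deadline_missed(uint32_t time_old, uint32_t time_now,
                                 uint32_t sample_rate_hz)
{
    return elapsed_ticks(time_old, time_now) > frame_period_ticks(sample_rate_hz);
}
```

With these assumptions, `frame_period_ticks(48000)` evaluates to 2084, which matches the constant used in the snippet above.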

QuinnWang commented 7 years ago

Hi Ed

It is not a callback delay issue. I can understand your concern, because when I first met the problem I also suspected the delay (or the efficiency of the interface call), so at that time I tried to:

It is worth noting:

ed-xmos commented 7 years ago

Hi Quinn, Thanks for the great level of detail. That does seem to point to some other error than callback back pressure. Can I ask?

Obviously the first thing we have to do is make it fail locally so we can then find the root cause of the issue.

The main difference between i2s master and frame i2s master (apart from the frequency and timing of callbacks) is that one uses the BCLK divider in the clock block and the other uses a software divider, outputting patterns to p_bclk. In the software divider case, we can sometimes see signal integrity problems on the BCLK pin: reflections can cause false clocks. However, this would be worse in the software divider case than in the clock block divider case.

Anyhow, please confirm the above 2 points.

Regards, Ed

ed-xmos commented 7 years ago

Hi Quinn, Regarding the "xrun --dump-state" case: do you mean you can cause a running system to fail by typing that command from the host? If so, this is exactly what we would expect. Just like print over JTAG, the JTAG operation interrupts the entire tile so that it can interrogate all of the state and report back to the host. So I would absolutely expect real-time to be broken if you try to dump the state of a running device. This would almost certainly cause alignment issues on I2S.

Regards, Ed

QuinnWang commented 7 years ago

Hi Ed

  1. xrun xxx.xe, you can hear the normal loopback

  2. then xrun --dump-state xxx.xe, you can hear the noise with AN00162_i2s_loopback_demo_frame_i2s_master.xe; but it is normal with AN00162_i2s_loopback_demo_normal_i2s_master.xe

BTW, this is no longer a blocking issue on the customer side, but it is worth finding the root cause when you have time, since it is indeed an issue for anyone using this lib. Thanks for your attention.

Cheers, Quinn

QuinnWang commented 7 years ago

Just for testing

ed-xmos commented 7 years ago

Hi Quinn,

Please see comments below +++

"Though I am not sure if the dump-state is related to the real issue, but if the dump-state can cause the issue, then can see the issue on the real project; and if using the normal i2s, the dump-state can't cause the issue, then on the real project, can't see the issue anymore. So from the simple logic, I suppose they are related."

+++The reason normal I2S is not broken by dumpstate (interrupting the tile) is that it uses a different clocking scheme. BCLK is generated via a pattern on a buffered port, so if the processor stops, BCLK stops. This also means that LRCLK and data stop. The master may be able to stop and restart without losing sync between data, LRCLK and BCLK. Frame based has a free-running BCLK generated by the clock block (much more efficient). However, if the processor is interrupted, then the I2S slave will continue to receive BCLK from the clock block. When the processor is resumed, there is only a 1/64 chance of alignment being correct. Neither component was designed to be interrupted, so the fact that one copes with it better than the other is interesting, but not useful. So whilst this causes an issue that looks like the reported problem, it is not re-creating the problem locally: the root cause of the customer problem is different.
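
The 1/64 figure can be illustrated with a small model. This is a sketch in plain C under a stated assumption, namely that a frame is 64 BCLK cycles (2 channels of 32 bit clocks each); the names `BCLKS_PER_FRAME`, `slave_bit_position`, `still_aligned` and `count_aligned_stalls` are hypothetical and are not part of lib_i2s. The master resumes where it stopped, while the slave has kept counting the free-running BCLK, so the two agree again only if the stall lasted a whole number of frames.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 2 channels x 32 bit clocks = 64 BCLK cycles per frame. */
#define BCLKS_PER_FRAME 64u

/* Bit position within the frame that the slave has reached after
   bclk_cycles clocks, counting from a frame boundary. */
static uint32_t slave_bit_position(uint64_t bclk_cycles)
{
    return (uint32_t)(bclk_cycles % BCLKS_PER_FRAME);
}

/* The stalled master has not moved; the slave advanced by stalled_bclks.
   They line up again only if the stall covered whole frames. */
static int still_aligned(uint64_t stalled_bclks)
{
    return slave_bit_position(stalled_bclks) == 0u;
}

/* Counts the stall lengths in [1, max_stall_bclks] that keep alignment. */
static uint32_t count_aligned_stalls(uint32_t max_stall_bclks)
{
    assert(max_stall_bclks > 0u);
    uint32_t aligned = 0u;
    for (uint32_t n = 1u; n <= max_stall_bclks; n++) {
        aligned += (uint32_t)still_aligned(n);
    }
    return aligned;
}
```

In this model, 100 of the 6400 stall lengths from 1 to 6400 BCLK cycles keep alignment, which is the 1 in 64 ratio.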

+++We need to look elsewhere to get to the real root cause. This is interesting:

"BTW, seems it is the issue of the master in, master out seems work normally. And in fact, in the real project, the issue will happen with I2S frame master anyway at a low freq without dump-state and JTAG.”

+++To me, this suggests that the slave (CODEC I assume) is receiving an extra BCLK occasionally which will cause it to mis-align. The master/data out does not misalign because those signals are locked internally. My feeling is that this could be a signal integrity problem on the BCLK line - perhaps another signal interfering with BCLK or perhaps reflections on the trace? I think we should focus on this to solve the issue. Have you reviewed the schematic/layout?

Regards, Ed

QuinnWang commented 7 years ago

Hi Ed

Setting dumpstate aside, the fact is that on the real customer project, with the normal I2S the customer no longer saw the random startup noise. That shows:

So the simple logic above should be able to show:

Best Regards Quinn

ed-xmos commented 7 years ago

Hi Quinn, yes, we should completely forget dumpstate - this is misleading and will not bring us to the root cause.

As I explained before, there is a difference in how the port works between the two implementations:

I was wondering if a poor BCLK signal may cause both XMOS and DAC to get a false edge, but actually stay in sync. This is just an idea at the moment because I cannot think of any other reason for this behaviour.

I am not yet sure of the root cause. To systematically find root cause we have to:

1) Recreate the problem locally
2) Debug the observed issue

Because you cannot reproduce this on an XMOS board, my feeling is that this is hardware related. I could be wrong, because I do not have a system I can interrogate to verify this. This is just my feeling from many years of writing and debugging I2S systems. From a code perspective, I cannot see why the I2S frame master would suddenly fail after millions of loops (especially as you say there is no application timing delay). It has also been used in a few other places and we have not had reports that it is broken. There could be a bug, but the chance is probably small.

Could you get a scope capture from the BCLK pin near the DAC? It would be interesting to see what the edges look like, and perhaps the probe loading may make the problem disappear?

Do you see how, if it is only the input that misaligns occasionally, it is most likely an issue with the BCLK reaching the DAC?

Thanks, Ed

QuinnWang commented 7 years ago

Hi Ed

As I explained before, there is a difference in how the port works between the two implementations:

Yes, the original issue has not been reproduced on the XMOS demo board yet. So let me try to recreate the problem on the demo board in order to solve 2 strange items:

BTW, here is the original issue info, just FYI:

[How to reproduce the issue]

[The firmware audio flow]
I2S master in -> DSP -> I2S master out

[The firmware loading flow in the project]
1. Choose Analog in as the audio input mode, power off
2. Power on
3. Loader runs the default firmware “UAC2 in -> DSP -> I2S master out”; it then detects that the selected mode is Analog in, not UAC2 in, and does a software reboot
4. Loader runs the Analog in firmware “I2S master in -> DSP -> I2S master out”

Best Regards Quinn

ed-xmos commented 7 years ago

Hi Quinn,

“Sorry, here I can’t understand that, why in i2s normal, BCLK is an output AND an input? Because the hardware is the same, then the roles of the I2S ends are fixed, no matter using what kind of I2S master.”

Yes it does seem odd, but it really does work like that for i2s normal. The clocking scheme looks like this: [Figure: i2s normal clocking scheme diagram]

The key here is that the bclk clock block is clocked from the output of the BCLK port (bclk is generated by sending 0xaaaa etc. to the BCLK pin inside i2s). HOWEVER, the design of the chip is such that the input to the bclk clock block comes from the physical bclk pin, NOT an internal signal, i.e. it goes through the buffer to the physical pin and then back inside again. This means that, if you short p_bclk, the bclk clock block will not get a clock. Or, if the BCLK port has bad reflections on it due to PCB layout issues like stubs, the bclk clock block may also see extra clocks. I have seen this on prototype hardware where BCLK is badly routed.
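
To make the "sending 0xaaaa" remark concrete, here is a minimal sketch in plain C of how a bit pattern written to a buffered port turns into clock edges on the pin. It assumes the port shifts the word out least significant bit first; `rising_edges` is a hypothetical helper for illustration, not a lib_i2s or xCORE API. It counts the rising edges a pattern produces, given the pin level left by the previous word.

```c
#include <assert.h>
#include <stdint.h>

/* Counts 0 -> 1 transitions seen on the pin while `bits` bits of
   `pattern` are shifted out, least significant bit first (assumed).
   prev_level is the pin level before the first bit is driven. */
static unsigned rising_edges(uint32_t pattern, unsigned bits, unsigned prev_level)
{
    assert(bits <= 32u);
    unsigned edges = 0u;
    for (unsigned i = 0u; i < bits; i++) {
        unsigned level = (pattern >> i) & 1u;
        if (level == 1u && prev_level == 0u) {
            edges++;
        }
        prev_level = level;
    }
    return edges;
}
```

In this model a 32-bit word of 0xAAAAAAAA gives 16 BCLK cycles, and 0xCCCCCCCC gives 8, which is how a software divider can halve the bit clock by changing the pattern. Any extra edge that reflections add on the physical pin is not represented in the pattern, which is why the pin has to be checked with a scope.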

I am not 100% sure that this is the reason for the difference between the i2s normal and frame IP, but it is an area worth investigating. (I am assuming that you have ruled out timing issues from back-pressure on interfaces etc., as per previous comments.) This is why I strongly suggest checking the signal integrity of BCLK using a high speed scope on the customer hardware, to see if there are reflections, or if the capacitive loading of the scope probe improves the situation by damping the signal.

Regards, Ed

QuinnWang commented 7 years ago

Hi Ed

“The key here is that the bclk clock block is clocked from the output of the BCLK port (bclk is generated by sending 0xaaaa etc. to the BCLK pin inside i2s). HOWEVER, the design of the chip is such that the input to the bclk clock block comes from the physical bclk pin, NOT an internal signal, i.e. it goes through the buffer to the physical pin and then back inside again. This means that, if you short p_bclk, the bclk clock block will not get a clock. Or, if the BCLK port has bad reflections on it due to PCB layout issues like stubs, the bclk clock block may also see extra clocks. I have seen this on prototype hardware where BCLK is badly routed.”

This is interesting, because from the above it looks like the I2S normal is susceptible to bad BCLK signal integrity; however, the real case shows that only the I2S frame has the issue.

Best Regards Quinn

ed-xmos commented 7 years ago

Hi Quinn, If there is a glitch on BCLK then perhaps, in the normal I2S case, both master (xmos) and slave (DAC/ADC) receive it, so they are still in sync. But in the I2S frame case, a glitch on the BCLK line may cause the I2S slaves (DAC/ADC) to receive an extra clock, but not the master (xmos). So in the frame case, they could be out of sync. This is just a theory but, like I say, worth checking in the absence of any better ideas!

Regards, Ed
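
The theory above can be written down as a toy model. The sketch below is plain C and purely illustrative: it assumes 64 BCLK cycles per frame, and the `i2s_link` type and the functions `clean_edge`, `glitch_seen_by_both`, `glitch_seen_by_slave_only` and `misalignment` are hypothetical names, not lib_i2s code. Each end keeps its own bit counter; a glitch that both ends count leaves them in step, while a glitch that only the slave counts shifts the slave by one bit.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64 BCLK cycles per frame. */
#define BCLKS_PER_FRAME 64u

typedef struct {
    uint32_t master_bit; /* bit position the master (xmos) believes it is at */
    uint32_t slave_bit;  /* bit position the slave (DAC/ADC) believes it is at */
} i2s_link;

static uint32_t next_bit(uint32_t bit)
{
    return (bit + 1u) % BCLKS_PER_FRAME;
}

/* A genuine BCLK edge: both ends advance. */
static void clean_edge(i2s_link *link)
{
    assert(link != 0);
    link->master_bit = next_bit(link->master_bit);
    link->slave_bit = next_bit(link->slave_bit);
}

/* Normal I2S in this theory: the master's clock block is fed from the
   physical pin, so a false edge is counted by both ends. */
static void glitch_seen_by_both(i2s_link *link)
{
    clean_edge(link);
}

/* Frame I2S in this theory: only the slave counts the false edge. */
static void glitch_seen_by_slave_only(i2s_link *link)
{
    assert(link != 0);
    link->slave_bit = next_bit(link->slave_bit);
}

/* How many bits the slave is ahead of the master, modulo one frame. */
static uint32_t misalignment(const i2s_link *link)
{
    assert(link != 0);
    return (link->slave_bit + BCLKS_PER_FRAME - link->master_bit) % BCLKS_PER_FRAME;
}
```

In this model a single slave-only glitch leaves a permanent one-bit offset, which would match the "bits are shifted" symptom, whereas a glitch seen by both ends leaves the offset at zero.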

ed-xmos commented 6 years ago

Hi Quinn, is this still an issue (if so we need a repeatable failure case) or can we close it?

QuinnWang commented 6 years ago

This topic can be closed now. Thanks very much for your input on how to approach this problem.

Best Regards Quinn

larry-xmos commented 4 years ago

@ed-xmos, can this be closed now?

ACascarino commented 2 years ago

@QuinnWang @ed-xmos looking through all of the I2S issues currently - going to close this issue as it appears to have been resolved; if it does need reopening due to further as-yet-unrecorded-here observations let me know.