Closed weicaiyang closed 9 months ago
To produce this
change this line to sck_polarity_phase inside {'b11};)
run the spi_device_flash_all
with seed = 4109906088. ~60% seeds fail.
Thanks. This is a bug. And I think we cannot fix this easily without sacrificing the timing.
The clock gating cell blocks the clock when the clock is 0 to not create a short pulse. So the logic drops the clock (if filtered) when 8th low of SCK comes. then it remains as it is. The filter information is reset by CSb de-assertion.
However, as CPHA, CPOL are 1, the off SCK is 1, not 0. So, clock enable signal affects a the next SCK low period. That is the waveform shown above.
In order to solve this, we need to invert the SCK then apply filter (gating), then re-invert. I honestly am not sure how much this will affect to the timing.
Thanks. This is a bug. And I think we cannot fix this easily without sacrificing the timing.
The clock gating cell blocks the clock when the clock is 0 to not create a short pulse. So the logic drops the clock (if filtered) when 8th low of SCK comes. then it remains as it is. The filter information is reset by CSb de-assertion.
However, as CPHA, CPOL are 1, the off SCK is 1, not 0. So, clock enable signal affects a the next SCK low period. That is the waveform shown above.
In order to solve this, we need to invert the SCK then apply filter (gating), then re-invert. I honestly am not sure how much this will affect to the timing.
perhaps other designers can also chime in. @jeoongp @tjaychen
I reviewed and prepared a fix for the issue. Commit is here.
However, I think still passthrough has more issues inside, how to handle the CPOL := 1 case. With the fix, still I see UVM_ERRORs. Am debugging now.
By the way, the fix introduces 2x inverter + 1 AND logic delay onto SCK path. If CPOL is 0, only AND gate delay is added.
I agree with @eunchan. For timing, I think 2x inverters + 1 AND logic delay might be OK. But I am not clear of errors under debugging. I hope it would not affect downstream inside.
synthesis / timing will tell us! i think we have constraints for passthrough, which is usually our toughest path :)
Does the SCK passthrough path need any of that? If we're passing it through, it shouldn't need anything except the one filtering gate, I'd think. Why bother inverting twice in a row for this combinatorial path that shouldn't change, regardless of CPOL and CPHA settings? (In other words, the passed through clock is logically separate from the internal sampling clock produced from the sck_i DIO.)
@a-will (@eunchan let me know if i misunderstood), i was just drawing this out, i think the issue are basically with the cpha/cpol combinations that capture on the negative edge of the clock.
The spi device filtering relies on observing the 8th data bit and then making the clock inactive. But our clock gating cell right now only gates to 0, if we're sampling on negative edge, the gating i think actually creates the sampling condition.
I think the double inversion (probably based on mode?) mostly gets around that problem. The other solution would be to check if we have a CG cell that gates high (i think some libraries have it) and the other would be to do the gating with primitive cells ourselves, which I don't think would be a horrible choice either. We would then however still need to statically select which gating path we're picking, which might create similar kinds of delay in the path.
actually talking to @a-will offline, he had a really interesting idea i never thought about before. What if instead of explicitly gating, we manipulate the internal pull-down/up and output enable?
So basically, we could pass the clock directly through all the time, along with a "disable" indication to spi_host
.
The spi_host, upon seeing the disable, could just "disable" the output enable drive of the clock pin.
Then as long as we statically set the internal pull-up/down to match the mode when initializing, we could get rid of the clock gating cell completely.
This of course adds a little complexity to block level DV, but it seems workable too, and would not rely on specific libraries like in my decription.
@eunchan what do you think?
that's really good I think.
Also, I happen to find that our dedicated IO has invert
mode there. We can use that for CPOL := 1 case rather than inverting inside SPI_DEVICE. ( Added: I meant SPI_DEV_CLK input pad)
CC: @msfschaffner
actually talking to @a-will offline, he had a really interesting idea i never thought about before. What if instead of explicitly gating, we manipulate the internal pull-down/up and output enable?
So basically, we could pass the clock directly through all the time, along with a "disable" indication to
spi_host
. The spi_host, upon seeing the disable, could just "disable" the output enable drive of the clock pin.Then as long as we statically set the internal pull-up/down to match the mode when initializing, we could get rid of the clock gating cell completely.
This of course adds a little complexity to block level DV, but it seems workable too, and would not rely on specific libraries like in my decription.
we have an output to spi_host - passthrough_en
. Will this serve as the "disable" indicator? So, instead of gating SIO and clock, we just turn this off when spi_device wants to filter the transaction?
i think we probably can't use that one, since that one is used more for the sw / passthrough collision. I thought that one was dropped when we hit a busy case, and we need the software to take over spi_host. We might need a separate indication.
i think we probably can't use that one, since that one is used more for the sw / passthrough collision. I thought that one was dropped when we hit a busy case, and we need the software to take over spi_host. We might need a separate indication.
spi_device_pkg::passthrough_req_t
has sck_en
which is output enable port of the PAD. we can use this.
@tjaychen Do you want me to move on with Alex's idea? That way, all SPI_DEVICE needs to do is to drop the output enable when clock supposes to be in reset position (0 if CPOL :=0, 1 if CPOL := 1). We may need a latch for output enable though.
@eunchan that sounds good to me. I remember thinking we could have problems with glitches, but is that why you're adding a latch for output enable?
@eunchan i think for this has already been merged right?
Yep. @weicaiyang could you please re-run the test and the issues are gone?
@eunchan
As discussed offline, the current fix doesn't work.
In case you want to rerun, change 0
to 3
in this line and run test - spi_device_pass_cmd_filtering
Thanks for checking ! as discussed, it looks like filter
does not maintain its value until spi_in
clock posedge. I will check and try to think what is a good & clean solution.
I think current architecture cannot support mode 3 fully. I will summarize the issue in the doc and share here later. Anyhow, it is quite hard to support sck clock gating in mode 3 due to the data bit is half clock later..
In essence, as data is shifted by half SCK, gating and enable SCK to downstream flash device is not possible for op-complete SPI Flash command.
It is not possible to gating the SCK then release CSb and enable SCK. CG to high may solve the latter (enable SCK) but it cannot meet the gating time as ‘CG to high’ enable affects only when SCK is high. That time already passed 8th SCK posedge.
I overlooked this scenario and only considered Mode 0 when designing the passthrough filtering scheme. It is my fault.
Option at this stage is "Drop mode 3 support in SPI_DEVICE". Re-designing passthrough to filter commands in mode 3 requires overhaul of the passthrough logic (CG cannot be used at all). So maybe next version can do that but it is too late at this moment.
Thanks @eunchan for clearly explaining this. Are we OK to not support mode3? @a-will @tjaychen
@eunchan / @weicaiyang sorry i'm not sure if i'm understanding right. But in mode 3, since it is negative edge launch and positive edge capture, wouldn't we want to gate low? So looking at the last negative edge (where bit 8 is launched), wouldn't we want to look at the value and then gate the clock to 0?
let me know what i misunderstood.
Is this another thing that could be aided by pad attributes? Can we make use of the inverters at both ends for mode 3? That would essentially convert mode 3 to mode 0 internally, right?
Is this another thing that could be aided by pad attributes? Can we make use of the inverters at both ends for mode 3? That would essentially convert mode 3 to mode 0 internally, right?
It isn't unfortunately. Let me create a doc to explain @tjaychen @a-will
thanks @eunchan
https://docs.google.com/document/d/1noHZ11DhFznM4cfUb55kFSR-IQvatkZDVrTgB9lIjVI/edit?usp=sharing
Can anyone grant me the access to this file?
Triaged for spi_device
. Assigning to M2.5 with https://github.com/lowRISC/opentitan/labels/Priority%3AP1 until we have confirmed with the product team that supporting mode 3 is not required. After that, we can defer this to https://github.com/lowRISC/opentitan/labels/Type%3AFutureRelease.
@cindychip has confirmed that Mode 3 is not required for M2.5 and has already updated the doc accordingly in PR #17240.
Thanks!
What's special about mode 3: SCK doesn't return to 0 at the end of the transaction, so prim_clock_gating's latch doesn't open up to changes. Consequently, a filtered command would not allow SCK to return to the required idle value of mode 3 (high, not low!).
Updated this issue to P1 so that it shows up in issue triage. I am following with use case stakeholders to determine if we can drop support for SPI-mode-3.
We may not be able to support mode 3 safely. It seems there is an insufficient number of edges to process the information, return to idle, and satisfy timing requirements for the keep-out zones around CSB edges at the downstream SPI flash.
I think we should be able to tolerate a mode 3 host, though, so long as the SPI flash device also doesn't care about the SPI_CLK idle value.
So, for both modes, we cut off the 8th posedge, and the clock gate's en_i
will be held low until the next CSB falling edge. Then, the clock gate's output will remain low until the first posedge of SPI_CLK. For mode 0, that initial output matches the ordinary idle value, but for mode 3, idle usually is high. Because the clock gate has glitch protection that prevents its output from changing while the clock input is high, mode 3's idle value does not pass through.
If the SPI flash device can handle this mixing of idle values, then the host can be mode 0 or mode 3. It's a transitive sort of tolerance.
TLDR; SCK doesn't return to idle for filtered commands in mode 3
I found we didn't set phase/polarity correctly in the device side of the SPI agent. When I tried to fix it, this issue came up.
In this waveform, the first sck is driven by host. The 2nd sck is what downstream sees. The
en_latch
is set too late and downstream misses the first falling edge of sck. Theen_latch
is affected by the previous command, depending on whether is filtered.It will be good to test the spi_device with a real flash model on the downstream port, which ensures our design compliant with the SPI flash protocol. Maybe we can do this in V3.
Command to open the waveform session: