Closed LeonardMH closed 5 years ago
I think I have proven that the issue is due to the busy bit being checked too quickly. I have vendored this library so I can edit it and made the following change:
diff --git a/vendor/tm4c123x-hal/src/i2c.rs b/vendor/tm4c123x-hal/src/i2c.rs
index 6772aa3..d39ec3a 100644
--- a/vendor/tm4c123x-hal/src/i2c.rs
+++ b/vendor/tm4c123x-hal/src/i2c.rs
@@ -1,5 +1,7 @@
//! Inter-Integrated Circuit (I2C) bus
+use cortex_m::asm::delay;
+
use tm4c123x::{I2C0, I2C1, I2C2, I2C3};
use crate::gpio::gpioa::{PA6, PA7};
@@ -147,6 +149,8 @@ macro_rules! hal {
.run().set_bit()
});
+ delay(5000);
+
for (i,byte) in (&bytes[1..]).iter().enumerate() {
busy_wait!(self.i2c, busy, bit_is_clear);
With this change, this is what I see on the wire:
Obviously, that delay is ridiculous, but I was still seeing bad results with lower delay values like 10, or 100. Searching for a lower bound now.
It appears that a single nop
is sufficient here (although maybe that is dependent on the slave device). My previous issues with using small delay
values were just due to me not correctly reloading the binary between test runs. Current diff has changed to:
diff --git a/vendor/tm4c123x-hal/src/i2c.rs b/vendor/tm4c123x-hal/src/i2c.rs
index 6772aa3..c002347 100644
--- a/vendor/tm4c123x-hal/src/i2c.rs
+++ b/vendor/tm4c123x-hal/src/i2c.rs
@@ -1,5 +1,7 @@
//! Inter-Integrated Circuit (I2C) bus
+use cortex_m::asm::nop;
+
use tm4c123x::{I2C0, I2C1, I2C2, I2C3};
use crate::gpio::gpioa::{PA6, PA7};
@@ -147,6 +149,8 @@ macro_rules! hal {
.run().set_bit()
});
+ nop();
+
for (i,byte) in (&bytes[1..]).iter().enumerate() {
busy_wait!(self.i2c, busy, bit_is_clear);
I have also changed my write buffer to [0x55, 0, 1, 2, 3]
to better distinguish the first byte from the second. This results in the following trace:
In this trace there is a ~50-100us delay between the ADDR + DATA0 phase and the DATA1 byte that didn't exist in my original 'release' trace; I think that may just be clock stretching by the device that I am talking to (I have changed slave devices since my original report).
If I remove the nop
instruction I lose the first data byte, and the second data byte is delivered in its place, as you can see here:
The delay between ADDR + DATA0 and DATA1 is smaller in this case, but again, I think that is just a function of the device I am talking to responding differently to the DATA0 value.
So, @thejpster from a rust/embedded_hal perspective what is the right way to handle this?
This appears to affect all of the i2c methods in one way or another. I have moved the nop
instruction to be the top line of the busy_wait
macro and that appears to give me much better results. I'm not seeing any performance hit from this, but I think my slave device is the limiting factor right now. I will have to test with a more performant device to fully characterize the impact.
Thank you for the very detail diagnosis. Very impressive!
It sounds like adding a no-op to the busy wait routine is the cleanest fix.
Thanks, I know the pain of vague bug reports and try not to inflict that on others 😃.
I agree that moving the nop into busy_wait is the best choice. I'll open a PR later this week when I get a chance to do some more testing with a faster slave device.
I also wonder how/if the delay required would be affected by higher I2C clock speeds. I'm trying to run at 400kHz but seeing something more like to 180kHz on the wire I think due to the TPR calculation being very conservative, I may try hand tweaking that and retesting. There's also always the possibility that someone is using the "fast plus" mode (1MHz) or the "high speed" mode (3.4Mhz), but I don't have any slave devices that operate at those speeds... I don't expect that the I2C clock speed is actually a factor anyways.
On another note, where is a good place to ask questions related to this crate (+ the tm4c123x crate)? I have a few:
I'd like to contribute some documentation, specifically around:
What would be the best way to contribute this? Add to the README? Expand the rustdoc comments in the sources?
Is it at all possible to use the interrupt!
macro to mark a function as an interrupt handler? I tried every method I could think of to import this from the tm4c123x crate but rust couldn't find it. I also can't find this in the docs, I had to refer to the source code.
I'll file another issue for this, since I'm starting to think this is an actual crate issue, and the correct fix will require some discussion.
In my testing, if I power down the slave device while the TIVA is in the middle of a transaction the SCL line is held low, restoring power to the slave device doesn't help anything. I've confirmed that the TIVA is the one holding the line and that it is stuck attempting to start a new transaction but continuously erroring out. I have a few potential solutions, but nothing clean yet.
Ehh, this is kind of a big question and needs code samples so I'll open another issue for this one.
EDIT: Got this figured out at least for I2C/SPI, GPIOs should be similar I'm sure, so probably nothing to do here. I just needed a little bit better understanding of rust Traits/Generics/Macros.
where is a good place to ask questions related to this crate?
Feel free to open issues, on this crate or on https://github.com/m-labs/dslite2svd.
Is it at all possible to use the interrupt! macro to mark a function as an interrupt handler?
Yes, see https://github.com/thejpster/monotron for an example. Look for TIMER1A and TIMER1B.
Yes, see https://github.com/thejpster/monotron for an example. Look for TIMER1A and TIMER1B.
Awesome, thanks! I was hoping to get SPI and UART working this weekend as well so there is a treasure trove of information there.
I still haven't been able to solve this question on how to use generic GPIOs, and referring to the monotron source I see you have the same problem, for example:
struct Joystick {
up: hal::gpio::gpioc::PC6<hal::gpio::Input<hal::gpio::PullUp>>,
down: hal::gpio::gpioc::PC7<hal::gpio::Input<hal::gpio::PullUp>>,
left: hal::gpio::gpiod::PD6<hal::gpio::Input<hal::gpio::PullUp>>,
right: hal::gpio::gpiod::PD7<hal::gpio::Input<hal::gpio::PullUp>>,
fire: hal::gpio::gpiof::PF4<hal::gpio::Input<hal::gpio::PullUp>>,
}
This is what I have been doing as well, but it feels suboptimal. I want my driver (in this case Joystick
) to be completely generic on the type over a certain GPIO configuration. The first syntax I tried (and what I would consider "nice") was something like:
use hal::gpio;
type PullUpInput = gpio::Input<gpio::PullUp>;
struct Joystick {
up: PullUpInput,
down: PullUpInput,
...
}
impl Joystick {
fn new(up: PullUpInput, down: PullUpInput, ...) -> Joystick {
Joystick { up, down }
}
}
And then let's say you would use this with:
let pe = dp.GPIO_PORTE.split(&sysctl.power_control);
let mut joy = Joystick::new(pe.pe4.into_pull_up_input(), pe.pe5.into_pull_up_input();
But if you wanted to, you could also do this instead without any need to change the definition of Joystick:
let pa = pa.GPIO_PORTA.split(&sysctl.power_control);
let mut joy = Joystick::new(pa.pa5.into_pull_up_input(), pa.pa6.into_pull_up_input();
But the compiler doesn't let me do this, it complains that gpio::Input<gpio::PullUp>
doesn't implement the Sized trait. I also tried implemented this completely generically with something like (sorry going off of memory):
use embedded_hal::digital;
struct Joystick<UP, DN> {
up: UP,
down: DN,
...
}
impl <UP, DN> Joystick<UP, DN>
where UP: digital::InputPin,
DN: digital::InputPin,
{
...
}
But this also didn't work (I can't remember why right now, a similar mechanism works for I2C and SPI just fine), it was also starting to quickly get unwieldy so I gave up.
I'm essentially trying to use the TIVA to interface to two different pieces of hardware that have the same top-level behavior, but use different pinouts for I2C, SPI, and some GPIOs. Being able to specify the hardware layout at one level and then never think about it again would be really nice, but I haven't been able to figure out how to do this for GPIOs so I've resorted to compile time options. Is it possible to do what I'm wanting to do?
Yes, the traits are unsized - you can only use trait references, or concrete type T that implements some trait. I think your last example is the right way to go, but InputPin needs a <MODE>
or something, with an appropriate qualifier on MODE.
I'll open a PR later this week.
Just to follow up on this, do you still plan to submit a PR, or shall I open one?
I still plan to submit a PR. We’re seeing this exact issue on a different TI part and I’m working on getting a more official recommendation on how much delay is needed between setting the RUN bit and checking the BUSY bit.
EDIT: With that said, if you’re trying to bundle fixes for a release then you can probably just go ahead with the single nop as it seems to work fine in my testing so far.
@thejpster, I'm told that 8 clock cycles between setting the RUN bit and checking the BUSY bit should be sufficient. Working on a PR now.
PR https://github.com/thejpster/tm4c-hal/pull/19 opened to resolve this issue.
PR #19 merged
Environment
I am using a Texas Instruments TIVA-C LaunchPad (EK-TM4C123GXL). I have used the cortex-m-quickstart project to bootstrap.
This is my entire main.rs:
This is my .cargo/config:
This is the relevant section of my Cargo.toml:
Problem
With the proper I2C lines instrumented, when I perform
cargo clean && cargo run --release
I see the following waveform for a single transaction (expect:S + ADDR + 0, 0, 1, 2, 3 + P
):It looks great, performance-wise, but it is missing the first byte! When I perform
cargo clean && cargo run
I see the following waveform for a single transaction:It is poor from a performance standpoint, but gets all of the bytes in there.
Suggestions
I don't yet know for sure what is going on here. I'm relatively new to rust and llvm (this feels like maybe something llvm is doing). The I2C code in
tm4c123x_hal::i2c
appears to be correct based on comparison with other C-based Launchpad projects which I know are working.The only lead I have right now will be to start looking at the I2C peripheral. I have seen issues in the past on this peripheral where checking the
busy
bit too soon after setting therun
bit would result in thebusy
bit not being set by the time that I checked it. I've never been able to get an answer on how many cycles are actually required between setting therun
bit and checking thebusy
bit, but right now I suspect that to be the issue.I will be playing around with this some more this weekend and if I come across anything I will update, in the meantime if anyone has suggestions, they would be very much appreciated.