Closed mlthlschr closed 1 year ago
Quite interesting problem, I learned something about alignment. Seems like data
s alignment is not compatible with the one required for u32s (to which the slice's pointer is casted in esp_hal_common::spi::Instance::write_bytes
).
For now I did a workaround with a helper struct #[repr(align(32))] struct AlignedBuffer([u8;64])
which aligns correctly. For the meantime it is OK, but would be cool to not depend on that.
Thanks for bringing this up! I also think this hidden alignment requirement is very unfortunate and should get addressed
TL;DR: core::intrinsics::volatile_copy_nonoverlapping_memory::<u32>
(and all it's variants) are broken when len<16
or if any parameter is not aligned (I think). *mut u32::volatile_write
does work (afaik) but requires additional handling since you are always writing a full u32.
Hey was just reading through the code to figure out if QuadSPI/OctalSPI was already supported and found this issue through https://github.com/esp-rs/esp-hal/blob/23690a447217c02be682bdd2519056e661b72f4f/esp-hal-common/src/spi.rs#L1422
I encountered the same issue when implementing SHA
, there I implemented the AlignmentHelper https://github.com/esp-rs/esp-hal/blob/23690a447217c02be682bdd2519056e661b72f4f/esp-hal-common/src/sha.rs#L45 which is effectively just a u32 write buffer. I tried many workarounds (copy,volatile_copy) but the only thing that worked reliably was to use a *mut u32::volative_write
in a loop like in https://github.com/esp-rs/esp-hal/blob/23690a447217c02be682bdd2519056e661b72f4f/esp-hal-common/src/sha.rs#L140-L143. What it seemed like from the assembly is that the small copies are also optimized out (len<16 => no memcpy) and then break.
I think that loop was what we initially had in the SPI driver - only downside IMHO is that in opt-level 0 it might be a bit slow but with opt-level >0 it's totally fine
@MabezDev @jessebraham any thoughts on this? I'd say we should replace volatile_copy_nonoverlapping_memory
in all those places
I think it's probably a good idea to make those changes, yeah. The only downside you listed is, IMO, not even really much of an issue in practice.
Not really expressing much of an opinion, but in esp-idf, they return an error/panic if the buffer is not correctly aligned and doesn't have a length that is a multiple of a word. This pushes the requirements to the user of the API in a clear, albeit slightly annoying way (runtime error).
I actually encountered a miscompilation relating to this.
for byte in frame {
let b = &[byte.reverse_bits()];
spi.write(b)?;
// core::hint::black_box(b);
}
Without the black box, only 0xFF
s get sent.
After adding the black box the correct bytes get sent.
As far as I understand, there's actually three problems with the current code:
volatile
chunk
Thanks for the insights. This definitely needs to get fixed (for all the reasons you mentioned)
Regarding the mis-compilation: on which targets do you see this?
Not really expressing much of an opinion, but in esp-idf, they return an error/panic if the buffer is not correctly aligned and doesn't have a length that is a multiple of a word. This pushes the requirements to the user of the API in a clear, albeit slightly annoying way (runtime error).
I think panicking / returning an error wouldn't be a really good option since those requirements are not requirement by the EH traits and would potentially prevent users from using existing drivers 🤷♂️
For our own APIs however, that might be okay if we document that clearly
Regarding the mis-compilation: on which targets do you see this?
.cargo/config.toml:
[build]
rustflags = [
"-C", "link-arg=-nostartfiles",
"-C", "link-arg=-Wl,-Tlinkall.x",
]
target = "xtensa-esp32s3-none-elf"
[unstable]
build-std = ["core"]
In my case the miscompilation went away after changing the code to:
for i in 0..FIFO_SIZE / 4 {
unsafe {
core::ptr::write_volatile(
(fifo_ptr as *mut u32).add(i),
*(chunk.as_ptr() as *const u32).add(i),
);
}
}
but this does not address the alignment or out of bounds read problem
Hi, I am trying to write a small driver for an SPI device with registers. For that, I have the following function to write the instruction and then read the data:
However, it crashes on an ESP32-C3 with
Do you know what is happening here? I am using not a line of unsafe code (in the code written by me) before calling this method.
I tried to read the register at a different place in the lib using
spi.transfer()
and there it works without problems.Another observation: when I change the code inside the closure to
it does not crash. Using
let cmd = [cmd;1];
however also crashes.Update 1: this observation is not always true. Sometimes it does not work...
Update 2: this also happens when using
transfer_in_place()