avr-rust / delay

arduino-like delay routines based on busy-wait loops
Apache License 2.0
15 stars 11 forks source link

Fix integer overflow in microseconds delay, typos #16

Closed rkarabut closed 2 years ago

rkarabut commented 3 years ago

delay_ms() and delay_us() functions don't work correctly (check https://www.reddit.com/r/rust/comments/mjkp9v/arduino_pro_mini_avr_delay_delay_ms_doesnt_work/ for a number of complaints), also making the reference avr-rust implementation broken as-is. The reasons seem to include the possible integer overflow at the line 63 at the requested delay of > ~4.3 seconds.

This pull request changes the math in delay_us(), alleviating the problem. As the overflow could still happen with required us >= 2**30, or at about just 1073 seconds, which doesn't look good enough, I also changed the relevant parameters to u64. Tested on a physical Arduino Uno with seconds-range values.

lord-ne commented 3 years ago

I think a better method would be to switch the input to delay to u64, but leave the inputs to delay_ms and delay_us as u32. This would prevent overflow entirely, and still let's you delay for over 50 days using delay_ms.

In fact, we can switch the input to delay to u64, but only actually use 32 bits of it for outer_count (so the least significant 16 bits become last_count, the middle 32 bits are outer_count, and the most significant 16 are ignored), and we still won't be able to overflow if we leave the inputs to delay_ms and delay_us as u32 (as long as the clock speed is less than 262 MHz, which I believe is higher than any AVR microcontroller uses). Setting delay itself to private, or at least nesting it in some module, would prevent misuse.

rkarabut commented 3 years ago

This isn't correct. With a u32 input to delay_us it would still be restricted to 2**32 / 1000000 = 4294 seconds which is only over an hour. delay_ms could be left at u32, but I feel it would be better semantically to also change it to u64 for consistency. Also, delay shouldn't be set to private as it's bound to be used externally (and in fact is used already, in ruduino). It's explicitly mentioned as an exported function.

In fact, we can switch the input to delay to u64, but only actually use 32 bits of it for outer_count (so the least significant 16 bits become last_count, the middle 32 bits are outer_count, and the most significant 16 are ignored), and we still won't be able to overflow if we leave the inputs to delay_ms and delay_us as u32 (as long as the clock speed is less than 262 MHz, which I believe is higher than any AVR microcontroller uses).

I'm afraid I don't understand the reasoning for this.

lord-ne commented 3 years ago

This isn't correct. With a u32 input to delay_us it would still be restricted to 2^32 / 1000000 = 4294 seconds which is only over an hour.

Right, but users could get a longer delay by using delay_ms instead of delay_us, up to 2^32/1,000 seconds (actually [2^32-1]/1000). We would just need to make sure that delay_ms directly calls delay instead of going through delay_us.

Also, delay shouldn't be set to private as it's bound to be used externally (and in fact is used already, in ruduino).

All ruduino does is re-export the module, I don't think it's actually used internally.

It's explicitly mentioned as an exported function.

I understand that making it private would be a breaking API change (although then again, so is changing its argument type). I do agree with you that it should probably still be available to end users, but at the same time I don't think it should be in the top-level module. delay is a pretty ambiguously named function (it has the same name as the Arduino function which does what our delay_ms does), and I'm worried that people accidentally using this internal function could be a source of bugs. Since we're already making a breaking API change by changing its argument type, I think it's a good opportunity to move delay to some kind of internal_delay_functions module. We can leave it public, but this makes it clearer that users generally shouldn't need to use it (I can't really think of any examples where they would need to use it).

I'm afraid I don't understand the reasoning for this.

If we make delay, delay_ms, and delay_us all use u64s, we still potentially have a problem with overflow if the user passes in a massive u64 to delay_ms, or delay_us for some reason. Since there's no need for delays longer than

By making delay use a u64 instead of a u32, we're introducing more overhead to our function, overhead which makes small delays (on the order of a few microseconds) less accurate. On the other hand, if the argument is a u64 but we only actually loop over a u32,then that isn't as much overhead. There's not reason that delay should ever be called with an argument greater than (2^48 - 1), since even (2^32 - 1) milliseconds won't be more than (2^48 - 1) iterations, and we don't expect users to be calling this function directly very often. And since the outer loop doesn't care about the lower 16 bits either (since those go to last_count), we don't actually need to loop over the full u64, a u32 is sufficient. This eliminates some of the overhead, which is good.

tl;dr By ignoring the top 16 bits, we have less overhead, and it's fine to ignore them since: a) we don't actually need those top 16 bits when calling from delay_ms and delay_us, and b) we don't expect users to be calling this function directly unless they know what they're doing.

lord-ne commented 3 years ago

To get a better sense of what I mean, could you take a look at my pull request? Here: https://github.com/avr-rust/delay/pull/17

rkarabut commented 3 years ago

I disagree with most of this, except for the API changes being breaking (on that a bit further). In particular, I don't think your argument for making delay parameter a compound is sound, and I don't see any real reason to do this. What we probably should do if saving the API at this point is indeed deemed important is make another exported function (maybe delay_long or something) in addition to the existing one so that people who need delays longer than u32 could still use it, and document it as such.

If you indeed registered a noticeable overhead for small delays while calling delay(u64), we then should use the internal u64 function in delay_ms/delay_us only when the argument would actually exceed the u32 limit (which should make the overhead negligible anyway). The delay function itself doesn't really need any major changes otherwise.

Also, if the delay_ms/delay_us u32 signatures also should stay for backwards compatibility, the possibility of the overflow in them should be explicitly denied, probably clamping the value to the maximum possible one instead, and this as well should be documented.

lord-ne commented 3 years ago

making delay parameter a compound is sound

What do you mean by a compound? I'm not familiar with that terminology.

make another exported function (maybe delay_long or something) in addition to the existing one so that people who need delays longer than u32 could still use it, and document it as such.

I really can't think of a scenario where a user would actually want to call delay directly, rather than delay_ms or delay_us, especially for a really long delay. That's why I think it should be deprecated at some point. So I don't think there's a need for

Also, if the delay_ms/delay_us u32 signatures also should stay for backwards compatibility, the possibility of the overflow in them should be explicitly denied, probably clamping the value to the maximum possible one instead, and this as well should be documented.

If our internal delay uses 48 bits of a u64, then it's already impossible* for delay_ms/delay_us to overflow if they only accept u32s.

*(262 MHz, the clock speed needed to cause an overflow, is about five to ten times as fast as any 8-bit AVR chip is actually capable of operating)


Basically, I want to think of the internal delay function (the one taking a u64 argument) as purely an internal function, and so because of that I don't see a reason to actually loop using a u64 when we only need to use a u32 because we know that delay_ms/delay_us can never pass in a number greater than (2^48 - 1).

Having said that, I think it would be more consistent with my position to actually make the internal functions totally private, instead of just in a public module, and only leave the old u32 delay function exposed. So I'll do that in my PR.

rkarabut commented 3 years ago

Ah, okay, I did in fact misunderstand your explanation of the delay(u64) function, sorry about that. Right, in case all the external functions stay at accepting u32 this would be fine, as the limits on the inputs would be expected. I do agree with you on leaving the API completely as is and excluding the possibility of the overflow this way. This PR should be closed when yours is accepted then.

stappersg commented 2 years ago

I'm about to merge this. However merge conflict prevented it. Manual labor got us to branch karabut.

@rkarabut is it OKay that the reworked merge happens?

It are these commits

stappers@myhost:~/src/rust/RustAVR/delay
$ git format-patch master
0001-typos.patch
0002-Fix-microsecons-delay.patch
0003-Expand-count-parameters-to-u64.patch
stappers@myhost:~/src/rust/RustAVR/delay
$ head -n 200 *.patch
==> 0001-typos.patch <==
From 78d0cece30416de0dde7a156d72acc46148faad1 Mon Sep 17 00:00:00 2001
From: Ratmir Karabut <rkarabut@gmail.com>
Date: Wed, 8 Jun 2022 16:40:12 +0200
Subject: [PATCH 1/3] typos

---
 src/lib.rs | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/lib.rs b/src/lib.rs
index c48621a..dd936c2 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -50,7 +50,7 @@ pub fn delay(count: u32) {
     }
 }

-///delay for N miliseconds
+///delay for N milliseconds
 /// # Arguments
 /// * 'ms' - an u32, number of milliseconds to busy-wait
 #[inline(always)]
@@ -62,7 +62,7 @@ pub fn delay_ms(ms: u32) {

 ///delay for N microseconds
 /// # Arguments
-/// * 'ms' - an u32, number of microseconds to busy-wait
+/// * 'us' - an u32, number of microseconds to busy-wait
 #[inline(always)]
 pub fn delay_us(us: u32) {
     // picoseconds
-- 
2.36.1

==> 0002-Fix-microsecons-delay.patch <==
From 790d9e996140239bad70b38a7394d84dac164a42 Mon Sep 17 00:00:00 2001
From: Ratmir Karabut <rkarabut@gmail.com>
Date: Wed, 8 Jun 2022 18:39:59 +0200
Subject: [PATCH 2/3] Fix microsecons delay

---
 src/lib.rs | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/lib.rs b/src/lib.rs
index dd936c2..eee5c3c 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -65,10 +65,8 @@ pub fn delay_ms(ms: u32) {
 /// * 'us' - an u32, number of microseconds to busy-wait
 #[inline(always)]
 pub fn delay_us(us: u32) {
-    // picoseconds
-    let ps = us * 1000;
-    let ps_lp = 1000000000 / (avr_config::CPU_FREQUENCY_HZ / 4);
-    let loops = (ps / ps_lp) as u32;
+    let us_lp = avr_config::CPU_FREQUENCY_HZ / 1000000 / 4;
+    let loops = (us * us_lp) as u32;
     delay(loops);
 }

-- 
2.36.1

==> 0003-Expand-count-parameters-to-u64.patch <==
From 1c4d61565e36708447ba790f84a7f81b41b05c6c Mon Sep 17 00:00:00 2001
From: Ratmir Karabut <rkarabut@gmail.com>
Date: Wed, 8 Jun 2022 18:53:06 +0200
Subject: [PATCH 3/3] Expand count parameters to u64

to accomodate values > ~1000s
---
 src/lib.rs | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/lib.rs b/src/lib.rs
index eee5c3c..da57eb3 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -25,9 +25,9 @@ use core::arch::asm;

 /// Internal function to implement a variable busy-wait loop.
 /// # Arguments
-/// * 'count' - an i32, the number of times to cycle the loop.
+/// * 'count' - a u64, the number of times to cycle the loop.
 #[inline(always)]
-pub fn delay(count: u32) {
+pub fn delay(count: u64) {
     // Our asm busy-wait takes a 16 bit word as an argument,
     // so the max number of loops is 2^16
     let outer_count = count / 65536;
@@ -52,9 +52,9 @@ pub fn delay(count: u32) {

 ///delay for N milliseconds
 /// # Arguments
-/// * 'ms' - an u32, number of milliseconds to busy-wait
+/// * 'ms' - an u64, number of milliseconds to busy-wait
 #[inline(always)]
-pub fn delay_ms(ms: u32) {
+pub fn delay_ms(ms: u64) {
     // microseconds
     let us = ms * 1000;
     delay_us(us);
@@ -62,11 +62,11 @@ pub fn delay_ms(ms: u32) {

 ///delay for N microseconds
 /// # Arguments
-/// * 'us' - an u32, number of microseconds to busy-wait
+/// * 'us' - an u64, number of microseconds to busy-wait
 #[inline(always)]
-pub fn delay_us(us: u32) {
-    let us_lp = avr_config::CPU_FREQUENCY_HZ / 1000000 / 4;
-    let loops = (us * us_lp) as u32;
+pub fn delay_us(us: u64) {
+    let us_in_loop = (avr_config::CPU_FREQUENCY_HZ / 1000000 / 4) as u64;
+    let loops = us * us_in_loop;
     delay(loops);
 }

-- 
2.36.1

stappers@myhost:~/src/rust/RustAVR/delay
$ 
rkarabut commented 2 years ago

@stappersg Fine by me 👍

lord-ne commented 2 years ago

I still have my concerns about this PR that I expressed last year. Namely, there's no point making delay_ms and delay_ms take u64 arguments, since that doesn't actually prevent overflow (since they're public functions, the user can just pass in massive numbers if they feel like it). Instead, they should both take a u32 argument, and then call a private internal delay function that takes a u64, like in my PR.

stappersg commented 2 years ago

On Thu, Jun 09, 2022 at 01:03:16AM -0700, lord-ne wrote:

I still have my concerns about this PR ... like in my PR.

Acknowledge on that.

For what it is worth:

lord-ne commented 2 years ago

@stappersg Okay, I understand. Just be aware that I fixed all the merge conflicts in #17 last night.

stappersg commented 2 years ago

Just be aware that I fixed all the merge conflicts in https://github.com/avr-rust/delay/pull/17 last night.

That work is now in branch rewrite.

Fine by me +1

Merged and pushed, thanks for your contribution to this project.