opsroller / arduino

Automatically exported from code.google.com/p/arduino
Other
0 stars 0 forks source link

Optimized digitalWrite() function. #140

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
From Paul Stoffregen:

#define digitalPinToPort(P) \
      (((P) >= 0 && (P) <= 7) ? &PORTD : (((P) >= 8 && (P) <= 13) ? &PORTB : &PORTC))

#define digitalPinToBit(P) \
      (((P) >= 0 && (P) <= 7) ? (P) : (((P) >= 8 && (P) <= 13) ? (P) - 8 : (P) - 14))

#define digitalWrite(P, V) \
      ((__builtin_constant_p(P) && __builtin_constant_p(V)) \
      ? bitWrite(*((volatile unsigned char *) digitalPinToPort(P)), digitalPinToBit(P), (V)) \
      : _digitalWrite((P), (V)))

static inline void digitalWrite(uint8_t, uint8_t) __attribute__((always_inline, 
unused));
static inline void digitalWrite(uint8_t P, uint8_t V)
{
      (__builtin_constant_p(P) && __builtin_constant_p(V))
      ? bitWrite(*((volatile unsigned char *) digitalPinToPort(P)), digitalPinToBit(P), V)
      : _digitalWrite(P, V);
}

Original issue reported on code.google.com by dmel...@gmail.com on 11 Nov 2009 at 9:17

GoogleCodeExporter commented 9 years ago

#if !defined(__AVR_ATmega1280__)

        // Standard Arduino Pins
#define digitalPinToPortReg(P) \
        (((P) >= 0 && (P) <= 7) ? &PORTD : (((P) >= 8 && (P) <= 13) ? &PORTB : &PORTC))
#define digitalPinToBit(P) \
        (((P) >= 0 && (P) <= 7) ? (P) : (((P) >= 8 && (P) <= 13) ? (P) - 8 : (P) - 14))

#if defined(__AVR_ATmega8__)

        // 3 PWM
#define digitalPinToTimer(P) \
        (((P) ==  9 || (P) == 10) ? &TCCR1A : (((P) == 11) ? &TCCR2 : 0))
#define digitalPinToTimerBit(P) \
        (((P) ==  9) ? COM1A1 : (((P) == 10) ? COM1B1 : COM21))
#else

        // 6 PWM
#define digitalPinToTimer(P) \
        (((P) ==  6 || (P) ==  5) ? &TCCR0A : \
        (((P) ==  9 || (P) == 10) ? &TCCR1A : \
        (((P) == 11 || (P) ==  3) ? &TCCR2A : 0)))
#define digitalPinToTimerBit(P) \
        (((P) ==  6) ? COM0A1 : (((P) ==  5) ? COM0B1 : \
        (((P) ==  9) ? COM1A1 : (((P) == 10) ? COM1B1 : \
        (((P) == 11) ? COM2A1 : COM2B1)))))
#endif

#else
        // Arduino Mega Pins
#define digitalPinToPortReg(P) \
        (((P) >= 22 && (P) <= 29) ? &PORTA : \
        ((((P) >= 10 && (P) <= 13) || ((P) >= 50 && (P) <= 53)) ? &PORTB : \
        (((P) >= 30 && (P) <= 37) ? &PORTC : \
        ((((P) >= 18 && (P) <= 21) || (P) == 38) ? &PORTD : \
        ((((P) >= 0 && (P) <= 3) || (P) == 5) ? &PORTE : \
        (((P) >= 54 && (P) <= 61) ? &PORTF : \
        ((((P) >= 39 && (P) <= 41) || (P) == 4) ? &PORTG : \
        ((((P) >= 6 && (P) <= 9) || (P) == 16 || (P) == 17) ? &PORTH : \
        (((P) == 14 || (P) == 15) ? &PORTJ : \
        (((P) >= 62 && (P) <= 69) ? &PORTK : &PORTL))))))))))
#define digitalPinToBit(P) \
        (((P) >=  7 && (P) <=  9) ? (P) - 3 : \
        (((P) >= 10 && (P) <= 13) ? (P) - 6 : \
        (((P) >= 22 && (P) <= 29) ? (P) - 22 : \
        (((P) >= 30 && (P) <= 37) ? 37 - (P) : \
        (((P) >= 39 && (P) <= 41) ? 41 - (P) : \
        (((P) >= 42 && (P) <= 49) ? 49 - (P) : \
        (((P) >= 50 && (P) <= 53) ? 53 - (P) : \
        (((P) >= 54 && (P) <= 61) ? (P) - 54 : \
        (((P) >= 62 && (P) <= 69) ? (P) - 62 : \
        (((P) == 0 || (P) == 15 || (P) == 17 || (P) == 21) ? 0 : \
        (((P) == 1 || (P) == 14 || (P) == 16 || (P) == 20) ? 1 : \
        (((P) == 19) ? 2 : \
        (((P) == 5 || (P) == 6 || (P) == 18) ? 3 : \
        (((P) == 2) ? 4 : \
        (((P) == 3 || (P) == 4) ? 5 : 7)))))))))))))))

        // 15 PWM
#define digitalPinToTimer(P) \
        (((P) == 13 || (P) ==  4) ? &TCCR0A : \
        (((P) == 11 || (P) == 12) ? &TCCR1A : \
        (((P) == 10 || (P) ==  9) ? &TCCR2A : \
        (((P) ==  5 || (P) ==  2 || (P) ==  3) ? &TCCR3A : \
        (((P) ==  6 || (P) ==  7 || (P) ==  8) ? &TCCR4A : \
        (((P) == 46 || (P) == 45 || (P) == 44) ? &TCCR5A : 0))))))
#define digitalPinToTimerBit(P) \
        (((P) == 13) ? COM0A1 : (((P) ==  4) ? COM0B1 : \
        (((P) == 11) ? COM1A1 : (((P) == 12) ? COM1B1 : \
        (((P) == 10) ? COM2A1 : (((P) ==  9) ? COM2B1 : \
        (((P) ==  5) ? COM3A1 : (((P) ==  2) ? COM3B1 : (((P) ==  3) ? COM3C1 : \
        (((P) ==  6) ? COM4A1 : (((P) ==  7) ? COM4B1 : (((P) ==  8) ? COM4C1 : \
        (((P) == 46) ? COM5A1 : (((P) == 45) ? COM5B1 : COM5C1))))))))))))))

#endif

#define digitalWrite(P, V) \
        if (__builtin_constant_p(P) && __builtin_constant_p(V)) { \
                if (digitalPinToTimer(P)) \
                        bitClear(*digitalPinToTimer(P), digitalPinToTimerBit(P)); \
                bitWrite(*digitalPinToPortReg(P), digitalPinToBit(P), (V)); \
        } else { \
                _digitalWrite((P), (V)); \
        }

Original comment by paul.sto...@gmail.com on 18 Nov 2009 at 1:07

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
This can be handled as a library addition with a .h file without a .c file
see http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1267553811/0

Original comment by johnrain...@gmail.com on 4 Mar 2010 at 11:46

GoogleCodeExporter commented 9 years ago
This may be digitalWrite() itself or an additional function.

Original comment by dmel...@gmail.com on 6 May 2010 at 6:38

GoogleCodeExporter commented 9 years ago
Is there any reason not to simply replace the current digitalWrite() with this 
implementation for 1.0?  

If anyone really needs the extra speed gained by not disabling PWM on the pin, 
they could always call bitWrite(*digitalPinToPortReg(P), digitalPinToBit(P), 
(V)) directly.  Which means we wouldn't need a separate digitalWriteFast() or 
equivalent, right?

Original comment by dmel...@gmail.com on 4 Aug 2010 at 4:49

GoogleCodeExporter commented 9 years ago
See also: http://code.google.com/p/digitalwritefast/downloads/list

Original comment by dmel...@gmail.com on 21 Oct 2010 at 12:28

GoogleCodeExporter commented 9 years ago
Here's a patch from Álvaro Lopes for this.

Original comment by dmel...@gmail.com on 27 Dec 2010 at 5:36

Attachments:

GoogleCodeExporter commented 9 years ago
Álvaro has a new patch for this on the developers list, which I think is 
good-enough as a basic approach.

Original comment by dmel...@gmail.com on 7 Jan 2011 at 4:07

GoogleCodeExporter commented 9 years ago
I'm not sure if this is a bug on patch or on my code, so I'm writing here but 
also I have a forum thread if it's not a bug on patch:
http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1294597977/0

I patched my arduino core and I'm getting very strange things happening.

My led matrix code is very, very fast but my push buttons are crazy(it gets 
pressed without I press anything). I only set pinMode(3, INPUT) on setup() and 
if (digitalRead(3) == HIGH) on loop(). I have a debounce set to 200ms inside 
the if.

If I unpatch my code runs slow again(hehe) but works flawless.

Slow I mean my scrolling function on led matrix runs very slow.

Original comment by wsart...@gmail.com on 9 Jan 2011 at 7:09

GoogleCodeExporter commented 9 years ago
https://github.com/arduino/Arduino/commit/9dccd634c5ff1fed88459e39b277794dbba039
fb
https://github.com/arduino/Arduino/commit/aa1f1cbda9d6bb52785f98b40746920853d657
9b

The implementations for digitalWrite(), digitalRead(), and pinMode()
are now macros that are either inlined into the user sketch (if the
pin is a compile-time constant), or used to generate the body of the
function version of the command (non-constant pin).  For example,
digitalWrite() is:

#define digitalWrite_implementation(pin, value)\
do {\
       uint8_t oldSREG;\
       uint8_t bit = digitalPinToBitMask(pin);\
       uint8_t port = digitalPinToPort(pin);\
       volatile uint8_t *reg = portOutputRegister(port);\
\
       if (!__builtin_constant_p(pin) || registerWriteNeedsLocking(reg)) {\
               oldSREG = SREG;\
               cli();\
       }\
\
       if (value == LOW) {\
               *reg &= ~bit;\
       } else {\
               *reg |= bit;\
       }\
\
       if (!__builtin_constant_p(pin) || registerWriteNeedsLocking(reg)) {\
               SREG = oldSREG;\
       }\
} while(0)

INLINED void digitalWrite(uint8_t pin, uint8_t value)
{
       if (__builtin_constant_p(pin)) digitalWrite_implementation(pin, value);
       else digitalWrite_lookup(pin, value);
}

void digitalWrite_lookup(uint8_t pin, uint8_t val)
{
       digitalWrite_implementation(pin, val);
}

Because digitalPinToBitMask(), digitalPinToPort(), and
portOutputRegister() can choose between compile-time and run-time
lookup, the same digitalWrite_implementation() works for either case.

Here are the cases and resulting implementations:

1. pin & value are both compile-time constants -> a single cbi or sbi statement

       sbi 37-32,2

2. pin & value are both compile-time constants, but on registers that
don't work with sbi / cbi -> an inline load, or/and, store (with
interrupts disabled)

       in r25,__SREG__
       cli
       ldi r30,lo8(267)
       ldi r31,hi8(267)
       ld r24,Z
       ori r24,lo8(1)
       st Z,r24
       out __SREG__,r25

3. pin is a compile-time constant, value isn't -> cbi or sbi statement
with an inline test

       lds r24,x
       tst r24
       brne .L4
       cbi 37-32,2
       rjmp .L5
.L4:
       sbi 37-32,2
.L5:

4.  pin is a compile-time constant, but on registers that don't work
with sbi / cbi, value is not constant -> combination of cases 2 & 3.

       in r25,__SREG__
       cli
       lds r24,x
       tst r24
       brne .L4
       lds r24,267
       andi r24,lo8(-2)
       rjmp .L7
.L4:
       lds r24,267
       ori r24,lo8(1)
.L7:
       sts 267,r24
       out __SREG__,r25

5. the pin isn't a compile-time constant -> call the
digitalWrite_lookup() function (whose body is provided by the
digitalWrite_implementation() macro)

Original comment by dmel...@gmail.com on 12 Feb 2011 at 7:14

GoogleCodeExporter commented 9 years ago
Uhm...

This issue says "Milestone-1.0".
Arduino 1.0 is out.
However, the fix is definitely not in it.

What happened?

Original comment by rn3ao...@gmail.com on 1 Dec 2011 at 2:15

GoogleCodeExporter commented 9 years ago
I merged this change, but then realized that there wasn't a good way for 
someone to pass a pin number to a library at compile time, making the optimized 
functions only useful for hard-coded pin numbers in libraries or for use from 
sketches.  That made the complication of providing two separate versions of the 
pin mapping seem like a bigger burden than the potential benefit.  

If, however, there was a good way to pass pin numbers from the sketch to a 
library at compile time, then I think it would make sense to include this 
optimization.  One possibility is the approach in issue 27, but that has its 
own problems.  Recently, someone suggested using templates to pass these pin 
numbers to libraries at compile time.  I think that could be a great option.  
Does anyone have experience with that approach?  Can we use macros to hide the 
template syntax?  

Original comment by dmel...@gmail.com on 27 Dec 2011 at 8:11

GoogleCodeExporter commented 9 years ago
Even if it can't be used in non-const cases, this is still a very useful 
optimization.  Many sketches benefit.  Some libraries, like Ethernet, which 
require speed hard-code direct register access.  Wouldn't using the actual 
Arduino pin numbers be a worthwhile improvement?

Regarding libraries, the API matters.  Libraries which take pin numbers at 
runtime in a begin() function are incompatible with any compile-time 
optimization.  Only when the pin number is part of the constructor is compile 
time optimization possible.  Fortunately, many libraries are designed that way, 
but some, like SD, get pin numbers from a begin() or similar function at 
runtime.

There is some hope that future compilers could do this sort of optimization 
automatically.  Issue 660 might be one way?  It may be some time until such 
features work reliably on AVR or C++ code, but obviously gcc is developing in 
that direction.

I believe the template approach appears in Oleg's USB Host Shield library.  It 
contains a lot of complex template code which abstracts the pins.  I do not 
know if it's possible to use this in the way you want.  Perhaps someone who's a 
C++ templates expert could find a way?  My initial impression is an API where 
the pins are specified as numeric constants seems unlikely.  Pre-defined object 
names, like "pin6" would probably be needed.  Sadly, it's beyond my limited C++ 
template skill.

As a practical matter, I don't think anyone is going to create a complete and 
comprehensive solution for optimizing all pin access in all circumstances which 
you can apply to Arduino as a single commit, or a group of small commits on the 
same day.  Incremental improvements over time are much more likely.  Having the 
solution in an officially released version of Arduino, rather than languishing 
here in the issue tracker (now for over 2 years), might inspire someone to do 
further work to apply the optimization to more cases.  It would also motivate 
authors who need speed to at least use digitalWrite with numeric constants, 
rather than hard coding AVR centric register access.

In fact, had this been applied 2 years ago, today many of the libraries that 
hard-code AVR registers would very likely have hard-coded digitalWrite with 
numeric constants and therefore be portable to ARM.

Original comment by paul.sto...@gmail.com on 28 Dec 2011 at 2:53

GoogleCodeExporter commented 9 years ago
Here is a freshly written page that demonstrates the need for this optimization.

http://learn.adafruit.com/ir-sensor/using-an-ir-sensor

At the top of this code are these lines:

  #define IRpin_PIN PIND
  #define IRpin 2

If Arduino had this optimiztion, Limor probably would have written 
"digitalRead(2)", or "digitalRead(IRpin)".  But because digitalRead() is so 
notoriously slow, she resorted to direct register-level I/O.

The result is Uno-only code, **still** being contributed to the universe of 
Arduino examples in late-2012.  It won't work directly on Leonardo, it won't 
work directly on Mega, and it won't work at all on Due.  It also won't work 
directly on any Teesny boards, nor any other non-328 chip 3rd party boards, 
unless they happen to have an AVR chip with PORT D pin 2 mapped to Arduino pin 
2.

The need for this optimization is as much a human factor as a technical one.  
While Arduino-provided examples all use digitalRead(), the reality is users of 
Arduino find examples from many 3rd party sources who will use direct register 
access to avoid the slow functions.  Only when this optimization becomes part 
of an official Arduino release will they begin teaching Arduino users to use 
the official API for these timing-critical examples.

Original comment by paul.sto...@gmail.com on 20 Sep 2012 at 8:33

GoogleCodeExporter commented 9 years ago
I've just submitted a pull request on github making another attempt at this. 
https://github.com/arduino/Arduino/pull/1285

Specifically trying to avoid the "complication of providing two separate 
versions of the pin mapping" (as per comment #12.)

There's some more discussion on the pull request itself.

Original comment by g...@projectgus.com on 19 Feb 2013 at 2:36

GoogleCodeExporter commented 9 years ago
Sorry it this is late or off-topic... From my experiments the best option seems 
to provide port-register address and pin mask as input parameter to the 
digitalRead/Write functions rather than computing it from pin number. 
For compile-time constant pin numbers this results in single instruction (or 
few instructions if interrupts need to be disabled) - same as the macro. 
For pins stored in a variable it takes about half of the time of digitalWrite 
now used in Arduino. 
It can be used in the same way as current Arduino functions (internally mapping 
the normal pin numbers to their 'pin codes' which contain the register address 
and mask). I have a working version for Arduino Uno and Mega (attached). There 
are more details in my article on Codeproject:  
http://www.codeproject.com/Articles/732646/Fast-digital-I-O-for-Arduino. 

Original comment by jdolina...@gmail.com on 2 Apr 2014 at 10:10

Attachments: