MCUdude / MicroCore

A light-weight Arduino hardware package for ATtiny13
546 stars 87 forks source link

Accurate timing not implemented #2

Closed MCUdude closed 8 years ago

MCUdude commented 8 years ago

Accurate timing is not implemented yet. The delay() function is based on the AVR _delay_ms() function, and is highly inaccurate. The millis() function is also inaccurate, and one millisecond is actually (with a 9.6 MHz clock) (9600000/256)/37 = 1013.51 ms. the 13.5 ms should be taken into the count. I'm really grateful if someone will help me out with this! Have you any experience with these weird clock frequencies @SpenceKonde?

It's important that the new code is as small and optimized as possible. When given only 1024 bytes, every byte counts!

SpenceKonde commented 8 years ago

What I did for weird clock support is here: https://github.com/SpenceKonde/ATTinyCore/blob/master/avr/cores/tiny/wiring.c lines 221 - 267 - that has hand-chosen combinations of bitshifts to make micros() work (and delay in my core depends on micros(), like the official core) - and also some relevant comments.

millis() normally "just works" (it's delay and micros that get busted normally because of the integer division on line 105 in the official core: https://github.com/arduino/Arduino/blob/master/hardware/arduino/avr/cores/arduino/wiring.c ). It may be that some additional optimizations to save flash make millis worse. But in that case, you'd think the person who did that optimizing would have known what speed the '13 was clocked at, and planned for that...

MCUdude commented 8 years ago

Thanks! I'll dive into it!

I was just thinking; wouldn't it be possible to switch between dividing by 37 and 38 each time in the mills() function?

(9600000/256)/37 + (9600000/256)/38 = 2000,355618777, which is much closer

SpenceKonde commented 8 years ago

Re-read the issue again, and realized you already had it coming out to within 1.3%. The oscillator is only accurate to +/- 10%, this is silly.

37+(ovrf&1)? 

Most of my bitshift combos are worse than your 1.x % - though i think millis comes out better than that. It's a matter of how much you're willing to trade off for the accuracy. In your case, you've got very little flash space, and in my case, I needed to get micros() to execute fast enough to use. How much will you sacrifice at the altar of accurate timing?

MCUdude commented 8 years ago

You're right, the internal oscillator is the least precise part here. Actually, the most important task is to make delay() fairly accurate. Right now it's based on the air-libc' _delay_ms(), where 1 ms actually is more like 1.4ms.

SpenceKonde commented 8 years ago

Oh eew! Yeah, that's a problem.

What about

unsigned long start=ovrf; while (start-ovrf < delay*37) {;}

?

nerdralph commented 8 years ago

_delay_ms() and _delay_us() are accurate to the cycle, since early versions of avr-gcc 4.x. `

include <avr/io.h>

include <util/delay.h>

void main() { _delay_us(10); PORTB = 0; } `

With avr-gcc 4.9.2, -mmcu=attiny13 -DF_CPU=9600000, I get: 00000022 <main>: 22: 80 e2 ldi r24, 0x20 ; 32 24: 8a 95 dec r24 26: f1 f7 brne .-4 ; 0x24 <main+0x2> 28: 18 ba out 0x18, r1 ; 24 2a: 08 95 ret

Now I have found the RC oscillator to be quite variable on the tiny13, with some of them measuring as low as 9.4Mhz instead of 9.6.

I see github's "insert code" function sucks, but you should be able to get the idea even though the above code is messed up.

MCUdude commented 8 years ago

Thanks! I'll fire up my signal generator, set it to 9.6 MHz and test how "off" the timing really is with a proper clock source, before doing anything 😉

MCUdude commented 8 years ago

Today I hooked up my signal generator to pin PB3 on the ATtiny13, enabled the external clock fuse bit, and ran this sketch:

// Monitoring the switching speed on PB2 using delay()

void setup() 
{
  pinMode(2, OUTPUT);
}

// the loop function runs over and over again forever
void loop() 
{
  digitalWrite(2, HIGH);   
  //PORTB |= _BV(PB2); // <-- Same result

  delay(1);
  //_delay_ms(1);      //  <-- Same result

  digitalWrite(2, LOW);    
  //PORTB &= ~_BV(PB2);//  <-- Same result

  delay(1);
  //_delay_ms(1);      //  <-- Same result  
}

My setup looks like this: Setup 1 Setup 2 Setup 3

Here's a screenshot from my scope of the clock signal: Clock signal 9.6 MHz

And here's a screenshot of the actual signal out from PB2 (digital pin 2): Signal out

AS you can see from the readings in the second screenshot, delay(1) is actually 1.3 ms. The code was build using Arduino IDE 1.6.9, which uses AVR GCC v4.8 (IIRC).

nerdralph commented 8 years ago

I suspect F_CPU is incorrectly set. Try adding

define F_CPU 9600000

On Wed, Sep 21, 2016 at 10:23 AM, Hans notifications@github.com wrote:

Today I hooked up my signal generator to pin PB3 on the ATtiny13, enabled the external clock fuse bit, and ran this sketch:

// Monitoring the switching speed on PB2 using delay() void setup() { pinMode(2, OUTPUT); } // the loop function runs over and over again forevervoid loop() { digitalWrite(2, HIGH); //PORTB |= _BV(PB2); // <-- Same result

delay(1); //_delay_ms(1); // <-- Same result

digitalWrite(2, LOW); //PORTB &= ~_BV(PB2);// <-- Same result

delay(1); //_delay_ms(1); // <-- Same result }

My setup looks like this: [image: Setup 1] https://camo.githubusercontent.com/caa5f693078e675fc136cd2253f72549b7a0d35f/687474703a2f2f692e696d6775722e636f6d2f704161354e39692e6a7067 [image: Setup 2] https://camo.githubusercontent.com/7cfb6c1ebc5d493815f42f5808093e7c7636521e/687474703a2f2f692e696d6775722e636f6d2f4a38383355454a2e6a7067 [image: Setup 3] https://camo.githubusercontent.com/edb7070c2f60722c439baf8745fd1885343d13ab/687474703a2f2f692e696d6775722e636f6d2f475468494d65662e6a7067

Here's a screenshot from my scope of the clock signal: [image: Clock signal 9.6 MHz] https://camo.githubusercontent.com/5138994cd4e8fa46f3ae0dfd8529ac451584af26/687474703a2f2f692e696d6775722e636f6d2f35614f584c34762e706e67

And here's a screenshot of the actual signal out from PB2 (digital pin 2): [image: Signal out] https://camo.githubusercontent.com/f14e6a4864ae3261fbd1a66960ef943ffed0cbad/687474703a2f2f692e696d6775722e636f6d2f536a457575336c2e706e67

AS you can see from the readings in the second screenshot, delay(1) is actually 1.4 ms. The code was build using Arduino IDE 1.6.9, which uses AVR GCC v4.8 (IIRC).

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/MCUdude/MicroCore/issues/2#issuecomment-248610878, or mute the thread https://github.com/notifications/unsubscribe-auth/AFTc-jGc5tl56wHgXeiIBSQnHCwZg3Elks5qsS_pgaJpZM4KBTeN .

MCUdude commented 8 years ago

Nope, that didn't work either, still ~1.3 ms.

I tried to run the ATtiny13 at 4.8 MHz external clock too, and it's spot on 1.3 ms as well

nerdralph commented 8 years ago

for _delay_ms()? I could see delay() not changing since it's an external linked function.

I've been using _delay_us and _delay_ms for ages, and know they work accurately. Either something in your build environment is wrong or your MCU is broken. You could verify the external clock is working by enabling PWM output, and check that the timing is a proper division of 9.6Mhz.

On Wed, Sep 21, 2016 at 10:47 AM, Hans notifications@github.com wrote:

Nope, that didn't work either. still ~1.3 ms

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/MCUdude/MicroCore/issues/2#issuecomment-248617589, or mute the thread https://github.com/notifications/unsubscribe-auth/AFTc-pY0XkR3qpV3rXienvKsLt81NTvPks5qsTWIgaJpZM4KBTeN .

MCUdude commented 8 years ago

The result is the same for _delay_ms as well. Here's a screenshot of the PWM signal from PB0 (Digital pin 0). I used the analogWrite function, analogWrite(0, 127);

PWM

MCUdude commented 8 years ago

Wait a second! applying cli(); in the setup, the pulse with is now exactly 1ms!

How can that be? millis/micros?

EDIT: Yes, it's the interrupt routine on timer0 thats causing the 0.3ms delay. Disabling mills from the core_settings.h also fixes the issue. Now how can the interrupt routine be optimized to make this delay a minimum?

nerdralph commented 8 years ago

Wait a second! applying cli(); in the setup, the pulse with is now exactly 1ms!

Well, if you have a terribly slow and poorly written millis() ISR (like the stock Arduino one), that could explain it.

I looked at optimizing millis() several months back and have some assembly code on the back of an envelope somewhere that was around an order of magnitude faster.

MCUdude commented 8 years ago

the ISR is actually really small, all the "processing" is done inside the millis() routine. Timer0 overflows every 256 clock cycle, which is a lot. Maybe it's a good idea to divide it down to cause fewer interrupts?

nerdralph commented 8 years ago

You're using no divisor/pre-scaler? I had a millis-like ISR down to ~20cycles (+register save restore overhead). Even that would give a 10% slowdown if it was run every 256 cycles. If your "small" millis ISR is adding 40% overhead, that would imply 102 cycles per ISR. Here's some info on how I did it if you want to try doing the ISR in assembler.

http://nerdralph.blogspot.ca/2014/07/writing-avr-interrupt-service-routines.html https://github.com/nerdralph/nerdralph/blob/master/avr/t32isr.S

p.s. for tiny devices, I think the WDT ISR for millis() is the best choice. Frees up the timer to be used for fast PWM.

MCUdude commented 8 years ago

Thanks! I have very little/no experience with assembler, so I'm afraid the code may turn into some magic lines I've not able to debug properly. Right now the ISR itself is just increasing a counter by one. I tried dividing the counter by 64, and it resulted in a PWM frequency of 585.94 MHz, which seems correct (9600000/256)/64 = 585.9375. After dividing the timer by 64, the test code output looks like this, and wiggles between 1.006ms and 1.014ms:

Square wave

MCUdude commented 8 years ago

As far as I know the problem by dividing by 64 is that it's not able to get "accurate" mills readings, since it only counts ~585 times a second. dividing by 8 gives an "update" frequency of 4687.5, and the pulse width of _delay_ms(1) is now 1.038 ms

nerdralph commented 8 years ago

As far as I know the problem by dividing by 64 is that it's not able to get "accurate" mills readings,

The trick is to get the additional timer resolution by reading the counter value, if you really need it. I think it would be hard to find a program that doesn't work if mills() is only accurate to within 2ms. But if you really want it accurate to 1ms, take the high-order bit from the counter value.

MCUdude commented 8 years ago

Getting 1ms accuracy out of mills() isn't actually a goal in itself. The most important task is to get delay() fairly accurate (which we manage to do), and still have a reasonable millis value.

Let's say I stick with a prescaler of 64. Thats ~586 overflows per second at 9.6 MHz. How can I get an accurate millis() reading out of this number (without having to handle floats)? If I multiply it by 2, I get 1172, which is 172 ms off one second.. As you might have understood I want both millis() and _delay_ms() to be fairly accurate, while the default PWM frequency is "high enough" for general purpose stuff like DC motor control and LED fading without flickering.

MCUdude commented 8 years ago

p.s. for tiny devices, I think the WDT ISR for millis() is the best choice. Frees up the timer to be used for fast PWM.

This is an interesting idea. I haven't used the WDT much, but doesn't this timer reset the microcontroller when it's overflowing? Can we just trigger an ISR and increase a counter instead?

Thank you for all your help so far! 😃 Help like this is priceless. Oh! Do you know any good tutorials for learning AVR assembler? I'm mostly a C/C++ guy, but "knowing your roots" is always a good idea when it comes to time critical applications.

nerdralph commented 8 years ago

Let's say I stick with a prescaler of 64. Thats ~586 overflows per second at 9.6 MHz. How can I get an accurate millis() reading out of this number (without having to handle floats)? If I multiply it by 2, I get 1172, which is 172 ms off one second.

You can correct for it in the ISR. One way would be to add 2 most of the time, but some of the time add one.

Doing it in the millis() function, f you want to multiply by 1000/586 = 1.7065, you could use fixed-point math, but I think the smallest code would be to hard-code the multiply with shifts & adds. For example adding 1/2 of the counter value + 1/4 of the counter value is the same as multiplying by 1.75 but with a few simple integer operations. If you add 1/2 + 1/8th + 1/16th, you are multiplying by 1.6875. If you want to get really close, adding another 1/64th is the same as multiplying by 1.703.

nerdralph commented 8 years ago

I haven't used the WDT much, but doesn't this timer reset the microcontroller when it's overflowing? Can we just trigger an ISR and increase a counter instead?

Yes, You can enable the watchdog interrupt instead of watchdog reset. The resolution is lower (15ms is the minimum interval I think), but often that's good enough for most programs.

As for AVR assembler, I can't point to any tutorials, since I just learned it from the AVR assembler instruction set reference, along with disassembling some C code (avr-objdump -d). I learned 8-bit assembler (6502) before I learned C and C++, so picking up AVR assembler was pretty easy.

nerdralph commented 8 years ago

You may also want to look at my picoWiring project. It's been on the back burner, so there's a lot unfinished. I had been planning to implement a lightweight millis() with a WDT ISR in assembler, but got distracted with other things. Having delay() put the AVR in sleep mode is another idea, which would be very useful for battery-powered applications. http://nerdralph.blogspot.ca/2015/10/beta-picowiring-arduino-compatible.html

MCUdude commented 8 years ago

I played around with the WDT today, and I'm able to reduce the code size dramatically by using the WDT instead of timer0. I'll push the new code later today.

I have another question though: the code I forked had the delay function made like this.

void delay(uint16_t ms)
{
    while(ms--)
        _delay_ms(1); 
}

why not write it like this instead?

void delay(uint16_t ms)
{
    _delay_ms(ms); 
}
MCUdude commented 8 years ago

This issue is actually solved 😃