gin66 / FastAccelStepper

A high speed stepper library for Atmega 168/328p (nano), Atmega32u4, Atmega 2560, ESP32, ESP32S2, ESP32S3, ESP32C3, ESP32C6 and Atmel SAM Due
MIT License

Long step times are less accurate than short ones #8

Closed arion-p closed 3 years ago

arion-p commented 3 years ago

If step time is longer than 65536 ticks, the actual step time is not tick accurate. The longer the step time the less accurate it gets. This is because the step time is approximated using a period time and a repetition count. In most cases this is not an issue since the error is small compared to the step time. However in some cases tick accurate steps are needed (e.g. in my case, a barn door star tracker), otherwise the error accumulates and significant drift is introduced.

An alternative solution is to compose the step time using N fixed periods (each 65536-2*MIN_DELTA_TICKS long) plus any remaining ticks. I have implemented this solution for the AVR but have no idea how to do it for esp32 (so I cannot do a PR).

The downside to this is that the maximum step time is somewhat shorter (in the case of 16MHz AVR 1.034 instead of 1.044 seconds)

gin66 commented 3 years ago

Thanks for the proposal. A similar solution was implemented in e.g. version 0.2.3. There I used T_fixed = 16384, with T_var being the remaining value. The number of fixed periods was adjusted to ensure that T_var >= 16384. It worked very well.

If I understand your proposal correctly, then the "remaining ticks" could even go down to e.g. 1, which cannot be handled by interrupts. Example: requested ticks = 2*(65536-2*MIN_DELTA_TICKS)+1. Right?

While adding support for esp32, using just N*T_period led to simpler code and fewer register modifications. Sacrificing cycle accuracy was OK for my use cases.

Anyway, after revisiting the avr and esp32 code, I think it should be easy to re-implement the former scheme for both architectures. I would just change the fixed value to 32768, defined as a preprocessor constant.

Required changes:

Just need some time to implement and test.

And yes, the maximum step time will be halved. Just a different sacrifice......for now.

arion-p commented 3 years ago

OK Let's stick to T_fixed (fixed period ticks = 65536-2*MIN_DELTA_TICKS), T_var (remaining ticks) and N (number of fixed periods) terminology.

> If I understand your proposal correctly, then the "remaining ticks" could even go down to e.g. 1, which cannot be handled by interrupts. Example: requested ticks = 2*(65536-2*MIN_DELTA_TICKS)+1. Right?

No, if T_var is less than MIN_DELTA_TICKS, N is decremented by one and T_var is incremented by T_fixed, so that it can be handled by interrupts.
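A minimal sketch of the borrow rule described above, written as plain C++ so it can be checked on a PC. The value of MIN_DELTA_TICKS (320) and the names `StepTiming` and `splitStepTicks` are assumptions for illustration only; they are not taken from the library or from the fork.

```cpp
#include <cstdint>

// Assumed value for illustration; the real constant is defined by the library.
constexpr uint32_t MIN_DELTA_TICKS = 320;
// Fixed period as defined in the comment above: 65536 - 2*MIN_DELTA_TICKS.
constexpr uint32_t T_FIXED = 65536 - 2 * MIN_DELTA_TICKS;

// Hypothetical result type: N fixed periods plus one variable period.
struct StepTiming {
  uint32_t n_fixed;  // N: number of periods of length T_FIXED
  uint32_t t_var;    // T_var: remaining ticks
};

// Split a requested step time so that N*T_FIXED + T_var == ticks exactly.
StepTiming splitStepTicks(uint32_t ticks) {
  StepTiming s{ticks / T_FIXED, ticks % T_FIXED};
  // A remainder that is too short for an interrupt borrows one fixed period.
  if (s.n_fixed > 0 && s.t_var < MIN_DELTA_TICKS) {
    s.n_fixed -= 1;
    s.t_var += T_FIXED;
  }
  return s;
}
```

With these assumed values, the example request of 2*T_fixed+1 = 129793 ticks becomes N = 1 and T_var = 64897, which still fits a 16-bit timer compare value. Requests shorter than MIN_DELTA_TICKS are left unchanged here and would have to be rejected elsewhere.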

As you noted T_fixed could be set to 32768 instead of (65536-2*MIN_DELTA_TICKS) to remove the need for division/modulo in _addQueueEntry(). Only downside is it further reduces the maximum step time to around 0.5 seconds.
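The power-of-two variant can be sketched the same way. This is an illustration under assumptions, not the library's code: MIN_DELTA_TICKS_ASSUMED (320), `Pow2Split` and `splitPow2` are hypothetical names and values. It shows how a shift and a mask replace the division and modulo.

```cpp
#include <cstdint>

constexpr uint32_t T_FIXED_SHIFT = 15;
constexpr uint32_t T_FIXED_POW2 = 1UL << T_FIXED_SHIFT;  // 32768 ticks
// Assumed value for illustration; the real constant is defined by the library.
constexpr uint32_t MIN_DELTA_TICKS_ASSUMED = 320;

// Hypothetical result type: N fixed periods plus one variable period.
struct Pow2Split {
  uint32_t n_fixed;  // N: number of periods of 32768 ticks
  uint32_t t_var;    // T_var: remaining ticks
};

Pow2Split splitPow2(uint32_t ticks) {
  // Shift and mask give quotient and remainder without a division.
  Pow2Split s{ticks >> T_FIXED_SHIFT, ticks & (T_FIXED_POW2 - 1)};
  // Same borrow rule as before: avoid a remainder too short for an interrupt.
  if (s.n_fixed > 0 && s.t_var < MIN_DELTA_TICKS_ASSUMED) {
    s.n_fixed -= 1;
    s.t_var += T_FIXED_POW2;
  }
  return s;
}
```

For example, 65537 ticks would first split into N = 2 and T_var = 1, and after the borrow into N = 1 and T_var = 32769.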

In any case it would be easy to introduce a preprocessor define to switch between the two options (longer step times or faster _addQueueEntry())

You can check my implementation in my fork's accurate_steps branch (if you haven't already done so)

gin66 commented 3 years ago

Now I have looked at your fork and better understand your proposal.

I would prefer to avoid switching the behavior with a preprocessor define (via compiler flag?), for better maintainability. Based on your code, I can update the esp32 implementation. I hope to do this in the next few days.

gin66 commented 3 years ago

The change has been implemented as 0.8.1, but not yet tagged. This will be done after the tests pass.

gin66 commented 3 years ago

I have checked with AVR and it still works as expected. Unfortunately my oscilloscope is too low cost, so I currently have no idea how to verify that it is cycle accurate. Measuring the runtime for x steps with a period time of e.g. 4096us vs 4097us will yield a time difference of less than 1s per hour.....

devrim-oguz commented 3 years ago

@gin66 what do you need to check? Maybe I can measure it with the oscilloscope at my work, if you tell me exactly what to measure.

gin66 commented 3 years ago

@devrim-oguz Thanks for your offer.

The latest StepperDemo prints a status line like this for each motor:

M1: MANU Curr=10000 QueueEnd=10000/0µs Target=10000 STOP =IDLE

The key information is the 0µs. If it is not equal to 0 (stopped motor), the stepper pulse time (at queue end) is displayed. (Using µ was a bad choice, because e.g. cutecom displays <0xc2><0xb5> instead. In the latest GitHub version this has already been changed to us.)

A test case would be:

M1 A10000 V10000 R10000

With this acceleration, all pulses will be 10000µs apart. My oscilloscope is not able to detect the 10µs pulses at 10ms distance. So either the distance between pulses or the frequency is needed.

The old code would yield:

V10000 => 10000µs
V10001 => 10000µs
V99984 => 99984µs
V100007 => 99984µs
V100008 => 100008µs
V255938 => 255937µs
V260000 => 255937µs
V260001 => 260001µs

Measurements around those values would be of interest, to see whether the value displayed in the console matches the actual stepper pulse frequency.

BTW with Arduino nano (not crosschecked on esp32), I observe some strange effects:

Whatever the root cause of those anomalies, I do not expect them to be related to the cycle accuracy of the step generation. The anomalies do not appear in the PC simulation, so they are not so easy to find and fix.

devrim-oguz commented 3 years ago

My scope isn't super but I think these measurements can be taken. I will try to do it to the best of my understanding. Maybe you can also use a frequency counter (like the frequency measurement mode of multimeters) to measure that signal?

devrim-oguz commented 3 years ago

I noticed that those are quite precise numbers. My oscilloscope may not be able to differentiate between them either.

devrim-oguz commented 3 years ago

Or maybe you can measure the pulse timings with an interrupt on a microcontroller pin.

gin66 commented 3 years ago

A frequency counter would be even better, but unfortunately I do not have one.

gin66 commented 3 years ago

For the anomalies, the root cause is IMHO here:

void RampGenerator::setSpeed(uint32_t min_step_us) {
  _config.min_travel_ticks = min_step_us * (TICKS_PER_S / 1000L) / 1000L;
  update_ramp_steps();
}

If the speed is around 260000, the multiplication by TICKS_PER_S/1000L = 16000 overflows 32 bits. The PC version perhaps extends to 64 bits temporarily. The overflow happens from 268436 onwards.
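The overflow can be reproduced on a PC with a small sketch. The function names below are hypothetical, TICKS_PER_S is assumed to be 16000000 (16 MHz AVR), and the 64-bit intermediate is only one possible remedy; the thread does not say how version 0.8.3 actually fixed it.

```cpp
#include <cstdint>

constexpr uint32_t TICKS_PER_S = 16000000UL;          // assumed: 16 MHz AVR
constexpr uint32_t TICKS_PER_MS = TICKS_PER_S / 1000;  // 16000

// Mimics the 32-bit arithmetic on the AVR: the product wraps modulo 2^32.
uint32_t usToTicks32(uint32_t step_us) {
  uint32_t product = step_us * TICKS_PER_MS;  // wraps for step_us >= 268436
  return product / 1000;
}

// One possible remedy: widen the intermediate product to 64 bits.
uint32_t usToTicksWide(uint32_t step_us) {
  uint64_t product = static_cast<uint64_t>(step_us) * TICKS_PER_MS;
  return static_cast<uint32_t>(product / 1000);
}
```

Here `usToTicks32(268435)` still gives 4294960 ticks, while `usToTicks32(268436)` wraps around to 8; `usToTicksWide(268436)` gives the intended 4294976. On typical 64-bit Linux or macOS hosts `long` is 64 bits wide, so an expression using `1000L` is evaluated in 64 bits there, which would be consistent with the anomaly not showing up in the PC simulation.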

gin66 commented 3 years ago

Fixed in version 0.8.3

gin66 commented 3 years ago

Just finished an experiment with two NEMA-17 steppers using StepperDemo:

M1 A1000 V8000 R1000 K
M2 A1000 V8000 R1000 K

This produces a constant sound from the motors. Then increasing M2:

M2 A1000 V8001 R1000 K
M2 A1000 V8002 R1000 K
M2 A1000 V8003 R1000 K
M2 A1000 V8004 R1000 K

The constant sound changes and starts to get an increasing modulation frequency.

The modulation sounds similar for these two speeds, which are off by +/-10us:

M2 A1000 V8010 R1000 K
M2 A1000 V7990 R1000 K

=> From this I conclude that cycle accuracy (at least for 8000us aka 96000 ticks) works
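One way to read this experiment is as a beat between the two step frequencies. That interpretation is an assumption, and `beatFrequencyHz` is a hypothetical helper, but it gives a rough idea of the modulation rates to expect:

```cpp
#include <cmath>

// Difference between the step frequencies of two motors, in Hz,
// given their step periods in microseconds.
double beatFrequencyHz(double period_a_us, double period_b_us) {
  return std::fabs(1e6 / period_a_us - 1e6 / period_b_us);
}
```

For 8000us vs 8001us this gives about 0.0156 Hz, i.e. one modulation cycle roughly every 64 s, and the rate grows with each further microsecond of difference. For 8010us and 7990us the values are about 0.1561 Hz and 0.1564 Hz, which would explain why those two settings sound similar.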

devrim-oguz commented 3 years ago

From what my oscilloscope can measure, pulses are 30 ms apart when using M1 A10000 V10000 R10000 and M1 A10000 V10001 R10000. It might be missing pulses, though. Better to use an interrupt on an MCU to measure this.

devrim-oguz commented 3 years ago

From measuring with an interrupt pin on an Arduino UNO:

V10000 => ~9960-9964µs
V10001 => ~9964µs
V99984 => ~99652µs
V100007 => ~99676µs
V100008 => ~99676-99680µs
V255938 => ~255096-255100µs
V260000 => ~259144-259148µs
V260001 => ~259144-259148µs

I used the code:

volatile uint32_t duration;  // written in the ISR, read in loop()

void pulseMeasure() {
  static uint32_t lastMeasurement = 0;  // timestamp of the previous rising edge
  uint32_t now = micros();              // read the clock only once per edge
  duration = now - lastMeasurement;
  lastMeasurement = now;
}

void setup() {
  pinMode(2, INPUT);
  Serial.begin(115200);
  attachInterrupt(0, pulseMeasure, RISING);  // interrupt 0 is pin 2 on the UNO
}

void loop() {
  Serial.print("Pulse Duration: ");
  Serial.println(duration);
  delay(700);
}

The library was also running on an Arduino UNO. Given the 4us resolution of micros() on the Arduino, I think the library works accurately. But I think the best test can come from that barn door star tracker 🤷‍♂️

devrim-oguz commented 3 years ago

From an ESP32, although the interrupts are delayed, I got a much better measurement:

V10000 => ~10005-10006µs
V10001 => ~10006-10007µs
V99984 => ~100045-100046µs
V100007 => ~100068-100069µs
V100008 => ~100069-100070µs
V255938 => ~256097-256098µs
V260000 => ~260161-260162µs
V260001 => ~260163-260164µs

Again, the stepper signal is generated by an Arduino UNO.

I think your new code works accurately. However, it might be issuing steps slightly too early, given that the interrupt response of the Arduino UNO is much faster than that of the ESP32. If the period already falls below the desired time when measured with the Arduino UNO's interrupt, the steps might actually be shorter than expected, but they scale fairly linearly.

gin66 commented 3 years ago

Thanks for your strong support with these measurements.

For esp32, I plan to change the high time from a fixed 10us towards a 50% duty cycle at the currently set speed. Then I can do measurements with my cheap oscilloscope, too. Anyway, your conclusion that the new code works accurately confirms that the code rework was successful.

For avr: apparently the code rework was successful with regard to the original problem from arion-p, too. The resolution is much better. Currently I have no explanation for why the pulses are approx. 0.4% too fast. Could this be due to the oscillator tolerance of the two micros used? Still, I think 0.4% is too much for that to be the root cause.

devrim-oguz commented 3 years ago

You can try to do better measurements by manually using a timer of the Arduino to measure the time between pulses. This was just quick and dirty. But I also think that the steps are coming too early for this to be just a measurement error. I will try to do better measurements if I find the time to do so.

devrim-oguz commented 3 years ago

Hey @gin66 I'm sorry, I cross checked the timings by switching the codes uploaded to each of the Arduinos, now the other one shows that steps are 0.4% too slow. I think this is caused by one of my Arduinos being a cheapass clone with a ceramic oscillator. Nevertheless, I got much more precise measurements of your library now, averaged about 100 samples in the nanoseconds range using the Arduino Timer1 (Please ignore the frequency drift caused by the cheapass Arduino);

V10000 => ~10029.95µs
V10001 => ~10030.90µs
V99984 => ~100307.58µs
V100007 => ~100327.50µs
V100008 => ~100328.60µs
V255938 => ~256762.10µs
V260000 => ~260837.3µs
V260001 => ~260838.7µs

Here is the code I used for Arduino UNO:

#define AVERAGE_SAMPLES 100
#define INTERRUPT_PIN 2

volatile uint32_t totalTime = 0;

void setup() {
  noInterrupts();             // disable all interrupts
  TCCR1A = 0;
  TCCR1B = 0;                 // timer registers cleanup
  TCCR1C = 0;
  TCNT1  = 0;
  TCCR1B |= (1 << CS10);      // 1 prescaler 
  TIMSK1 |= (1 << TOIE1);     // enable timer overflow interrupt
  TCNT1 = 0;                  // set timer inital value to 0
  interrupts();               // enable all interrupts

  Serial.begin(115200);

  pinMode(INTERRUPT_PIN, INPUT);
  attachInterrupt(digitalPinToInterrupt(INTERRUPT_PIN), pulseMeasure, FALLING);
}

volatile uint32_t timeDifference = 0;  // written in the ISR, read in loop()

void loop() {

  double averageSum = 0;

  for( int i = 0; i < AVERAGE_SAMPLES; i++ ) {
    averageSum += (double)timeDifference * 62.5;
    delay(30);
  }

  double nanosecondsDuration = ( averageSum / AVERAGE_SAMPLES );

  Serial.print("Averaged Pulse Period over ");
  Serial.print( AVERAGE_SAMPLES );
  Serial.print(" Measurements: ");
  Serial.print(nanosecondsDuration, 0);
  Serial.print(" nanoseconds [+- 62.5 ns] -> (");
  Serial.print((nanosecondsDuration / 1000), 3);
  Serial.println(" us)");
  delay(700);
}

ISR(TIMER1_OVF_vect) {   // runs on every Timer/Counter1 overflow
  totalTime += 65536UL;  // the 16-bit counter wraps every 2^16 ticks
}

inline uint32_t getTicks() {
  return totalTime + TCNT1;
}

void pulseMeasure() {
  static uint32_t lastTicks = getTicks();  // timestamp of the previous falling edge
  uint32_t now = getTicks();               // read the timer only once per edge
  timeDifference = now - lastTicks;
  lastTicks = now;
}

You can do similar measurements using this in the future. It is quite precise, in the order of nanoseconds (depending on your crystal). The value stabilizes after the first few averages.

Both the measurement code and the stepper code run on Arduino UNOs.
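A possible refinement, sketched here as host-testable C++ rather than as part of the sketch above: when a 16-bit counter is extended to 32 bits, an overflow that has happened but has not yet been counted by the overflow interrupt can make a timestamp 65536 ticks too small. `extendTimestamp` and `ticksToNanoseconds` are hypothetical helpers. On the AVR the three inputs would come from an overflow counter, TCNT1 and the TOV1 flag, and unlike the sketch above this version counts overflows rather than accumulating ticks.

```cpp
#include <cstdint>

// overflow_count:   overflows already counted by the overflow interrupt
// counter:          16-bit timer value, read BEFORE sampling the overflow flag
// overflow_pending: overflow flag is set, i.e. one overflow is not counted yet
uint32_t extendTimestamp(uint32_t overflow_count, uint16_t counter,
                         bool overflow_pending) {
  // A pending overflow together with a small counter value means the counter
  // had already wrapped when it was read, so that overflow must be included.
  if (overflow_pending && counter < 0x8000u) {
    overflow_count += 1;
  }
  return (overflow_count << 16) | counter;
}

// At 16 MHz with prescaler 1, one timer tick is 62.5 ns.
constexpr double NS_PER_TICK_16MHZ = 62.5;

double ticksToNanoseconds(uint32_t ticks) {
  return ticks * NS_PER_TICK_16MHZ;
}
```

For instance, with two counted overflows, a counter value of 5 and the flag set, the result is 3*65536+5 = 196613 ticks instead of 131077.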

gin66 commented 3 years ago

Hi @devrim-oguz. Again thanks for your great support and effort.

I have read that the 328p can be configured to use either the internal RC oscillator or an external crystal. The configuration is done by fuses, which can be read/written only by a programmer device. I have not seen that those values can be read via SW code. Anyway, it could be that one of your devices is configured to use the RC oscillator and not the crystal. This configuration seems to be the default of a brand new 328p IC.

Anyway, what I really like about your new measurement data is that a change of 1us in speed can be seen reliably in the measurement data as an approx. 1us difference, too. So the code really allows cycle accurate changes of pulse time.

devrim-oguz commented 3 years ago

No, both of mine are running from an external clock input, but I discovered that one of them uses a 16 MHz crystal whereas the other uses a 16 MHz ceramic oscillator in place of it. Glad I could help.

devrim-oguz commented 3 years ago

You are right about the fuses, I didn't change any oscillator settings. I only used internal timers to measure a precise time of the signal. Both of those boards were default arduinos but one was cheaply manufactured. No settings were changed in software.

gin66 commented 3 years ago

The latest version outputs step pulses on esp32 with a 50% duty cycle, or a high pulse of at most ~2ms. Now I can use my oscilloscope. Looks really good :-)

gin66 commented 3 years ago

stale issue