About the Ranade algo in edge...

I hadn't heard of the Ranade algorithm before, but it sounds similar to an NCO (numerically controlled oscillator) -- aka DDS (direct digital synthesis) -- phase accumulator. (For this analogy, I am ignoring the DAC & filter used in DDS to produce a sinewave output, and just looking at the high order bit rollover.) In essence, an NCO with a constant phase increment value produces a constant-frequency pulse train. A "linear sweep" NCO produces a constant acceleration pulse train. Linear sweep can be implemented as in the latest edge code, by bumping up the phase increment register at scheduled intervals. However, finer precision is available if the phase increment register itself is treated as an accumulator and updated by adding an acceleration value to it every cycle. The acceleration register, phase increment (velocity) register, and phase accumulator have to be fairly long words if the result is to be step-accurate over a long move.

I implemented something very like this in the independent-move code (limits.c: indep_increment()) in my branch. My registers are all 32 bits deep, which is probably overkill, and my stepper interrupt service is almost 54 usec compared to less than 30 usec for the grbl v0.8 code. This may not mean much, though, as I just measured that timing yesterday and have not tried to optimize anything.

My point is that if you implement a high-resolution NCO algorithm then I think you could dispense with the Breshenham line algorithm altogether. The structural simplification might be worthwhile. Perhaps jgeisler's simulator would be the perfect tool for testing whether the pure-NCO version is step-accurate for coordinated multiaxis moves.

Cheers, Chuck

Yep, the Pramod Ranade algorithm is pretty much an NCO. I will look into updating all the references to an NCO since it is already an established algorithm vs. something that was in an online magazine.

So, there are two primary reasons why the Bresenham algorithm is wedged into the Ranade/NCO algorithm for pretty mcuh the reasons you state.

First, the Bresenham algorithm is mathematically guaranteed to complete all steps for a multi-axis move, even with integers. This is because the step event counts/increment registers are automatically scaled to never having worry about round-off errors or guess what value a scaling multiplier needs to be to avoid them.

Secondly, in essence the Ranade/NCO algorithm is tracking the phase of the primary scaling axis of the Bresenham algorithm, so it's tracking and incrementing only one value per interrupt. This significantly reduces the computational overhead per interrupt. Most of the time all it has to do is increment one phase accumulator and check if it needs to increment the acceleration (no it doesn't have to do this every cycle, will explain later). So any time there isn't a step event, which is most of the time, it usually completes the interrupt in about 5 usec.

Anyhow, this doesn't mean that we can't make the Bresenham algorithm perform more independently phase correct like the Ranade/NCO algorithm does. All you need to do is track the Bresenham algorithm in a fractional phase, with a fraction that is a power of 2 (lowest cost). Suppose you bitshift multiply by two the phase accumulator increments, bitshift multiply by two the Bresenham triggering step_event_count, and leave everything else the same, you basically force the NCO algorithm to trigger every half period of the driving waveform, rather than a full period. You also shift the Bresenham counters (but still exact) to increment at half the rate, but allow the non-driving axes to trigger at any time. So they become more phase correct. If you keep multiplying the phase accumulator increments in the same way, you eventually approach the NCO algorithm as described.

I haven't installed this idea yet into the stepper algorithm, mainly because something like this will only work at step frequencies below 10kHz. The Arduino is simply not fast enough. I really wish it had double the cycles.

As for acceleration, since Grbl uses a constant acceleration profile, the slope of the velocity is linear with time. So just as long as you accurately trace this line via the midpoint rule with constant velocity segments, the reconstruction of the acceleration is exact. I did this to further reduce the computational overhead cost of having to check and increment the acceleration counter at each interrupt. Now it does them every 100 to 200 interrupts or so, which is tracked by an int8 vs an int32. (At least that's what it's supposed to be).

grbl / grbl

About the Ranade algo in edge... #150