Traumflug / Teacup_Firmware

Firmware for RepRap and other 3D printers
http://forums.reprap.org/read.php?147
GNU General Public License v2.0

ACCELERATION_TEMPORAL #233

Open Wurstnase opened 8 years ago

Wurstnase commented 8 years ago

Hi all,

I'm currently working on acceleration ramps for ACCELERATION_TEMPORAL, and before I forget this I'm writing it down.

I found an issue in timer_set(). The part (current_counter - match_last_counter) can be negative.

So I changed this part on STM32F411 to:

      int32_t check_timer;
      check_timer = (TIM5->CNT - TIM5->CCR1) + 100;

      if (check_timer > delay) {
        return 1;
      }

The LPC should have something like:

      int32_t check_timer;
      check_timer = (LPC_TMR32B0->TC - LPC_TMR32B0->MR0) + 100;

On the simulator I had no issues, but maybe the AVR also needs a change.

phord commented 8 years ago

It seems like this would be fixed more easily by making this change:

uint32_t check_timer;
check_timer = (uint32_t)TIM5->CNT - (uint32_t)TIM5->CCR1; 

I don't know the size of TIM5->CNT and TIM5->CCR1, though. Presumably they are already unsigned. Maybe the casts forcing them to uint32_t are unnecessary.

This works because when you subtract a larger number from a smaller one in an unsigned context, the result wraps around. We discard the overflow by using unsigned holders and are left with the correct difference.

So, to use an 8-bit example for simplicity, suppose we have

#include <stdio.h>
#include <stdint.h>

uint8_t a = 250;
uint8_t b = 3;

int main(void) {
    printf("%d  %u  %u\n", b - a, (unsigned)(b - a), (uint8_t)(b - a));
    return 0;
}

The output is this:

-247  4294967049  9

9 is the answer we want, which is the same as 3 - (250-256) = 3 + (-250 + 256) = 3 + 6.

Wurstnase commented 8 years ago

No, just test it. This won't work. In some cases CCR1 is so much bigger that (uint32_t)CNT - (uint32_t)CCR1 + 100 > delay, but in those cases we don't have a short move.

Wurstnase commented 8 years ago

I need some time for cleaning the mess on my desktop. But I will upload the code soon. :)

[screenshots: 2016-08-01 18:48:50 and 18:49:16]

Traumflug commented 8 years ago

Hey, that's ACC_TEMPORAL with actual acceleration! Excellent!

phord commented 8 years ago

I don't understand your response. If CCR1 is the time of the previous step, and CNT is the current timer, then (unsigned)(CNT - CCR1) should be the time since the last step. The only exception is if the timer has looped all the way around and it's actually wrong by some multiple of the wraparound size. In that case, having a negative result is not helping; it's merely hiding the problem.

Specifically all I am suggesting is that check_timer should be unsigned (non-negative). This makes sense because the time since the last step can never be negative. Being unsigned gives us the extra help from the compiler to handle the wrap-around for us.

Wurstnase commented 8 years ago

Yes, you are right. I have some ideas. The problem seems to occur only at the first step. Thanks for pointing me in the right direction!

Wurstnase commented 8 years ago

Looks like the issue is only on my STM. When CCR1 is <0xFFFFFFFF then I always get a short move. When I setup the CCR1 in the timer_init() to CCR1 = 0xFFFFFFFF everything is fine.

phord commented 8 years ago

How is CCR1 > 0xFFFFFFFF? Is it a 64-bit variable?

It sounds like you have found the problem there; after looking at the code I'm not sure what your change here proposes to do. It seems to just move this math into a temporary variable. What am I missing?

Wurstnase commented 8 years ago

How is CCR1 > 0xFFFFFFFF? Is it a 64-bit variable?

Ah, sorry. Corrected it to '<'.

Wurstnase commented 8 years ago

I opened a new branch: https://github.com/Traumflug/Teacup_Firmware/tree/acc_temporal

This is currently based on experimental plus some STM32 and other stuff. I will rebase it later onto the current experimental.

Because the current code is a bit slow, I want to change the dda->c calculations like acceleration_ramping does, only every 1-2 ms. Achievable step rates are around 50 kHz (maybe 60 kHz) on my STM at 96 MHz. Not that good...

Wurstnase commented 8 years ago

Uh... it's not as slow as I thought. The issue is different.

dda->step_interval[axis_to_step] = 
  dda->step_interval[axis_to_step] - (dda->step_interval[axis_to_step] * 2) \
  / ((4 * (dda->delta[axis_to_step] - move_state.steps[axis_to_step])) + 1);

This part can't get smaller than 2693 with 1280 steps/mm, acc 2000 and f_cpu 96MHz. The step_interval of the first step is c0 * 0.676. (Equation 15)

c0 = 114683
c1 = 68810
...
c1346 = 2694
c1347 = 2693
c1348 = 2693
...

because:

c1347 * 2 / (4 * 1347 + 1) = 0!
2693 * 2 = 5386
4 * 1347 + 1 = 5389

Traumflug commented 8 years ago

First, if you're concerned about step rates, you should do all these calculations in dda_clock(). dda_step() cares about creating the next step according to given speeds, not about what this speed should be.

Regarding acceleration calculation: how about simple school physics:

speed = acceleration * time

and

step delay = 1 / speed

Acceleration is given, speed is known and an approximate time can be found in move_state.last_time. That's the time of the most recent step on any axis. Hopefully it's precise enough for acceleration calculations; there could be issues at very low speeds / at the start of acceleration.

Constant speed area is reached when acceleration calculation results in a value smaller than dda->c_min.

For deceleration it's the same calculation as with acceleration, but with time = move_duration - move_state.last_time. move_duration isn't currently stored in DDA, but that can change, of course. It's also calculated without acceleration in mind, which would have to change, too.

The remaining challenge is to get units right. We have speed in mm/min, acceleration in mm/s² and time in 1/F_CPU. And each intermediate result has to fit into 32 bits.

With all this, speed of the print head is known. Speed for single axes can be found by scaling. Conveniently there's muldiv(), which isn't cheap (some 750 clocks), but at least safe against intermediate overflows.

phord commented 8 years ago

If it helps, I don't think you need to consider acceleration in move_duration. This is specifically because accel and decel are symmetrical. Consider the graph of speed vs. time. Speed increases during ramp_up, cruises at c_min for some cruise time, and then decreases during ramp_down. Symbolically we say this:

Ts = ramp_up time
Tc = cruise time
Td = time to begin decelerating (Ts + Tc)
Te = total move duration

For example, let's say Ts is 2 seconds and Tc is 5 seconds. This means we accelerate for two seconds before reaching cruise speed, then we cruise for 5 seconds. Since we accelerate and decelerate symmetrically, we find that Ts is also the amount of time it takes to decelerate to zero. So,

Ts = 2
Tc = 5
Td = 7 (calculated "move_duration")
Te = 9 (actual move duration)

Te = Ts + Tc + Ts = 2*Ts + Tc

Presumably move_duration is calculated as if we had no accel or decel. This assumes we begin our move at c_min (target speed) and maintain that for the duration of the move. It turns out that

move_duration == Td == Ts + Tc
Also: Ts = Te - Td

This is because if you add the speeds during accel and decel at mirrored moments in time, you will find they always add up to our maximum speed (1/c_min). Mathematically, I can say this:

Given Vc = v(Ts) = v(Td) (cruise velocity),
     for all 0 <= x <= Ts:  v(x) + v(Td + x) = Vc

I can draw pictures if it is still confusing. It really confused me when I first read this concept. But I think it can be very useful when we are doing time-based acceleration profiles.

Wurstnase commented 8 years ago

Traumflug:

you should do all these calculations in dda_clock()

Wurstnase:

only every 1-2ms

Sure :)

School physics looks simple, but currently I can't see any benefit:

        // accelerating
        // speed = acceleration * time
        // dda->step_interval = 1 / speed
        // move_state.last_time in ticks
        // time = move_state.last_time / F_CPU [s]
        // acceleration [mm/s²]
        // speed = acceleration * move_state.last_time / F_CPU -> [mm/s]
        // speed_step = speed * STEPS_PER_M_X -> [(steps * mm) / (m * s)]
        // dda->step_interval = 1 / speed_step -> [(m * s) / (steps * mm)]
        // dda->step_interval = 1000 / speed_step -> [(1000mm / m) * (m * s) / (steps * mm)] -> [s / steps]
        // dda->step_interval = F_CPU * 1000 / speed_step -> [ticks / steps]
        // dda->step_interval = F_CPU * 1000 / (speed * STEPS_PER_M_X)
        // dda->step_interval = F_CPU * 1000 / (acceleration * move_state.last_time * STEPS_PER_M_X / F_CPU)

        dda->step_interval[axis_to_step] = muldiv(F_CPU, 1000, muldiv(ACCELERATION * STEPS_PER_M_X, move_state.last_time, F_CPU));

vs:

dda->step_interval[axis_to_step] = 
  dda->step_interval[axis_to_step] - (dda->step_interval[axis_to_step] * 2)
  / ((4 * (dda->delta[axis_to_step] - move_state.steps[axis_to_step])) + 1);

Wurstnase commented 8 years ago

@me: not that big an issue. Precalculate F_CPU * F_CPU / (ACCELERATION * STEPS_PER_M_X) and then it's just:

static const axes_uint32_t PROGMEM temporal_const_P = {
  (uint32_t)((double)F_CPU * F_CPU * 1000 / ((double)(STEPS_PER_M_X) * ACCELERATION)),
  ...
}

dda->step_interval[axis_to_step] = pgm_read_dword(&temporal_const_P[axis_to_step]) 
                                   / move_state.last_time;

For the non-fast axes we can precalculate a factor of delta[axis] / delta_total.

For low acceleration/step rates and a high F_CPU we should move the 1000 out of the precalculation: 96 MHz^2 * 1000 / 40,000 / 500 > 32 bit.

Traumflug commented 8 years ago
static const axes_uint32_t PROGMEM temporal_const_P = {
  (uint32_t)((double)F_CPU * F_CPU * 1000 / ((double)(STEPS_PER_M_X) * ACCELERATION)),
  ...
}

Unless I'm mistaken, you apply the same acceleration to each individual axis here. As we know, participating axes have to move at different speeds depending on movement direction. Giving all of them the same acceleration means acceleration on slow axes is finished earlier than on fast axes, which puts them out of sync.

All axes have to finish acceleration at the same time to keep synchronisation (and movement direction). As far as my own considerations go, this requires calculating acceleration along the movement direction first, then deriving the individual axes from there. Along the movement direction there is no explicit STEPS_PER_M; it should be fine to simply assume one (e.g. 1000 steps/mm). The assumption cancels out when scaling to the individual axes.

Wurstnase commented 8 years ago

Unless I'm mistaken, you apply the same acceleration to each individual axis here.

The math in it is really simple. Take the fast axis. Accelerate it by its constant. Accelerate every other axis by a factor of delta[axis]/delta_total times the constant of the fast axis.

So acceleration of fast axis is maybe 2000. delta_total = step_count of fast axis. Maybe 1000. delta_nonfast_axis = 200. So accelerate this axis with 200/1000 * 2000 = 400. Simple, isn't it?

@phord I need to read your thesis again. But can this handle lookahead with different acceleration and deceleration timings?

phord commented 8 years ago

I think it can, but I haven't tried it at all. I'm hoping to use it for non-constant acceleration someday. Ideas from equation 8 in this paper on exponential motion planning.

Traumflug commented 8 years ago

@Wurstnase

Accelerate other axis by a factor of delta[axis]/delta_total * constant of fast axis.

I see. You want to derive the other axes from the fastest one, similar to what ACC_RAMPING does. Sounds good!

@phord Ah, this paper from Bath :-) Yes, sounds very plausible. The curves in Fig. 5 lower half look a bit scary (sharp velocity corners!), but I think the idea is that if one adds two overlapping movements up, one gets a flat curve. A similar paper, way more detailed, is here: http://www.dct.tue.nl/New/Lambrechts/DCT_2003_18.pdf

One thing I miss with all these trajectory planning ideas: how do they do error correction? AFAIK, LinuxCNC does error correction with PID (like temperature control), even for steppers. PID can take action only after an error has happened. The Bath paper says nothing about error correction. Teacup currently does error correction at movement endpoints (aka "don't stop moving unless all steps are done"). These advanced planning strategies make these endpoints go away.

We don't need V(t), but Vx(x) and Vy(y), after all.

phord commented 8 years ago

I didn't mean to start this discussion here. I only found it interesting in the context of Td = dx / Vmax despite acceleration, which I first learned from studying this paper.

The way PID stepper drivers work is that they accumulate motion piecewise by simple addition over time. If they find their step came too late to hit the calculated movement time, then this is an error which is compensated for by the PID loop. The compensation comes in the form of decreasing the next step delta, causing the acceleration to increase.

So you choose a predicted first step time and then you periodically adjust acceleration (linearly), accumulate velocity (geometrically), and accumulate position, the sum of the discrete velocity averages. At some point you will find that your position "steps", from rounding down to X to rounding up to X+1. This is when you should have stepped. If you stepped 300us ago or 500us from now, then this is your error value.

I think the paper is agnostic about error feedback. Maybe it is measured with a servo pot or some other positional indicator, or maybe it is calculated like the stepper-PID idea. The primary idea in the paper is about constraining jerk to a continuous function across the whole movement in order to minimize actual errors.

The theory about movement joining of symmetrical accel/decel phases still interests me, but you are right that it is further complicated by the separation of axes. It would be easier to do with ACCELERATION_TEMPORAL where we can use Vx(t) and Vy(t), but this feature has been abandoned for a long time. It's interesting to see it come up again.

Traumflug commented 8 years ago

Learned something today, from the Lambrechts paper (the same one I linked above already). This is from page 26 [figure: integrator chain]. To the left is a "generator", which outputs 1, 0 or -1. This "signal" is stuffed into the first integrator at discrete time steps, e.g. every 2 milliseconds. The output of that integrator is the integral of the original signal, which happens to be jerk.

This jerk signal (now any value in the range [-1, 1]) is put into another integrator, which integrates up again. This gives acceleration. Yet another integrator in the chain gives velocity. The last integrator gives position, and that integrator happens to be not some mathematical formula, but our stepper motor.

The non-trivial part here is to calculate the original signal. There's a matlab script at the end of the paper to calculate the signal in multiples of an adjustable time step. This script even takes error correction into account. But that's not my point here.

My point is, if two of these integrators get chopped off, we have second order acceleration, which is what Teacup currently does. Then there's a simple acceleration signal: 1 = acceleration, 0 = keep speed, -1 = decelerate. When to give which signal is easy to calculate.

Now, "forward Euler discrete time integrators" sounds impressive, doesn't it? Impressive name, simple math :-) It's as simple as velocity = velocity + signal, done at each time step. "At each time step" = once in dda_clock().

The magnitude of the signal is simple, too: it's the speed change happening from one time step to another: signal = dv = acceleration * 2ms. A constant value.

To sum up: instead of doing complex math calculations, one can simply add up velocities at each time step. Only for the adventurer: third order acceleration (constant jerk) would mean two such additions at runtime, fourth order acceleration three such additions. In any case: all the complex math is done at preparation time.

phord commented 8 years ago

Nice! I tried to write up some similar descriptions before. The math is so simple but it still only gives us V(t). We still must calculate 1/V(t) to be useful. But maybe we can afford to do that every 2ms, huh?

Traumflug commented 8 years ago

2 ms = 32'000 clocks on the slowest controller. Perhaps 10'000 clocks taking the time for step generation and G-code parsing into account. One 32-bit division = 600 clocks, IIRC. Much less on Wurstnase's F4, of course. And with some luck this 1/V and scaling to individual axes can be done with one division.

phord commented 8 years ago

To take this two steps further:

  1. It would be trivial to extend this to permit variable acceleration. Simple linear acceleration gives smoother velocity curves.
  2. A linear approximation can be used to map the exponential graph from the Bath paper. I've bashed some code to experiment with this off and on. In fact, that's what I was doing when I discovered #215.

phord commented 8 years ago

Sorry -- Didn't mean to close it. But I wonder, @Wurstnase, should we close it?

Traumflug commented 8 years ago

To keep my head from exploding I started writing a wiki article: http://www.reprap-diy.com/printer_controller_trajectory_planning

Traumflug commented 8 years ago

should we close it?

Closing? For my part I just warmed up with the topic.

Closing can be done when acceleration with ACC_TEMPORAL is working for all axes and some prints or PCB millings have been done. Ideally with lookahead already.

Wurstnase commented 8 years ago

This part of Teacup could become a kind of unique feature in the current RepRap community. I just started with it. A first quick test with temporal/school physics/dda_clock, only for acceleration, works great.

I really fell in love with that part of the code. Later I could add, for the F4, PWM for each stepper. An interrupt could count the steps. With that I guess I can reach much faster rates, because the interrupt will be very simple: just count the PWM pulses and stop when finished.

phord commented 8 years ago

Oh, I didn't realize this is a general "Implement ACCELERATION_TEMPORAL" issue. I thought the first comment about a suspected bug in timer was the real issue.

Three items come to mind about your wiki.

  1. You mention calculating new math "every other microsecond", but I think you mean millisecond.
  2. It is not necessary to calculate four orders of equations at runtime in order to achieve the same result. The Bath paper looks at this specifically. The key feature of that article is that a formula is proposed which results in smooth f, f' and f'' (velocity, acceleration, jerk), which provides direct formulae for all three, and which bounds each function to desired limits. Thus using only the velocity formula from that paper and ensuring the alpha value is chosen to respect the limits on the other two graphs is enough to get a smooth curve on all three levels.
  3. It is reasonable and useful to combine both Non-linear acceleration and Velocity Calculation at Discrete Time Intervals to achieve a smoother hybrid. Let me explain:

Velocity Calculation ... works by calculating a new step-time (velocity) for the current desired velocity every 2ms. Let's call this target. Ideally this is the average speed we want to achieve for the 2ms period, so perhaps it is based on V(now+1ms) instead of simply V(now).

target = V(now + 1ms)

In either case, we use this step interval for 2ms until we run again to calculate a new value for target. Suppose instead we calculate two values called target and next.

target = V(now)
next = V(now + 2ms)

Now next holds the velocity we want to use at the end of this interval. Rather than using a fixed step value during this interval, we can adjust the step value REPRAP-style by simply adding some constant after each step to target so we end the interval at target = next. We can calculate this constant (the slope from target to next over 2ms) easily enough.

Importantly it does not overly complicate our 2ms math by requiring us to do twice as much work. We only need to calculate both target and next at the start of our move. And then each subsequent 2ms calculation becomes simply this:

target = next;   // re-use previously calculated value
next = V(now + 2ms)

Except we also have to calculate our slope:

steps_this_interval = 2ms / (( next + target ) / 2)
slope_this_interval = (next - target) / steps_this_interval

But I think this might simplify further:

slope_this_interval = (next - target) / steps_this_interval
                    = (next - target) / (2ms / ((next + target)/2))
                    = (next - target) / 2ms * ((next + target)/2)
                    = (next - target) * (next + target)/2 / 2ms
                    = (next*next - target*target) / 4ms

Weird.

Conveniently this model (using target and next) fits well into a method which uses a linear-approximation of some complex velocity curve via a lookup table.

Traumflug commented 8 years ago

I think you mean millisecond

Thanks, fixed. BTW., feel invited to edit this wiki yourself.

The key feature of that article is that a formula is proposed which results in smooth f, f' and f'' (velocity, acceleration, jerk), which provides direct formulae for all three, and which bounds each function to desired limits.

That's true and also the reason why I don't consider this superior to the approach in the paper from Eindhoven. It's maybe not even implementable (t^4, cube roots) within 2 milliseconds on our limited hardware. Worse, this rather complex formula (it's the third line of formula (2) on the first page, isn't it?) has to be calculated on each time step, not only once before starting the movement.

Paul Lambrechts' approach puts all the complexity into pre-calculations, but at runtime it's as simple as adding up two numbers. That's even a bit easier than @Wurstnase's approach above and triggers the "it's sane, because it's simple" detector here. A stroke of genius similar to Mr. Bresenham's.

The lesson I learn from the Bath paper is that one can overlap movements, even if they do S-shaped accelerations.

phord commented 8 years ago

Thanks, fixed. BTW., feel invited to edit this wiki yourself.

Yes, I was going to, but I was too busy to wait for the registration cycle to complete. :-]

Maybe not even implementable (t ^^4, cubic root) to be calculated within 2 milliseconds on our limited hardware.

Right. But as the paper points out, the graphs are amenable to linear approximation which should preserve the smoothing features as well. I have generated a linear approximation table of the normalized velocity data and I've done some work on velocity calculations at run time. But I keep getting interrupted by $dayjob.

The lesson I learn from the Bath paper is that one can overlap movements, even if they do S-shaped accelerations.

That was the interesting part to me, too, even though the paper mentions that it is theoretical and untested. In fact, I stumbled upon this paper while looking for a reasonable lookahead algorithm back before you got one working. :-)

After I read the paper a few times and really grokked its contents, I became more interested in the exponential function itself. But I keep getting distracted. Maybe it's not worth pursuing, but it gives me something to play with when I can't sleep.

Wurstnase commented 8 years ago

Just a short note on the first post. Hardware debugging >> all.

A short delay is everything < 160, not 100. I made some measurements and it took 130 to 152 cycles to finish the interrupt on my STM32.

Also, the very first interrupt will occur at 0! So setting up the capture-compare-register at init to 0xffffffff solves this. Without it, the ccr > cnt and cnt - ccr becomes negative and the problem begins.

So first part is finally solved!

Traumflug commented 8 years ago

Also, the very first interrupt will occur at 0! So setting up the capture-compare-register at init to 0xffffffff solves this.

Excellent catch!

I made some measurements and it took 130 to 152 cycles to finish the interrupt on my STM32.

This makes me wonder a bit what happens in this time. The seemingly compute-intensive part, finding out which axis to step next, is done before setting the timer. Is testing four 32-bit values against zero that much work?

Wurstnase commented 8 years ago

I start counting in timer_set() just before the if statement and stop counting in the IRQHandler just after queue_step().

Wurstnase commented 8 years ago

Is testing four 32-bit values against zero that much work?

Well, I'm a little bit unsure from where to where I should measure. The part between while(set_timer()) and unstep() is only 41 to 82 cycles.

Dave3891 commented 8 years ago

Has anyone looked at how TinyG is doing acceleration? They claim 3rd order with G1 and 6th order with G2 https://github.com/synthetos/TinyG/wiki/Jerk-Controlled-Motion-Explained

Wurstnase commented 7 years ago

While looking into the temporal code again, because I want to work on this TODO:

    // TODO: instead of calculating c_min directly, it's probably more simple
    //       to calculate (maximum) move_duration for each axis, like done for
    //       ACCELERATION_TEMPORAL above. This should make re-calculating the
    //       allowed F easier.

I have some issues to understand this part:

      uint32_t move_duration, md_candidate;

      move_duration = distance * ((60 * F_CPU) / (dda->endpoint.F * 1000UL));
      for (i = X; i < AXIS_COUNT; i++) {
        md_candidate = dda->delta[i] * ((60 * F_CPU) /
                       (pgm_read_dword(&maximum_feedrate_P[i]) * 1000UL));
        if (md_candidate > move_duration)
          move_duration = md_candidate;
      }

move_duration is in ticks, but md_candidate is something in 1/µm? distance is in delta_um-units. dda->delta[i] in steps. Should this be delta_um[i]?

Later we will calculate the step_interval with:

      for (i = X; i < AXIS_COUNT; i++) {
        dda->step_interval[i] = 0xFFFFFFFF;
        if (dda->delta[i])
          dda->step_interval[i] = move_duration / dda->delta[i];
      }

move_duration is CPU-ticks and dda->delta is in steps. So we get ticks per step. Which looks ok.

Wurstnase commented 7 years ago

Verified. Fix comes soon.

Wurstnase commented 7 years ago

@Traumflug you can pick https://github.com/Traumflug/Teacup_Firmware/commit/ae2604cccd6afd and maybe also https://github.com/Traumflug/Teacup_Firmware/commit/848825b35. Ramps for temporal are not yet fully working.

phord commented 7 years ago

I just noticed your TinyG mention. Their code relies on heavy math libraries and floating point calculations. It must take many millis to plan each movement!

I played with implementing a similar method but in simpler integer math a while back. I might revisit it someday.

Teacup does pretty impressive things with constant-acceleration already, but jerk-limited variable acceleration might make things a little smoother. With lookahead enabled, however, you will primarily notice improvements only when Teacup starts or stops to/from Velocity=0. It is possible that we could move the print head faster if we learn to limit for Jerk independently from Acceleration. But I expect the improvement would be minuscule.

Wurstnase commented 7 years ago

When you have a lot of time while traveling to work, you can get nice ideas.

I think I can introduce ramps and lookahead soon to acceleration temporal. :)

Traumflug commented 7 years ago

Lookahead for ACCELERATION_TEMPORAL by overlapping movements, please. This way we get rid of these instant direction/speed changes and get support for quadratic Bezier movements for (almost) free.

Wurstnase commented 7 years ago

Maybe later. First I will take the current algorithm.

Wurstnase commented 7 years ago

temporal with ramps: [screenshot smooth_temporal]

acceleration ramping: [screenshot smooth_ramps]

New things I've learned: c_candidate in dda_step() can be negative. In that case, just step the next axis immediately. Otherwise other axes can overtake this 'negative' axis, because a 'negative' unsigned integer becomes huge.

You will find some small spikes. These are some of those small 'negative' delays.

in dda_step():

    move_state.last_time = move_state.time[dda->axis_to_step] +
                           dda->step_interval[dda->axis_to_step];

    do {
      int32_t c_candidate;
// other code...
      int32_t dda_c = 0x7FFFFFFF;
      for (i = X; i < AXIS_COUNT; i++) {
        if (move_state.steps[i]) {
          c_candidate = move_state.time[i] + dda->step_interval[i] -
                        move_state.last_time;
          if (c_candidate < dda_c) {
            dda->axis_to_step = i;
            dda_c = c_candidate;
          }
        }
      }

      if (dda_c < 0)
        dda->c = 0;
      else
        if (dda_c == 0x7FFFFFFF)
          dda->c = 0xFFFFFFFF;
        else
          dda->c = dda_c;

      // No stepper to step found? Then we're done.
      if (dda->c == 0xFFFFFFFF) {
        dda->live = 0;
        dda->done = 1;
        break;
      }
    } while (timer_set(dda->c, 1));

[screenshot: smooth_spike]

Current code: https://github.com/Traumflug/Teacup_Firmware/tree/acc_temporal_ramping It's my current local working version and needs some rework.

phord commented 7 years ago

You will need to fix the negative c_candidate some other way than just "step immediately". Those spikes likely represent lost steps because your stepper cannot react that quickly.

But I think you know this already. I just want to make it part of the discussion.

I'm excited to see where this goes. :+1:

Wurstnase commented 7 years ago

But I think you know this already.

I hadn't really thought about it. But it looks like most spikes come during deceleration. I guess this is because move_state.last_time is calculated at the beginning of a new interrupt, but it can change in between. During acceleration this is not a big issue, because the last timer is always slower. During deceleration the dda->step_interval[i] will become slower in dda_clock().

The solution could be calculating last_time after the do-while, before exiting the current interrupt.

Traumflug commented 7 years ago

You will need to fix the negative c_candidate some other way than just "step immediately". Those spikes likely represent lost steps because your stepper cannot react that quickly.

While the double step in the last picture clearly shows a bug, having negative c_candidates isn't fundamentally wrong. Unlike ACCELERATION_RAMPING, ACCELERATION_TEMPORAL always does only one step per dda_step() call. Which means, if three steppers should each do one step at the same time, three calls to dda_step() are required. The step on the first motor happens in time, but also consumes some time, so the other two steppers are somewhat behind, with a negative c_candidate.

Maybe this sounds like a severe penalty, but in practice, two motors stepping at the exactly same time are the exception: they happen with exactly 45 deg movements and a few other angles, only. Arbitrary angles mean evenly distributed steps on each individual motor, but arbitrarily distributed steps on all the motors together. A step loss happens only if two steps happen at the same time on the same stepper, which happens to be a bug.

Wurstnase commented 7 years ago

Step on the first motor happens in time, but also consumes some time, so the other two steppers are somewhat behind, with a negative c_candidate.

We check such things with "check_short", so negative numbers shouldn't happen. If they do, we need to enlarge the "check_short". https://github.com/Traumflug/Teacup_Firmware/blob/acc_temporal_ramping/timer-avr.c#L182

The double step in the last picture could only happen when the step_interval changed between interrupts. https://github.com/Traumflug/Teacup_Firmware/blob/acc_temporal_ramping/dda.c#L636-L637 In that case, especially for acceleration,

move_state.last_time = move_state.time[dda->axis_to_step] +
                           dda->step_interval[dda->axis_to_step];

dda->step_interval will increase, last_time is bigger than it should be, and in

c_candidate = move_state.time[i] + dda->step_interval[i] -
                        move_state.last_time;

c_candidate becomes negative.

phord commented 7 years ago

A step loss happens only if two steps happen at the same time on the same stepper, which happens to be a bug.

Right. And I hesitate to add that it is not necessarily a bug to have a "fast" step which is not immediate. The stepper driver operating in full-step mode has only four positions. When it receives two steps too fast for it to handle, with the motor already at some velocity, you might consider it ambiguous ("was it two steps forward or two steps back?"); they end up at the same position. But since the hardware is already moving, the physics will force the motor to move to the correct position anyway. You may experience some momentary torque loss, but you're not otherwise in much danger until you step three times too fast. But this is only a danger if you do not use microsteps.

Traumflug commented 7 years ago

And I hesitate to add that it is not necessarily a bug to have a "fast" step which is not immediate.

Thinking of it, I tend to agree. Avoiding these fast steps requires predicting the future, which is likely quite some computational work. They also shouldn't happen with more than 4 ms between two steps (2x the speed recalculation interval, < 250 steps/second). If that's still not sufficient: full step mode isn't exactly in fashion, and at any higher microstepping (1/4, 1/8, ...) such a fast step is leveled out mechanically, as you nicely described, as long as the stepper driver logic is fast enough to actually count two steps.