MarlinFirmware / Marlin

Marlin is an optimized firmware for RepRap 3D printers based on the Arduino platform. Many commercial 3D printers come with Marlin installed. Check with your vendor if you need source code for your specific machine.
https://marlinfw.org
GNU General Public License v3.0
15.95k stars 19.08k forks source link

Smoother motion by fixing calculating trapezoids and ISR stepping. #27013

Closed mh-dm closed 1 day ago

mh-dm commented 3 weeks ago

Description

Fix rounding directions in calculate_trapezoid_for_block(). Fix off-by-ones errors in ac/deceleration steps in block_phase_isr(). Half-initialize ac/deceleration_time to smooth the speed change shock that happens between segments, which is critical as jerk/deviation further adds to the shock.

Requirements

Affects all.

Benefits

The result is a smoother motion profile that follows the imposed acceleration limits with a well defined 0.5-1.5x error factor (or up to 2x if axis is starting from ~0). Errors are due to converting a real-valued motion profile into discrete numbers of steps. Fixes are general and improve S_CURVE_ACCELERATION too (no endorsement implied).

Enjoy the smoother motion or use more aggressive acceleration/jerk/deviation values for faster prints.

Before

Here's a logic analyzer capture for the X axis before the change showing massive acceleration peaks (highlighted in red): before_smooth~2 There's a bit of speed dependent measurement noise (most obvious on the right, just ignore it) since my current logic analyzer is limited to capturing at 2Mhz.

After

After, the peaks are gone/significantly reduced. Furthermore many times when switching between accel to plateau to decel the acceleration change happens over 2 steps instead of 1 (highlighted with green checkmarks). This means smoother motion / reduced jolt: after_smooth~2

Configurations

Tested by looking at the generated step/dir impulses with a logic analyzer. Tested with low jerk so it doesn't muddy the picture.

Here's the gcode for these captures:

M92 X80 Y80 ; 80 steps per mm
M204 S500
M205 X0.2 Y0.2 ; Low jerk
G0 F1750 X10 Y10 ; Using this as an adjustable delay
G0 F1200 Y13 ; Capture starts within here, note X is 0 speed
M204 S500
G0 F1500 X11.5
G0 F1200 X12
G0 F1800 X13.5
G0 F1400 X14.1
G0 F1600 X15.2
G0 Y13.1
M204 S250
G0 F2000 X15.33
M204 S350
G0 X15.55
M204 S500
G0 X15.85
M204 S750
G0 X16.5
M204 S420
G0 X16.77
M204 S250
G0 X17
M204 S500
G0 Y13.2
G0 F600 X17.4
G0 F1800 X18.95
G0 F600 X19.3
G0 F1800 X20.9
G0 F600 X21.3
G0 F1800 X23
G0 F600 X23.4
G0 F1800 X25.2
G0 F1500 X26
G0 F1200 X27
G0 F1800 X24.5
G0 F6000 X0 Y0 ; Capture usually ends within here

Related Issues

mh-dm commented 3 weeks ago

Now looking at changes with S_CURVE_ACCELERATION.

Before

I've marked the issues with red - it's mostly deceleration going abruptly to 0 instead of smoothly. before_smooth_scurve~2 There's a bit of speed dependent measurement noise (most obvious on the right, just ignore it) since my current logic analyzer is limited to capturing at 2Mhz.

After

after_smooth_scurve

Aside/Commentary

Of the times I've tried S_CURVE_ACCELERATION it never was a clear improvement. If comparing the two after graphs it's a bit more clear why:

There's also the issue that as soon as there's significant jerk/deviation allowed (therefore ~instant axis speed changes), the speed smoothness is gone. Maybe a new approach could be:

Anyway, the important part is this change also improves S_CURVE_ACCELERATION a bit.

thierryzoller commented 3 weeks ago

Kudos - This may well be the fix to some of the most hard problems to solve So how do we get the below issue re-opened with the new information above? https://github.com/MarlinFirmware/Marlin/issues/12491

thisiskeithb commented 3 weeks ago

So how do we get the below issue re-opened with the new information above? #12491

Issue has been reopened & unlocked.

thinkyhead commented 3 weeks ago

Thank you for doing this excellent analysis and correction. Everything looks great in those graphs. (And nothing pleases us more than pretty graphs.) We'll prioritize getting this reviewed and tested because this is obviously a vital area of the firmware!

Is it confirmed that the number of steps is not being modified in any way? I ask because in the past we have had issues where steps were off by +-1, so we had to take great care to use CEIL/FLOOR/ROUND/LROUND in a balanced way. Changing even one of these is always a little scary, so we want to be extra vigilant to make sure the step counts are still exact.

I agree that S-curve acceleration could use some adjustments to produce nicer looking curves. It may also be worthwhile to add S-Curve acceleration (or our best-guess for Fast B-Spline) to the FT Motion system after its extant patches are merged.

mh-dm commented 3 weeks ago

Is it confirmed that the number of steps is not being modified in any way? I ask because in the past we have had issues where steps were off by +-1, so we had to take great care to use CEIL/FLOOR/ROUND/LROUND in a balanced way. Changing even one of these is always a little scary, so we want to be extra vigilant to make sure the step counts are still exact.

I've explicitly aimed to only modify the speed profile and I didn't touch the bresenham part. I've opened up the captures and at least whenever there was a direction change the positions matched between before/after captures. I did an (admittedly simple) test print with a just slightly earlier version of this code.

I agree that S-curve acceleration could use some adjustments to produce nicer looking curves. It may also be worthwhile to add S-Curve acceleration (or our best-guess for Fast B-Spline) to the FT Motion system after its extant patches are merged.

From what I'm now reading FT Motion seems to be a vibration compensation mechanism. The whitepaper I've found mentioned 'inverted commands'. I might look into it more in the future.

mh-dm commented 3 weeks ago

@thinkyhead I added a couple comments on a couple of commits you've added (Making a comment here since I'm not sure if they're easily visible)

tombrazier commented 3 weeks ago

Of the times I've tried S_CURVE_ACCELERATION it never was a clear improvement. If comparing the two after graphs it's a bit more clear why:

* it spends a lot of time at ~0 acceleration for which it has to compensate by going up to 2x or more of the acceleration limit (the or more is referring to the ~10 step segment with off the charts acceleration)

This analysis is correct. It is good to see it so clearly demonstrated.

tombrazier commented 3 weeks ago

Note https://github.com/MarlinFirmware/Marlin/pull/26881 which also affects the planner.cpp lines

  // Limit minimal step rate (Otherwise the timer will overflow.)
  NOLESS(initial_rate, uint32_t(MINIMAL_STEP_RATE));
  NOLESS(final_rate, uint32_t(MINIMAL_STEP_RATE));
tombrazier commented 3 weeks ago

It's going to take me a while to get my head round these changes. I do have the same question as Scott about whether step counts might end up incorrect.

mh-dm commented 3 weeks ago

Re step counts: I've checked before/after captures and did a diff of the steps to verify they are correct. For the simple gcode above, the capture shows X going from 0 to 17.000mm then back to -10.000mm which matches the simple gcode above given a start right after G0 F1750 X10 Y10, going up to 'X27' then to G0 F6000 X0 Y0. But this is simple gcode. So I tried more complex gcode (modified beginning of sped up vase mode print) and (given a G4 P750 delay so the logic analyzer has time to start) the X positions are identical before/after for the first few seconds (until the logic analyzer cuts off). I've also checked the Y positions.

@tombrazier Thanks for mentioning #26881 - I've made a comment there.

On a separate note, I thought I should check how the new code interacts with oversampling. So I manually set an 2^3=8x factor:

constexpr uint8_t oversampling_factor = 3;`

Before with 8x oversampling

before_smooth_8xoversampling~2

Highlighted the spikes fixed or a bit improved (~).

After with 8x oversampling

after_smooth_8xoversampling

After I also tested with ADAPTIVE_STEP_SMOOTHING which looked similar to the above (as expected given my testing platform lpc1769).

sjasonsmith commented 3 weeks ago

It's going to take me a while to get my head round these changes. I do have the same question as Scott about whether step counts might end up incorrect.

If we have good fairly stable entry points into the impacted code we could add some unit tests to be sure it remains unchanged.

I believe I have an old branch around where I had done just that while attempting to resolve some jerk/accel issues. I need to see if I can find that and see whether those tests can be effectively updated to work with the current state of the planner.

tombrazier commented 3 weeks ago

I'm slowly working my way through this. First observation: really nice work with the logic analyser. Smoother movement is always better. I don't know whether instantaneous small speed changes will affect print quality or not but believe they might and I suspect they will at least be audible.

I think there are several discrete changes which I will review and comment on individually. @mh-dm can you confirm my understanding?

tombrazier commented 3 weeks ago

I don't think any of the changes will result in missed or extra steps. The way to test this is to build with M114_DETAIL enabled, SKEW_CORRECTION disabled and BACKLASH_COMPENSATION disabled and then run a print and check that M114 D shows the correct step count at the end of the print.

tombrazier commented 3 weeks ago

It's going to take me a while to get my head round these changes. I do have the same question as Scott about whether step counts might end up incorrect.

If we have good fairly stable entry points into the impacted code we could add some unit tests to be sure it remains unchanged.

I believe I have an old branch around where I had done just that while attempting to resolve some jerk/accel issues. I need to see if I can find that and see whether those tests can be effectively updated to work with the current state of the planner.

The code that ends up changing step counts is distributed all over the place and some of it is asynchronous. I think it would be difficult to unit test.

mh-dm commented 3 weeks ago

I don't know whether instantaneous small speed changes will affect print quality or not but believe they might and I suspect they will at least be audible.

@tombrazier I've noticed that audible judder is visible on 'sensitive' prints like spiralize/vase mode prints. In particular, the difference between non-oversampling and high oversampling/ADAPTIVE_STEP_SMOOTHING is both audible and can be visible on close inspection (if most other issues are resolved).

I think there are several discrete changes which I will review and comment on individually. @mh-dm can you confirm my understanding?

All the changes are with the goal of getting the ¯\/¯\\ as close to ideal but yeah, they can be reviewed line by line.

thinkyhead commented 3 weeks ago

The code that ends up changing step counts is distributed all over the place and some of it is asynchronous. I think it would be difficult to unit test.

The only way we can unit test motion is to add unit testing code that invokes G0/G1/G2/G3/G5, etc., wait for the move to complete, and then examine the values of many variables to make sure they match up to expectations. This is now easily possible, so the tests merely need to be designed.

As for unit testing something as complicated as a complete motion path to make sure the pulses aren't out of whack, that would require a bit more work. The Stepper ISR would need to write values of its variables into a large buffer, then the test would need to analyze the buffer for "bad stuff."

thinkyhead commented 3 weeks ago

From what I'm now reading FT Motion seems to be a vibration compensation mechanism.

It has that, but it's a near complete replacement for Marlin's motion system. FT Motion still calculates trapezoids the same way, but it runs on a fixed time clock and pre-buffers all the STEP/DIR events (including "do nothing" ones). As far as I can tell the changes here don't require any corresponding changes in FT Motion.

sjasonsmith commented 3 weeks ago

The code that ends up changing step counts is distributed all over the place and some of it is asynchronous. I think it would be difficult to unit test.

I looked back at what I had done earlier. I was focused on Classic Jerk problems at the time, which @mh-dm may have already fixed separately back in #26529. I had to refactor some things to isolate the problem logic so that I could write tests around it. The focus was on the transition speeds between blocks while applying jerk, and not in the same code sections being modified here. That refactoring is probably still worth doing so we can test CLASSIC_JERK, but doesn't help with this PR.

If we can extract and isolate important/complex/problematic logic, then it becomes possible for us to wrap them in tests that will be validated with every PR posted.

For something like step counts, that wouldn't be an end-to-end "G Code to Stepper Pulses" test. You would focus on testing the different blocks of logic on their own. If they are really dependent on each other, then we probably need to be looking at refactoring the code to bring those mechanisms together, to make testing them together possible.

mh-dm commented 1 week ago

Hi, I'm checking in on this PR.

tombrazier commented 1 week ago

There's the addition of NOMORE(initial_rate, block->nominal_rate); which is a pareto improvement - there's no case that goes from working to broken.

There is a case: if block->nominal_rate is slower than the stepper module can generate steps then the laser power calculations will be wrong. I prefer @HoverClub's original solution in #26881 which ensures block->nominal_rate is large enough rather than the solution in this PR which ensures that initial_rate and final_rate are small enough.

I now also understand what you meant @mh-dm when you said:

There are currently a few cases where block->entry_speed_sqr falls under minimum_planner_speed_sqr. Right now MINIMAL_STEP_RATE acts like a kludge for those cases and prevents the worst cases of acceleration spikes. Short explainer: The first step of a segment is always at intial_rate. If it's too slow, it will result in too much accumulated acceleration_time (see stepper.cpp) and that means a very high calculated second step rate, and the difference is the acceleration spike.

This is ultimately addressed in #27035. For now, though, this adds another reason to use the original solution from #26881, thereby keeping a lower bound on initial_rate and final_rate.

Anything else or that I've missed?

I mentioned two redundant lines of code here.

Apart from that, I think this PR is ready to go.

mh-dm commented 1 week ago

There's the addition of NOMORE(initial_rate, block->nominal_rate); which is a pareto improvement - there's no case that goes from working to broken.

There is a case: if block->nominal_rate is slower than the stepper module can generate steps then the laser power calculations will be wrong. I prefer @HoverClub's original solution in #26881 which ensures block->nominal_rate is large enough rather than the solution in this PR which ensures that initial_rate and final_rate are small enough.

But this case is already broken, in a more severe way. When block->nominal_rate is slower than the stepper module can generate steps, for all the architectures I know of it will also be lower than the MINIMAL_STEP_RATE of 120. In which case the existing NOLESS(initial_rate, uint32_t(MINIMAL_STEP_RATE)); code will force initial_rate and final_rate to be strictly greater than nominal_rate and the step computation math goes haywire. This haywire math affects all motion, not just laser machines.

I mentioned two redundant lines of code here.

I can't find what you're referring to in that link. Sorry, can you remake the comment?

tombrazier commented 1 week ago

But this case is already broken, in a more severe way. When block->nominal_rate is slower than the stepper module can generate steps, for all the architectures I know of it will also be lower than the MINIMAL_STEP_RATE of 120. In which case the existing NOLESS(initial_rate, uint32_t(MINIMAL_STEP_RATE)); code will force initial_rate and final_rate to be strictly greater than nominal_rate and the step computation math goes haywire. This haywire math affects all motion, not just laser machines.

Not in the original version of #26881 which has now been restored. That version has NOLESS(block->nominal_rate, uint32_t(MINIMAL_STEP_RATE));.

I can't find what you're referring to in that link. Sorry, can you remake the comment?

My comment was that deceleration_time = 0; on line 2320 is redundant because deceleration_time is going to be initialised at the start of the deceleration phase on line 2443.

And IF_DISABLED(S_CURVE_ACCELERATION, acc_step_rate = current_block->nominal_rate); on line 2442 is redundant because acc_step_rate is not going to be used again until the next block is initialised and it will be set to a new value at that point.

thinkyhead commented 1 week ago

From the discussion it sounds like I ought to merge the (now restored) #26881 first, then we should patch this up afterward. Is there any reason to delay merging #26881?

tombrazier commented 1 week ago

Is there any reason to delay merging #26881?

No, there isn't. It could be improved further but it's better as it is than not merging it and I can do the extra improvements at some point.

mh-dm commented 6 days ago

I can't find what you're referring to in that link. Sorry, can you remake the comment?

My comment was that deceleration_time = 0; on line 2320 is redundant because deceleration_time is going to be initialised at the start of the deceleration phase on line 2443.

Thanks. However, it's not redundant since that initialization was changed to be acceleration_time = deceleration_time = interval / 2;.

tombrazier commented 3 days ago

@mh-dm I have just realised most of my review comments were all still pending which is why only I could see them. Anyway, I have spotted some more things I want to discuss and am deleting comments that are now redundant. Hopefully I can actually publish my pending review soon.

tombrazier commented 2 days ago

I have reviewed all the code changes here and discussed questions I have with @mh-dm. Some of the changes are minor (e.g. the LROUND ones) and I am happy with them. The two major changes, i.e. fixing the off by one errors and the half-initialisation of accel / decel time can only make the motion smoother. Considering the extensive testing and really helpful graphs and analysis by @mh-dm, I think this PR can be merged. @thinkyhead

mh-dm commented 1 day ago

@thinkyhead In case you're preparing to merge, shouldn't #27031 go in before this? (or do you not want that one at all?)