Motion Inconsistency

CCS86 commented 6 years ago

Hi guys, I am trying to figure out what is going on here:

This is a spiral vase, running at 30 mm/s, accel @ 700 mm/s/s, jerk at 10 mm/s. Marlin 1.1.6

For the last ~10mm or so, I increased jerk to 15 mm/s, and the incidence of these "loops" seems to have dropped a bit.

Model: (scaled down in my print) Julia_Vase004-Bloom-_Solid.stl.txt

Either the print head is slowing down significantly while the extrusion rate stays constant, or the print head is deviating significantly from the true path. My theories were that I was either draining the buffer and inducing some jerkiness (like this issue: https://github.com/MarlinFirmware/Marlin/issues/9093), or that my combination of jerk and acceleration values was contributing to bad motion.

It seems like it is very hard to set jerk and max acceleration to values that work well in all situations. In this case, it seems like jerk being too low (10, which is the new "default"), coupled with low acceleration (700), causes Marlin to "trip" in segment-dense areas. I have definitely seen this when printing small circles. I'm not sure why the anomalies are so inconsistent here though. However, if you set jerk too high, it can cause too much resonance, during infill for example, because you are allowing max stepper torque to be applied while the acceleration bounds are (briefly) bypassed.

Visualizing the gcode in Repetier, it generally looks very clean (besides a handful of isolated "bad segments"):

[edit: not sure why image previews are broken. Click for full size]

Zooming in very close, the segment density seems generally similar between layers. But maybe it is the relative angle between segments that varies, "casting a loop" when the angle change steps over some threshold. I marked a couple that look sharper than others:

Here is a closer look at the print in question (now upside down, with the higher jerk setting at the bottom):

For reference, here is a different spiral vase print, with similar settings. It shows that mechanically, this printer is doing a repeatable job. But there are a few of these "cast loops" in the print as well:

CCS86 commented 6 years ago

I decided to try to prove which was the culprit and designed a couple models to do so. I output them from CAD with a 0.005mm tolerance, to create a very dense mesh and let the slicer control the segment length. They are both printed with the same spiral vase, 0.15mm layers, jerk=10.

Video of the print test results: https://youtu.be/qqDKktcJgmM

The first model is decreasing radius arc moves. From r=8mm down to r=1, then a couple sharp corners at various angles. I injected gcode to set acceleration changes every 5mm of Z height. Starting at 600, 1000, 1400, 2000. Print speed was (set to) 30 mm/s (but minimum layer time dropped this to 16 mm/s). I thought that I failed to reproduce the issue until I looked more carefully at the print. Only the r=1mm arc shows these motion glitches. They seem random enough that it's hard to say if acceleration was a driving factor.

I figured that short arcs connected by longer straight segments gives the buffer plenty of recovery time and designed a new model. This one has intersecting circles of different radii, to keep the printer chugging through dense code. This print has acceleration changed every 5mm too (but is only 15mm tall), 500, 1000, 2000. Jerk remained at 10. The first one printed at 16 mm/s and the second at 29 mm/s (via feed override).

At 16 mm/s the part is very clean. Only the two smallest arcs (r=0.5mm, r=0.75mm) show the glitch. At 29 mm/s the glitches are pretty bad and can be seen up through r=2.5mm. Again, no real dependence on acceleration.

I reran that 29 mm/s test with jerk set to 15 (not in the video). Interestingly, with jerk at 15 and acceleration at 500, it is almost glitch free. But at acceleration 1000 and 2000 it looks identical to jerk=10.

I don't feel like I've proven anything besides being able to reproduce the issue on a test part, with straight vertical walls, and that increasing speed makes the issue worse. Whether that is buffer related or not, I have no idea.

CCS86 commented 6 years ago

MotionTest010.gcode.txt

Motion Test 2-010.gcode.txt

AnHardt commented 6 years ago

Cura is set to keep segments lengths at or above 0.010mm. So, at 30 mm/s, we are pushing a maximum of 3000 segments/s. With my stepper settings, 4800 steps/s.

4800/3000 -> 1.6 steps per G-code move. With the usual MIN_STEPS_PER_SEGMENT of 6, Marlin will throw away ~4 of every 6 received G-code lines, which will drain the block buffer very fast. 4800 steps/s / 30 mm/s -> 160 steps/mm. 6 steps / 160 steps/mm = 0.0375 mm is the smallest useful setting for Cura.
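The arithmetic above can be checked with a small sketch. The function name is made up for illustration, and the default of 6 for MIN_STEPS_PER_SEGMENT is simply the value quoted in this comment; check your own configuration before relying on it.

```python
# Rough check of the numbers quoted above (values taken from this thread).
STEP_RATE = 4800.0       # steps/s at the quoted print speed
FEEDRATE = 30.0          # mm/s
MAX_MOVE_RATE = 3000.0   # G-code moves/s if every segment were 0.010 mm

steps_per_mm = STEP_RATE / FEEDRATE          # 160 steps/mm
steps_per_move = STEP_RATE / MAX_MOVE_RATE   # 1.6 steps per G-code move


def min_useful_segment_mm(steps_per_mm: float, min_steps_per_segment: int = 6) -> float:
    """Shortest segment that still contains min_steps_per_segment steps."""
    return min_steps_per_segment / steps_per_mm


print(f"{steps_per_mm:.0f} steps/mm, {steps_per_move:.1f} steps per move")
print(f"smallest useful segment: {min_useful_segment_mm(steps_per_mm):.4f} mm")
```

Anything shorter than that length in the slicer's resolution setting only adds lines for the firmware to parse without adding steps it can execute.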

Marlin can easily step 4800steps/second, but will fail to plan 3000 moves/second. The delta configurations suggest

#define DELTA_SEGMENTS_PER_SECOND 200

For cartesian printers that may be a bit more.

CCS86 commented 6 years ago

@AnHardt those values only represent maximums though. It does not say that every segment will be 0.01mm, just that no segment will be shorter than that. There might be a little bit of leeway to loosen the slicing tolerance without noticeably degrading the quality of the pure gcode. But check out this vertical walled model sliced at 0.010mm and 0.038mm:

AnHardt commented 6 years ago

The upper picture looks obviously better, but that's only what you sliced. Marlin will not print segments shorter than MIN_STEPS_PER_SEGMENT! So the printed result will look at best like the lower one - when the buffer does not run dry.

CCS86 commented 6 years ago

I guess that depends on how Cura handles its "resolution" setting vs how Marlin drops segments.

My guess is that Marlin would give you a more consistent result, especially on a straight walled part like this. Cura clearly is inducing random positional error for each layer.

Just because your gcode looks pure, doesn't mean that the printer can achieve it. But, if your gcode looks crappy, there is somewhere around a zero percent chance the printer will fix it.

CCS86 commented 6 years ago

I double checked this code in http://www.gcodeanalyser.com/ (thank you @Sebastianv650) and realized that minimum layer time had basically cut my speeds in half (16 mm/s and 29 mm/s).

Looking closely at the code, Cura seems to have done a nice job slicing consistently. Even though the "resolution" was set to 0.010mm, the segments in these small arc sections, are ~0.155mm long.

This means that the glitch shows up (just barely) at 16 mm/s / 0.155mm = ~103 segments a second or 2560 steps/s. The glitch is really bad at 185 segments/s or 4608 steps/s.

These moves are ~25 steps each. So minimum steps/segment doesn't seem like the issue.
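As a sanity check, here is a minimal sketch of the same calculation. The helper names are hypothetical; the 160 steps/mm figure comes from the earlier comment, and 28.8 mm/s is inferred from the quoted 4608 steps/s rather than stated directly.

```python
STEPS_PER_MM = 160.0  # from the earlier 4800 steps/s at 30 mm/s calculation


def block_rate(feedrate_mm_s: float, segment_mm: float) -> float:
    """G-code segments (planner blocks) consumed per second."""
    return feedrate_mm_s / segment_mm


def step_rate(feedrate_mm_s: float, steps_per_mm: float = STEPS_PER_MM) -> float:
    """Stepper pulses per second along the move."""
    return feedrate_mm_s * steps_per_mm


def steps_per_block(segment_mm: float, steps_per_mm: float = STEPS_PER_MM) -> float:
    """Steps contained in one segment of the given length."""
    return segment_mm * steps_per_mm


for feedrate in (16.0, 28.8):
    print(
        f"{feedrate} mm/s: {block_rate(feedrate, 0.155):.0f} blocks/s, "
        f"{step_rate(feedrate):.0f} steps/s"
    )
print(f"{steps_per_block(0.155):.1f} steps per 0.155 mm block")
```

With roughly 25 steps per block, these segments are well above a 6-step minimum, which is why the block rate looks like the more interesting number here.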

Sebastianv650 commented 6 years ago

I think it's all about segments per second. Around 100 also sounds quite plausible to me. As a reference, I tested my upper segments/s limit a while ago and got around 70. So you can see, the magnitude is comparable. The number itself will always differ as long as we have no standard way of benchmarking Marlin with a standard gcode.

CCS86 commented 6 years ago

What happens when you get to that 70 segments/s? Buffer is drained and motion hiccups?

It is segments/lines that Marlin chokes on, not steps?

Sebastianv650 commented 6 years ago

In my special test gcode 70 was the limit. Which means the produced blocks are eaten faster by the stepper ISR than the planner can produce the blocks. Therefore the buffer is drained and hiccups occur while the stepper is waiting for new blocks.

It is segments/lines that Marlin chokes on, not steps?

Hard to say that for every possible case, but in general yes. Calculating all the needed things for a new block takes time, while just executing steps is a much easier task. I checked the times a long time ago, so don't take them as valid for a recent Marlin version. Back then I found that planning a new block takes about 2 ms. After that, the planner has to optimize the buffered steps, adjusting their junction speeds and calculating the trapezoids. That needs roughly another 1 ms. In total, each new block needed 3.15 ms. One loop through the stepper ISR took "only" 50 µs, or 0.05 ms.
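To put those timings in perspective, here is a very rough budget sketch. It is a toy model and not how Marlin schedules work: it assumes one ISR pass per step, that the planner gets all remaining CPU time, and that nothing else (heaters, LCD, serial, SD) runs. The function name is made up for illustration.

```python
PLAN_TIME_S = 3.15e-3  # time to plan and optimize one new block (figure quoted above)
ISR_TIME_S = 50e-6     # one pass through the stepper ISR (figure quoted above)


def max_blocks_per_second(step_rate_hz: float,
                          plan_time_s: float = PLAN_TIME_S,
                          isr_time_s: float = ISR_TIME_S) -> float:
    """Upper bound on planned blocks/s under the toy model described above."""
    isr_load = step_rate_hz * isr_time_s   # fraction of each second spent stepping
    free_time = max(0.0, 1.0 - isr_load)   # fraction left over for the planner
    return free_time / plan_time_s


for rate in (0, 2560, 4800):
    print(f"{rate} steps/s -> at most {max_blocks_per_second(rate):.0f} blocks/s")
```

Even this optimistic bound drops as the step rate rises, and the limits actually reported in this thread (roughly 70 to 100 blocks/s) sit well below it, which fits the point that other firmware tasks also take time away from the planner.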

CCS86 commented 6 years ago

Did some more testing tonight, which seems to confirm that Marlin is choking on blocks/s.

I output a version of my motion test model at 3 different resolutions: 0.010mm, 0.020mm, 0.030mm

They were sliced with the same settings and bed position, then I spliced the code to print one after the other. I tried to use M76 pause between prints, but luckily I was hanging out because Marlin did not obey! I did the same thing as before, starting each print at an acceleration of 500 mm/s/s for the first third of the print, then 1000 and 2000. Jerk stayed at 10; speed at 30.

(Make sure to click for full size. The preview looks heavily banded from downscaling.)

On the 0.010 res print, it is quite a mess at the small end of the print. It was disappointing to see issues in even the 3mm radius arc @ accel=500. These mostly cleaned up at accel=1000 & 2000. On this r3 arc, the segment length is 0.323mm... about 93 blocks/s. I think it is interesting that acceleration seemed to have an effect here. gcodeanalyzer doesn't show it as having an effective velocity change. Thoughts?

The 0.020mm res print looks pretty great, except at the r0.4. Again, these clean up at higher accelerations. The arc segments are ~0.104mm.

The analyzer suggests these moves would be slowed to ~18mm/s. About 173 blocks/s. Hmm, that's quite a bit higher. Let's look at a clean arc...

This one is clean regardless of acceleration. Arc segments ~0.325mm. That's ~92 segments/s, almost the same as the first dirty arc I looked at. Why did this one print clean, and the other one struggle? Do these short arc segments not contain enough segments at their segment rate to be meaningful, in the context of the whole buffer? Are they inheriting issues from the arc before?

CCS86 commented 6 years ago

Spliced gcode: MT2.gcode.txt

Individual: MT2-010.gcode.txt MT2-020.gcode.txt MT2-030.gcode.txt

Models: MT2-010.stl.txt MT2-020.stl.txt MT2-030.stl.txt

CCS86 commented 6 years ago

Thinking about a test part more like this:

CCS86 commented 6 years ago

MT3.stl.txt

Sebastianv650 commented 6 years ago

These mostly cleaned up at accel=1000 & 2000. On this r3 arc, the segment length is 0.323mm... about 93 blocks/s. I think it is interesting that acceleration seemed to have an effect here. gcodeanalyzer doesn't show it as having an effective velocity change. Thoughts?

This can be explained. There are two elements that will play into this:

1. An accelerating or decelerating part of a segment needs significantly more time within the stepper ISR, because the ISR has to compute the next step rate and the time to wait until the next step ISR should happen. For the cruising part of a block, these two values are constant and precomputed for the segment. If the ISR loop takes more time, there is less processing time left for the planner calculation. As a result, you can add fewer segments per second to the buffer.
2. The second part of the answer concerns the planner optimization and junction speed calculation. Let's assume we are running at a speed that is not too high, so the buffer is somewhat full at all times. After adding a new block as described earlier, the planner goes through all of the segments inside the buffer and calculates (simplified notes):
- From last segment to first (reverse pass): It assumes the last (newest) block in the buffer will be the last block ever. This means the printer has to be able to come to a full stop at this block, so the final speed of the last block has to be 0 mm/s. With the segment length given in mm and the acceleration, it can compute the start speed of this segment from which the printer can decelerate to that final speed of 0. This start speed is then used as the final speed of the second-last block in the buffer, and again the max. allowed start speed of the second-last block is calculated. And so on until the first segment.
- From the first segment to the last (forward pass): Now it checks that no junction speed exceeds the speed it can reach with the given acceleration, starting from the current print speed. Again this is done segment by segment.
- Finally the trapezoid is calculated for each segment: how many steps we need to accelerate and decelerate to reach the given speeds. If you have a high acceleration value, even with short segments the planner will reach a point where the nominal speed of the segment is reached. Thanks to optimization flags on each block, the planner avoids calculations on segments that don't need them. If the final print speed is reached after accelerating through, say, 3 segments, there is no need to recalculate all the other segments and calculate a new trapezoid for them. Therefore the segments/s will be higher. With a low acceleration, it may never reach the final print speed, due to the 16-segment buffer size combined with the short segment length. This means it always has to recalculate every segment after a new one is added, which results in a lower segments/s capability.
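The reverse and forward passes described above can be sketched as follows. This is a simplified illustration, not Marlin's planner code: it ignores jerk and junction limits, axis-specific acceleration, the optimization flags and the trapezoid step counts, and all names are made up. It uses the constant-acceleration relation v_end² = v_start² + 2·a·d.

```python
import math


def reachable_speed(speed: float, accel: float, length: float) -> float:
    """Speed reachable after accelerating over `length` from `speed`.

    By symmetry it is also the highest speed from which we can still
    decelerate down to `speed` within `length`.
    """
    return math.sqrt(speed * speed + 2.0 * accel * length)


def plan_junction_speeds(lengths, nominal, accel, start_speed=0.0):
    """Return junction speeds; entry[i] is the speed at the start of block i."""
    n = len(lengths)
    entry = [nominal] * (n + 1)
    entry[0] = start_speed  # current speed of the machine
    entry[n] = 0.0          # newest block must be able to end in a full stop

    # Reverse pass: limit each entry speed so the following exit speed is reachable.
    for i in range(n - 1, 0, -1):
        entry[i] = min(nominal, reachable_speed(entry[i + 1], accel, lengths[i]))

    # Forward pass: no junction may be faster than acceleration allows.
    for i in range(n):
        entry[i + 1] = min(entry[i + 1], reachable_speed(entry[i], accel, lengths[i]))

    return entry


# 16 buffered blocks of 0.1 mm at a nominal 50 mm/s (hypothetical values).
slow = plan_junction_speeds([0.1] * 16, nominal=50.0, accel=500.0)
fast = plan_junction_speeds([0.1] * 16, nominal=50.0, accel=20000.0)
print(f"accel 500:   peak junction speed {max(slow):.1f} mm/s")
print(f"accel 20000: peak junction speed {max(fast):.1f} mm/s")
```

In the low-acceleration case the peak never reaches the nominal speed, so every junction in the buffer depends on the newest block and has to be revisited each time one is added. In the high-acceleration case most junctions already sit at nominal speed, which is the situation where a real planner can skip work.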

CCS86 commented 6 years ago

I went ahead and printed the MT3 model I posted. It was output at a 0.020mm resolution, which doesn't seem extravagant. The facets are easily visible in the model and the print. For some things, it would definitely be nice to print above these levels of detail.

The print ran at 50 mm/s, which is a little faster than I usually print outer walls, but way below a lot of speeds I see people printing at. This would be a nice speed to print inner walls, slowing for the outer wall. Again, the first 1/3 was at accel=1000, then 1500, then 2000. The analyzer shows that some of the tighter radii never reach 50 mm/s, even at accel=2000.

Again, results not so good, but very inconsistent. I would think that if we were running into the calculation limits of Marlin running on a 2560, it would look universally bad. What I am seeing is maybe 10-20% bad, scattered throughout, and the rest really good. How can we explain that? Also, this time increasing acceleration seems to make things worse.

Looking at the blocks in the r2.5 section, they are ~0.55mm. Around 91 blocks/s

It's just disappointing that I can't even get clean paths at 50mm/s on inner walls, because these defects will show right through the outer wall. This forces all wall paths to be printed at slower speeds and increases the delta between infill and wall speeds, which on my bowden printer is not ideal. Does this seem "normal" to you guys? Or is this something potentially new in Marlin 1.1.x? I have not rolled the firmware back to test yet.

Sebastianv650 commented 6 years ago

What I am seeing is maybe 10-20% bad, scattered throughout, and the rest really good. How can we explain that?

Marlin is not only printing. We have heaters to manage, an LCD to update, serial communication going on, etc. The LCD is a good example of something that has already been optimized. Some time ago, each LCD update was visible in the print when you pushed the firmware to the limit as you are doing, because the update needed a lot of time, and during this time no new segment could be added to the buffer. Navigating through the LCD menu while printing was even worse. After a lot of optimizations, it's much better nowadays. But it still has an influence.

Does this seem "normal" to you guys?

Yes, I'm used to it. What makes things worse in your case with a bowden system is that each short speed drop will create a visible blob due to the high compression stored inside the bowden filament section. This visible fault is much less pronounced (but still there) with a direct drive system. I'm not happy with that, but as long as we are using such low-power chips as the ATMega (and I'm not sure how much better it is on the new Marlin 2.0 boards) there is no way around it, especially as long as we want all that "bling-bling" like displays, bed leveling, skew compensation or even kinematic printers.

Or is this something potentially new in Marlin 1.1.x? I have not rolled the firmware back to test yet.

Well, a comparison would be nice. But getting comparable, stable numbers between test runs might not be easy as long as we have no standard benchmark procedure that everyone can use. But I think you are at a quite scientific level with your segments/s readings; if you want to compare them, that would be interesting! Just make sure your configuration files are as comparable as possible. I'm quite sure Marlin got faster over time. A lot of optimizations were done to the planner and also to blocking elements like the display update. When I measured how long a planner run and a stepper ISR loop take, which was on a pre-1.1 release RCBugfix I think, it got faster over the 1-2 versions I compared.

CCS86 commented 6 years ago

Yes, I realize that Marlin is handling quite a few other functions, besides motion. It seems like these would fall into a steady state situation though (roughly). The temps and fan speeds stay the same, so the PID duty cycles should be fairly consistent. Read rate from the SD card should be stable on a print like this. LCD refresh is a constant, no?

I agree that 1.1 seems much better when interacting with the display. 1.0 used to pause the print when I clicked. Now it does not. There is a request to use the Reprap discount full graphic smart controller's integrated text display, instead of bitmaps to speed things up.

Speaking of... what can I do to optimize Marlin and improve these motion glitches on my Ultimaker? I am already not using bed leveling, skew correction, etc. I'm guessing most of the hungry stuff is disabled by default? I wish there was just an easy drop-in hardware replacement to give it more grunt.

CCS86 commented 6 years ago

I tried to roll back to Marlin 1.0, but wasn't able to.

I'm running Arduino 1.8.5, maybe there is an incompatibility. After fixing a compiling error, it just wouldn't connect to the 2560 board. Error:

avrdude: ser_open(): can't open device "\\.\COM3": Access is denied.

Reflashed 1.1.6 no problem.

Sineos commented 6 years ago

I think this: https://github.com/KevinOConnor/klipper or https://github.com/MarlinFirmware/Marlin/pull/7047

That should be the way to go for Marlin as well. There is an abundance of very cheap but spicy boards (RPi 3, Odroid XU4 etc.) that could easily take any calculation thrown at them. Why not have an Arduino-like board as slave or client while a master / server does all the calculation and just sends ready commands to be executed? From a budgetary PoV this makes no difference either: a cheap ATMega combined with an RPi 3 is about the same price (even cheaper!) as a shiny Azteeg X5 mini.

Sebastianv650 commented 6 years ago

@CCS86 I see no straight-forward optimizations. Both ways @Sineos mentioned are very promising approaches, but as long as #7047 is not fully implemented it's not a drop in solution.

There are a few things in configuration_adv that you can have a look at:

- #define DOGM_SPI_DELAY_US can speed up the LCD update. Use trial and error to find the smallest number that does not corrupt the LCD view. Mine can live with 4us, for example.
- Try whether using #define TX_BUFFER_SIZE and / or #define RX_BUFFER_SIZE improves your speed.
- Try #define NO_VOLUMETRICS if you don't need them.
- #define NO_WORKSPACE_OFFSETS is another interesting one.

CCS86 commented 6 years ago

Thanks @Sineos, I really like the idea of bolting on some computational horsepower. And thanks @Sebastianv650! I'll try playing with those settings. Wouldn't increasing the LCD refresh take away from planner calcs?

I switched to 8 microstepping on X and Y today and did an identical MT3 print. The results are very interesting and show that it isn't purely a blocks/s issue I am running into. 16 microstepping seems to come with a computational penalty.

Video of the results: https://youtu.be/wC9cPla1uk8 (I stabilized the video a bit. Hard to handhold a 60mm macro lens!)

Sebastianv650 commented 6 years ago

Wouldn't increasing the LCD refresh take away from planner calcs?

In a nutshell this option decreases the pauses between signals sent to the LCD. Therefore the lower the delay, the more time we have for the planner.

I switched to 8 microstepping on X and Y today and did an identical MT3 print. The results are very interesting and show that it isn't purely a blocks/s issue I am running into. 16 microstepping seems to come with a computational penalty.

I watched the video, but I didn't get your statement. Your print is much better at 8 microstepping, meaning half steprate and therefore more time for the planner. So why do you think there is something more going on?

CCS86 commented 6 years ago

#define DOGM_SPI_DELAY_US can speed up the LCD update. Use trial and error to find the smallest number that does not corrupt the LCD view. Mine can live with 4us, for example.

In a nutshell this option decreases the pauses between signals sent to the LCD. Therefore the lower the delay, the more time we have for the planner.

Cool, I will give that a try.

Try if using #define TX_BUFFER_SIZE and / or #define RX_BUFFER_SIZE improves your speed.

My current build has TX_BUFFER_SIZE defined at 0, and RX_BUFFER_SIZE is commented out. Do you have a suggestion for starting values, and what to watch out for?

Try #define NO_VOLUMETRICS if you don't need them.

I can't find that parameter in my 1.1.6 build files. Maybe it was added for 1.1.8?

#define NO_WORKSPACE_OFFSETS is another interesting one.

Done

I watched the video, but I didn't get your statement. Your print is much better at 8 microstepping, meaning half steprate and therefore more time for the planner. So why do you think there is something more going on?

That was in reference to our conversation here. You said:

I think it's all about segments per second.

I replied:

It is segments/lines that Marlin chokes on, not steps?

Hard to say that for every possible case, but in general yes. Calculating all the needed things for a new block takes time, while just executing steps is a much more easy task.

CCS86 commented 6 years ago

Still struggling with this. Concentric infill at 60 mm/s :/

https://www.youtube.com/watch?v=LOkfg-KbApI

Sebastianv650 commented 6 years ago

Interesting example. Can you upload the gcode so I can have a look at this infill?

CCS86 commented 6 years ago

Circle.gcode.txt

Sebastianv650 commented 6 years ago

As expected, the slicer is taking all the segments from the outer perimeter and just scaling them down for the inner rings. Therefore the segments / second approaches infinity in the center.
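A quick sketch of why that happens: if every ring keeps the same number of segments, the segment length shrinks with the radius while the feedrate stays the same. The 120 segments per ring and the radii below are hypothetical and not taken from Circle.gcode; only the 60 mm/s comes from the thread.

```python
import math


def ring_block_rate(segments_per_ring: int, radius_mm: float, feedrate_mm_s: float) -> float:
    """Blocks per second on a circular ring split into a fixed number of segments."""
    segment_mm = 2.0 * math.pi * radius_mm / segments_per_ring
    return feedrate_mm_s / segment_mm


for radius in (20.0, 10.0, 5.0, 1.0, 0.5):
    rate = ring_block_rate(120, radius, 60.0)
    print(f"r = {radius:4.1f} mm: {rate:7.0f} blocks/s")
```

The rate scales as 1/r, so somewhere on the way to the center it crosses whatever blocks/s the firmware can sustain, no matter how clean the outer rings print.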

CCS86 commented 6 years ago

Shoot, I should have checked that!

boelle commented 5 years ago

@CCS86 did you check it? :-D

boelle commented 5 years ago

@thinkyhead i think we can close this one

boelle commented 5 years ago

@thinkyhead Another one that needs to be closed. Sorry, I got bored.

github-actions[bot] commented 4 years ago

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

MarlinFirmware / Marlin

Motion Inconsistency #9219
