MarlinFirmware / Marlin

Marlin is an optimized firmware for RepRap 3D printers based on the Arduino platform. Many commercial 3D printers come with Marlin installed. Check with your vendor if you need source code for your specific machine.
https://marlinfw.org
GNU General Public License v3.0

*updated* Graphical LCD refresh makes the motors stutter (Arduino2 and SKR1.3, all 32bit boards?) #12461

Closed mignolo4 closed 5 years ago

mignolo4 commented 5 years ago

Hi, after I found the way to make the LCD work (see issue #12294), I'm experiencing some stutters when moving the x-y axes on my self-made Delta-style 3D printer (RADDS + FULL_GRAPHIC_SMART_CONTROLLER).

If I send a simple movement like G1 X-140 I can clearly see that the movement is not smooth. edit: video added - you can find it after the picture I posted below

I tried to narrow down the problem and I found that the movement is smooth if:

Already tried:

No problem at all with other firmwares! Am I doing something wrong?

thinkyhead commented 5 years ago

You may have the display controller configured in such a way that it takes a very long time to update the display, in which case the planner could experience starvation. However, Marlin times the duration of display updates and throttles back the frequency if it detects that planner starvation is imminent. So the display update time would have to be pretty extreme for planner starvation to occur.

thinkyhead commented 5 years ago

It would help to have your configuration files, as requested in the issue template. Please put them into a ZIP file and drop them on your next reply.

AnHardt commented 5 years ago

Would also be interesting to know if moving slow stutters more than moving fast - much faster.

AnHardt commented 5 years ago

My suspicion is that you are moving slowly. Because of segments_per_second the sub-moves become small, very small: much smaller than min_steps_segment. So only every now and then does a valid, large enough segment pass the planner and enter the planner buffer. The algorithm for delaying the display update needs an exception (it does not delay when the planner buffer is completely empty), and because the buffer is almost always empty, it has no chance to fill up during the time-consuming display updates. Also, the time needed to step such a short segment is too short to get the next valid one planned.

The solution would be to limit the shortness of the sub-moves to about min_steps_per_segment. https://github.com/AnHardt/Marlin/commit/793a50bc6ff42f5e6ae38921c10478d387cae0fb can limit the size of the subsegments to about the right magnitude. The smallest segments will still be too short. But two of them will always give a usable block, while now it's possible, if slow enough, to need 20, 200, 2000 or more subsegments to make a usable block.
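To put rough numbers on this theory, here is a minimal sketch of the arithmetic, not Marlin's actual planner code. It assumes the move is cut at a fixed rate in time (segments_per_second), so slower moves produce shorter sub-moves. The struct, the function name and the values used below (200 segments per second, 80 steps/mm) are hypothetical and not taken from the reporter's configuration.

```cpp
#include <cstdint>

// Result of cutting one straight move into time-based sub-moves.
struct SegmentEstimate {
    std::uint32_t segments;     // number of sub-moves
    double segment_mm;          // length of one sub-move in mm
    double steps_per_segment;   // motor steps covered by one sub-move
};

// Hypothetical helper: estimates the sub-move size when the number of
// segments is proportional to the duration of the move, as described above.
// Assumes feedrate_mm_s > 0.
SegmentEstimate estimate_segments(double distance_mm, double feedrate_mm_s,
                                  double segments_per_second, double steps_per_mm) {
    const double seconds = distance_mm / feedrate_mm_s;
    std::uint32_t segments = static_cast<std::uint32_t>(segments_per_second * seconds);
    if (segments < 1) segments = 1;  // always at least one segment
    const double segment_mm = distance_mm / segments;
    return {segments, segment_mm, segment_mm * steps_per_mm};
}
```

With these assumed values, a 100 mm move at 50 mm/s is cut into 400 sub-moves of 20 steps each, while the same move at 0.5 mm/s is cut into 40000 sub-moves of about 0.2 steps each, far below any plausible min_steps_per_segment threshold.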

Even when you switch off the display you currently get lots of too-short unconnected moves, but the valid ones will be evenly distributed in time, without the breaks for the display update.

Just a theory!

ADVANCED_OK can give the answer - the state of the planner (and other) buffer.

mignolo4 commented 5 years ago

edited: added picture + video, look at the bottom!

Hi, first of all, thank you for your replies.

@thinkyhead: I'm sorry Scott but I think there's not much to see in my configuration files because, as I said, they are almost standard! I downloaded the latest release and changed 4 parameters for a quick test on my delta (board, endstops, thermistors, and obviously the lcd); that's all.

What's not standard is the way I make the lcd work. Because if I only enable the graphical lcd #define REPRAP_DISCOUNT_FULL_GRAPHIC_SMART_CONTROLLER everything works as expected (but I can't see anything on the display)

Then, when I change this in the ultralcd_DOGM.h file:

#elif ENABLED(U8GLIB_ST7920)
  // RepRap Discount Full Graphics Smart Controller
  #if DISABLED(SDSUPPORT) && (LCD_PINS_D4 == SCK_PIN) && (LCD_PINS_ENABLE == MOSI_PIN)
    #define U8G_CLASS U8GLIB_ST7920_128X64_4X_HAL
    #define U8G_PARAM LCD_PINS_RS // 2 stripes, HW SPI (shared with SD card, on AVR does not use standard LCD adapter)
  #else
    #define U8G_CLASS U8GLIB_ST7920_128X64_4X
    #define U8G_PARAM LCD_PINS_D4, LCD_PINS_ENABLE, LCD_PINS_RS     // Original u8glib device. 2 stripes, SW SPI
    //#define U8G_CLASS U8GLIB_ST7920_128X64_RRD
    //#define U8G_PARAM LCD_PINS_D4, LCD_PINS_ENABLE, LCD_PINS_RS 

the lcd starts to work but the movements start to stutter!

@AnHardt I understood 20% of what you said :-) because I'm not a programmer or similar but only a passionate guy following this amazing project, sorry! Anyway, I had a look at your commit but I noticed the code has changed since then and I was not able to insert those lines in the latest versions. Concerning the speed, I don't think I'm moving slowly because the stutter becomes worse and more visible if I increase the speed (if this was what you meant).

I am available for any test or solution you suggest me.

Here the "quality" of the print with the stutter: 20181118_142505

... and a video showing THE stutter :-) https://youtu.be/SYVsJR98xgk

mignolo4 commented 5 years ago

@thinkyhead

A little big step in the right direction (I suppose):

I tried more or less 30 different versions going back in the past and Marlin-604b804125571782a11ff819b29a062b23879ba0 (27sept2017) doesn't show the issue!

I'm going to find out when exactly the problem was introduced! Do you think it's a good idea or am I wasting my time?

marcio-ao commented 5 years ago

@mignolo4: Usually if you can find the commit in which something started going bad, it's a great first step. However, over a year ago.... that's a lot of commits to look through!

mignolo4 commented 5 years ago

It's definitely something that happened between 27th Sept and 7th Oct 2017! Going back to check, uhm, not really, going to bed now! :-)

thinkyhead commented 5 years ago

Interesting. If you can narrow it down to a specific day, even better!

mignolo4 commented 5 years ago

@thinkyhead

I think I found it! The last version working well with my configuration is commit: https://github.com/MarlinFirmware/Marlin/commit/604b804125571782a11ff819b29a062b23879ba0 If I understand correctly how commits work, the one that came immediately after was https://github.com/MarlinFirmware/Marlin/commit/88f9194168c120093cc271689be0578307c1edca

... and that one last commit introduced the stutter!

Now it's up to you Scott to spread the magic!

thinkyhead commented 5 years ago

I’m glad to hear you found a point in time that works. The referenced commit would not have had any effect, so we’ll have to look at the general timeframe.

mignolo4 commented 5 years ago

Does it help if I check one by one all of the commits of that day to see if some of them work well? So you can focus on the others?

EDIT: checked some more commits of the same day and only https://github.com/MarlinFirmware/Marlin/commit/c869dc97452ed78c6fcf4f877a40afc8eaa49c45 is SMOOTH.

To check my theory I went back and compiled some versions from 26th and 24th Sept, expecting smooth movements: surprisingly enough, I found out that those versions are NOT smooth either... so now I'm confused. I think there's not much more I can do with my knowledge.

mignolo4 commented 5 years ago

After a lot of time I'm here again to add some information. I had the possibility to try another LCD, non graphical this time (RRD Smart Display aka LCD2004) with the same configuration and it works flawlessly without any problem. Any news on that issue?

boelle commented 5 years ago

@mignolo4

This Issue Queue is for Marlin bug reports and development-related issues, and we prefer not to handle user-support questions here. (As noted on this page.) For best results getting help with configuration and troubleshooting, please use the following resources:

After seeking help from the community, if the consensus points to a bug in Marlin, then you should post a bug report. Before posting a bug report, please test with bugfix-2.0.x to check whether the problem is gone.

mignolo4 commented 5 years ago

@boelle sorry, I know I'm not expert at all but honestly I thought that was (is) a BUG and I was trying to be helpful in some way

boelle commented 5 years ago

uppps... i just saw the question label

but is the bug still there in latest bugfix 2.0?

mignolo4 commented 5 years ago

:-) really don't know, last time I tried was one month ago; I'll try again soon

boelle commented 5 years ago

maybe close this one and open a new one when you have tested?

mignolo4 commented 5 years ago

Just tested with no good news :-( Do you think it's better to start a new issue for a better visibility?

boelle commented 5 years ago

nope, we just keep this one open

mignolo4 commented 5 years ago

I bought a new SKR v1.3 board (LPC1768 32bit board) and the stutter is still there (again: with a delta config). So now it's quite obvious that it's a display issue.

edit: I can give you guys a new piece of information I had never noticed before: The stutter is there only in the main screen! It seems that when the screen updates with new coordinates of the axes the stutter appears. Everything is buttery-smooth if I enter the menu!

Here is a video with the slowmotion function because it was the only way to see it in a video :-) Pay attention to what happens in the lower-left corner in the left motor https://youtu.be/SgVZHnPevCo

gloomyandy commented 5 years ago

@mignolo4 Are you still using a modification to get the display to work? If so did you try it with just the standard configuration? If you are still using that modified config, please post the details of the display you are using. Please don't just say it is a FULL_GRAPHIC_SMART_CONTROLLER because we need to know exactly which one it is, there are many different manufacturers of these displays. Please post pictures of the boards and a link to the supplier if possible. Many, many people are using the SKR board with a REPRAP_DISCOUNT_FULL_GRAPHIC_SMART_CONTROLLER with no problems (I have two sitting here in front of me working fine), so we need to identify why yours is different.

mignolo4 commented 5 years ago

@gloomyandy no modification with the skr1.3 (only the display define) since it has everything you need onboard for the lcd, so exp1 and exp2 connected directly to the lcd and even working with 2 (two) meters flat cable :-)

Before I write the info about the lcd board... are you using a delta configuration? Because only with that one the issue occurs!

edit: pictures added 20190504_141613 20190504_141635

gloomyandy commented 5 years ago

So just to be clear your display works fine with the standard configuration with the SKR board and a recent build of Marlin? You are not making any changes to the U8GLIB settings (as you reported you had made before). The only display related change you have made is to define the display type? What have you got it set to?

When did you download the version of Marlin you are using on the SKR board? I assume you are using standard Marlin not the version from Bigtreetech?

You should probably upload your current configuration files for this new board.

mignolo4 commented 5 years ago

To be clear: a week ago I compiled the very last bugfix version of marlin (not the bigtreetech fork), I chose the delta config from the examples, I defined the "#define REPRAP_DISCOUNT_FULL_GRAPHIC_SMART_CONTROLLER" and the other settings to use the skr board. No other modified setting or files ;-)

But... as I stated before, it's not a SKR problem, my previous board was an Arduino2 !

gloomyandy commented 5 years ago

I'm not suggesting that your problem is SKR related. But your previous report contained changes to files that may have been considered to contribute to the problem. If you are no longer making those changes then the issue will hopefully be clearer.

hobiseven commented 5 years ago

@gloomyandy Well, seems this is really looking similar to what we experience on our Alfawise and STM32 ( see #12403 ) !! i will go into the menu as written above and check if we get also the effect, as we get it in the main screen.

gloomyandy commented 5 years ago

@hobiseven Yes I was thinking that! The fact that when in a menu there is no stutter is probably not surprising, I suspect that there is no screen update in that case (or at least the update is different). When in the home screen the display is being updated to show, x,y,z temperatures etc.

hobiseven commented 5 years ago

@gloomyandy We are now starting to replace parts of the code / LCD updates with DMA accesses. I will measure the possible speed improvements tonight. I now have a really simple logic analyser trigger, allowing me to detect planner starvation: no active clocks on X and Y for more than 2ms... We will try different DMA combinations. Our LCD is a 320x240, and to make a 2x zoom, we have to duplicate data, which also takes time. We will keep you updated on the progress. One thing is clear: the committed MKS robin code in marlin 2.0.X does not work properly as is, for 150mm/s prints.

hobiseven commented 5 years ago

@gloomyandy Well, I have some bad news... We have cleaned up and sped up our display code as much as we could using DMA running in parallel to the CPU, and we have pretty much the same result. As I now have a very simple trigger condition on the logic analyser, I have a very robust test that gives me reliable results: No clocks on X+Y for more than 15ms > Trig. I have a test gcode which is a large flat washer, to avoid Z movements and have only circular motions. With that gcode, with all improvements or no improvements at all on the LCD (DMA or no DMA), we have about 60 "jumps", or clock holes, in a few minutes. Same as before, 16.25ms long.
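For anyone who wants to replicate this test from an exported capture rather than a live trigger, the condition can be sketched as below. This is only an illustration of the "no clocks for longer than a threshold" rule, assuming the X and Y step edges have already been merged into one sorted list of timestamps in microseconds. The struct and function names are hypothetical and not tied to any logic analyser software.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One detected hole in the step clock.
struct StepGap {
    std::uint64_t start_us;   // timestamp of the last step before the hole
    std::uint64_t length_us;  // time until the next step
};

// Scans sorted step timestamps and reports every interval longer than
// threshold_us, mirroring the trigger "no clocks on X+Y for more than N ms".
std::vector<StepGap> find_step_gaps(const std::vector<std::uint64_t>& edges_us,
                                    std::uint64_t threshold_us) {
    std::vector<StepGap> gaps;
    for (std::size_t i = 1; i < edges_us.size(); ++i) {
        const std::uint64_t delta = edges_us[i] - edges_us[i - 1];
        if (delta > threshold_us) gaps.push_back({edges_us[i - 1], delta});
    }
    return gaps;
}
```

With a 15000 µs threshold, a 16.25 ms hole like the ones described above would be reported, while normal step spacing would not.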

We then removed all the display code and touchscreen code, we removed the second serial port, as well as made sure to remove all compiler debug flags > We still get 5/6 clock holes per gcode print. Please note that sometimes it is less. The printer is controlled via octopi.

Do you know any of the people using marlin 2.0.X who could check whether they have similar jumps on STM32 or other CPUs, using a logic analyser? Our board is a STM32F103VE @ 72 MHz.

@mignolo4 This is a test you can try. @pinches You might find the information above interesting. You might want to replicate the logic analyser test.

I work with @tpruvot who really made a very clean LCD code, trying to optimize / remove un-needed code. We would need some sort of code profiling tool I think, to find out what is going on.

gloomyandy commented 5 years ago

Unfortunately all my boards are LPC176x based so I probably can't help. What I'm trying to understand here is what is it that is stopping the main stepper interrupt from running. That should be the highest priority of all (or at least very close to), so something must either be running at a higher priority or must have disabled interrupts. You could perhaps add code into the LCD refresh routine to see if it is ever called with interrupts turned off, I don't think it should be. We may then be able to track down what is going on.

hobiseven commented 5 years ago

Hmmm, well, what is strange is that even with all the LCD code removed, we still see some of those clocks stopped. What I am thinking is that the ISR might still run, but there is no step command passed to the ISR. We are going back to placing a GPIO pin in the ISR routine... I will do that. One question: I was looking for a high-level architecture document of the Marlin code. Does this exist anywhere? Your boards are LPC176x... Well, it would be quite interesting for you to hook an LA on that, and check if you have any of those strange "holes". You actually do not even need an LA; you can feel the machine vibration by hand. As soon as those clocks are missing, of course the machine vibrates a bit more. OK, I will put the GPIO for the ISR back on the Zmax pin. Let's see.

gloomyandy commented 5 years ago

It may not be the LCD code that is the problem. It could be that some other part of the code is turning off interrupts but then calling code that eventually updates the LCD (but with the interrupts off), it may be that most of the time the interrupts are only disabled for a very short time (so no problem), but from time to time the code path is such that the LCD update gets called which takes a long time with the interrupts disabled (which is a problem).

hobiseven commented 5 years ago

Well, we will try to debug this! Let’s see where we end up with gpio

hobiseven commented 5 years ago

@gloomyandy Regarding this thread, it is closed for me, as we have proved that the gaps in the stepper clocks are NOT caused by the LCD code. They are simply aggravated by it. It is either a lack of CPU time, which I doubt, or an overall scheduling issue in the code. As I said, we will continue debugging, and look at why the step pulses disappear, but this is not directly an LCD code issue. But the bug remains.

Thank you for your help.

mignolo4 commented 5 years ago

@thinkyhead Sorry but I do not understand the reason why this issue has been closed...

Yesterday I tried the most recent bugfix version but the problem is still there. I see other people on facebook and on some forum having this issue too.

EDIT - some new clue:

solrac8874 commented 5 years ago

I had problems with bad stuttering too. Not sure if it was the root cause, but when I disabled ADAPTIVE_STEP_SMOOTHING in Configuration_adv.h the stutter went away; at least as far as I could tell. Prints well now.

It's kind of weird because before when using a MKS Gen L board I didn't have the same problems with ADAPTIVE_STEP_SMOOTHING

mignolo4 commented 5 years ago

Already tried that and it's not my case :-(

DerAndere1 commented 5 years ago

@hobiseven regarding high level code architecture. I made a very rough code overview for my own purposes a year ago. It's probably not what you need, but here is the link anyway: https://github.com/DerAndere1/Marlin/tree/Marlin2ForPipetBot/docs

thinkyhead commented 5 years ago

@mignolo — You can get less demanding status screen drawing code by switching off all the fancy additions to the status screen and using the single header image instead of the individual nozzle and bed bitmaps.

If you are using 32x micro-stepping on any of your axes, go down to 16x micro-stepping.

The general problem is not that the screen drawing blocks the stepper ISR. It doesn’t interfere with it. However, screen drawing temporarily blocks the G-code queue from feeding the planner fast enough, and when the queue starves we get stuttering movement as the machine stops and starts.

There are various other things you can also try:

That’s all I can think of off the top of my head….

AnHardt commented 5 years ago

The good news is: On the 32-bit boards the time used for 'drawing' the screen becomes less and less important.

The bad news is: Even on the AVRs, the time for 'drawing' the screen is lower than that for 'transferring' the data to the display.

The worst news is: We already send the data to the ST7920 as fast as it can read them. We will never be able to improve this, except by reducing the amount of data, like we try to do with LIGHTWEIGHT_UI for the status screen.

Stars on the horizon: We could try to shift the load for the display updates away from the main process(or) to DMA or external processors (intelligent displays handling the complete UI). U8g2 now seems to be able to transfer only 'dirty' areas of the screen. That could decrease the amount of transferred data in the status screen (not in the menus, where the complete screen content changes with every encoder event, but when editing a value).
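To illustrate the 'dirty' areas idea, here is a minimal sketch that compares the previous and the new frame and lists only the bands that changed. It is not U8g2 or Marlin code. It assumes a 128x64 monochrome framebuffer at 1 bit per pixel (1024 bytes), split into 8 horizontal bands of 8 pixel rows. The band layout and all names are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kWidth = 128;
constexpr std::size_t kHeight = 64;
constexpr std::size_t kBytesPerRow = kWidth / 8;                 // 16 bytes per pixel row
constexpr std::size_t kRowsPerBand = 8;                          // assumed band height
constexpr std::size_t kBands = kHeight / kRowsPerBand;           // 8 bands
constexpr std::size_t kBandBytes = kBytesPerRow * kRowsPerBand;  // 128 bytes per band

using Frame = std::array<std::uint8_t, kBytesPerRow * kHeight>;  // 1024 bytes

// Returns the indices of the bands whose content differs between two frames.
// Only these bands would need to be sent to the display.
std::vector<std::size_t> dirty_bands(const Frame& previous, const Frame& next) {
    std::vector<std::size_t> dirty;
    for (std::size_t band = 0; band < kBands; ++band) {
        const std::size_t begin = band * kBandBytes;
        const bool same = std::equal(previous.begin() + begin,
                                     previous.begin() + begin + kBandBytes,
                                     next.begin() + begin);
        if (!same) dirty.push_back(band);
    }
    return dirty;
}
```

If only one coordinate readout changes, a single band of 128 bytes would be transferred instead of the full 1024 bytes. Whether the display controller accepts such partial writes efficiently is a separate question that this sketch does not answer.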

thinkyhead commented 5 years ago

the time for … 'transferring' the data to the display

Yes, this is not something we can do much about with the current u8g library, since each SPI bit requires a small delay, and the transfer is bound to the main loop. If it were possible to rewrite all of Marlin's SPI code and all the external libraries, then we could perhaps do all the SPI transfers in a coordinated interrupt. This would allow us to get back to managing the queues more quickly. Someday, perhaps…

Roxy-3D commented 5 years ago

Also... As a brute force fix to give the main Marlin code more processing cycles... You can change

#define LCD_UPDATE_INTERVAL 100

to 250 ms. The LCD encoder wheel will feel a tiny bit 'sluggish'.
But the main Marlin loop (with the important code executing!!!!) will have way more time to process stuff.

thinkyhead commented 5 years ago

That's definitely going on the Troubleshooting > LCD / Controller section of the site. The amount of cycles freed up is going to depend on what's happening in the LCD code. There are some general housekeeping tasks and I'm sure the time adds up when doing them 100 times a second.

AnHardt commented 5 years ago

100 times a second

Of course. But it's every 100ms (100 * s/1000), so 10 times a second.
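The arithmetic behind this correction, written out as a small compile-time check. This assumes, as stated above, that LCD_UPDATE_INTERVAL is given in milliseconds. The helper name is made up for this example.

```cpp
#include <cstdint>

// Hypothetical helper: number of LCD updates per second for a given
// update interval in milliseconds (interval_ms must be non-zero).
constexpr std::uint32_t lcd_updates_per_second(std::uint32_t interval_ms) {
    return 1000u / interval_ms;
}

static_assert(lcd_updates_per_second(100) == 10, "100 ms interval: 10 updates per second");
static_assert(lcd_updates_per_second(250) == 4, "250 ms interval: 4 updates per second");
```

So raising the interval from 100 to 250 takes the update rate from 10 to 4 per second.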

tpruvot commented 5 years ago

see https://github.com/MarlinFirmware/Marlin/pull/14595/files#diff-54ada0858fc4ed0c0dfc06aa12378c1dR834

It's doable to improve the menu refresh while scrolling... and leave the status screen as is... while printing

thinkyhead commented 5 years ago

At one point @AnHardt worked out a very good throttling mechanism that would prevent the screen from updating if it looked like the planner could starve. As far as I know that is still in place. So I wonder if it might just need to be more aggressive.

AnHardt commented 5 years ago

It's still in place. https://github.com/MarlinFirmware/Marlin/blob/bugfix-2.0.x/Marlin/src/lcd/ultralcd.cpp#L948-L951

A bit of debug output in line 950, printing bbr2 and max_display_update_time, should give a hint where the problem is - if there is one.
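For readers who do not want to open the linked source, the throttling rule discussed in this thread can be sketched roughly as follows. This is a simplified illustration, not a copy of the linked lines. It assumes that bbr2 is half of the planner's buffered runtime (a guess based on the name) and that both values are in the same time unit. The function name and parameters are hypothetical. The empty-buffer exception is the one described earlier in this thread.

```cpp
#include <cstdint>

// Hypothetical sketch of the throttle decision: draw the screen only if the
// planner has enough buffered motion to cover the longest display update seen
// so far, or if the buffer is empty (nothing to starve).
bool display_update_allowed(std::uint16_t buffer_runtime,
                            std::uint16_t max_display_update_time) {
    const std::uint16_t bbr2 = buffer_runtime >> 1;  // assumed: use only half as safety margin
    return bbr2 == 0 || bbr2 > max_display_update_time;
}
```

Under these assumptions, a buffer that is nearly always empty lets every display update through, which is the situation the tiny-segment theory earlier in this thread describes.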

Kaween-prog commented 5 years ago

Just wanted to add this observation : When using the Fysetc 12864 mini display routines in Marlin (all builds I tested so far, no exceptions) and a Fysetc 2.1 12864 RGB display, the issue does NOT exist and movement is 100% smooth regardless of the state/menu the LCD is in. Just to confirm : yes, on a Delta (Anycubic Predator with an SKR 1.3 and Marlin 2.0) This makes me wonder :

So whatever is affecting the code routines in the full graphic smart controller and the discount variation, it's not affecting the Fysetc 12864's. (tested the 2.1 and 2.0, not sure about the 1.2 .... but as the only difference between those is the backlight color selection system so I wouldn't presume there's much difference there and the 1.2 will probably work the same way).

Maybe this can help a bit to try and trace the issue. As the issue doesn't show up using the Fysetc display code, it would seem logical to presume the problem should be found in the full graph routines ? Would it be possible to use the code for the Fysetcs and kinda backtrack from there, or even use that code as base to redo a test build for the full graph LCD code ?

Also, Bigtreetech has a "hybrid" TFT these days which allows TFT and 12864 emulation (TFT function run over the internal serial port like all other TFT's do, the 12864 emulation runs over the EXP1/EXP2 ports like a "real" 12864 would.) This screen also shows the "stutter" issue when Marlin is configured to use the 12864 reprap full graph routines (and the discount version too ... obviously).

But I find it interesting the Fysetc 12864 mini's do not show the same stutter issue using the same machine, board and Marlin build, and I don't think anyone has mentioned this so far.

Hope this can help people way smarter than me to have a new look at the issue and try to solve it.

AnHardt commented 5 years ago

The Fysetc 12864 mini display uses a completely different display driver chip (ST7567). Probably you have to transfer only half the amount of data and can do that at a much higher speed than with the ST7920 of the RRDFGDC. The ST7920 is really bad! I was a bit shocked when I discovered the new shiny color displays try to emulate exactly that crap.