Closed mignolo4 closed 5 years ago
You may have the display controller configured in such a way that it takes a very long time to update the display, in which case the planner could experience starvation. However, Marlin times the duration of display updates and throttles back the frequency if it detects that planner starvation is imminent. So the display update time would have to be pretty extreme for planner starvation to occur.
It would help to have your configuration files, as requested in the issue template. Please put them into a ZIP file and drop them on your next reply.
Would also be interesting to know if moving slow stutters more than moving fast - much faster.
My suspicion is you are moving slow. Because of segments_per_second the sub moves become small - very small. Much smaller than min_steps_per_segment. So only every now and then does a valid, large enough segment pass the planner and enter the planner buffer. Because the algorithm for delaying the display update needs an exception (it does not delay when the planner buffer is completely empty), and the buffer is almost always empty, the buffer has no chance to fill up because of the time-consuming display updates. Also, the time needed to step such a short segment is too short to get the next valid one planned.
Solution would be to limit the shortness of the sub-moves to about min_steps_per_segment.
https://github.com/AnHardt/Marlin/commit/793a50bc6ff42f5e6ae38921c10478d387cae0fb can limit the size of the subsegments to about the right magnitude. The smallest segments will still be too short. But two of them will always give a usable block, while now it's possible, if slow enough, to need 20, 200, 2000 or more subsegments to make a usable block.
Even when you switch off the display you currently get lots of too-short unconnected moves, but the valid ones will be evenly distributed in time, without the breaks for the display update.
Just a theory!
ADVANCED_OK can give the answer - the state of the planner (and other) buffer.
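The segment-length argument above can be made concrete with a little arithmetic. The following is only a rough sketch, not Marlin code: the function name, the struct, and the example values (80 steps/mm, 200 segments per second, a minimum of 6 steps per segment) are assumptions chosen for illustration, and it treats the move as if distance along the path mapped directly to motor steps, which a real delta does not do exactly.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, NOT part of Marlin: estimates how many motor steps one
// sub-move contains when a move is chopped into a fixed number of segments per
// second, and how many sub-moves must accumulate to reach a usable block.
struct SegmentEstimate {
  double length_mm;        // length of one sub-move
  double steps;            // approximate steps contained in one sub-move
  long segments_per_block; // sub-moves needed to reach min_steps
};

SegmentEstimate estimate_segment(double feedrate_mm_s,
                                 double segments_per_second,
                                 double steps_per_mm,
                                 double min_steps) {
  assert(feedrate_mm_s > 0 && segments_per_second > 0);
  assert(steps_per_mm > 0 && min_steps > 0);
  const double length_mm = feedrate_mm_s / segments_per_second;
  const double steps = length_mm * steps_per_mm;
  // A sub-move that already holds enough steps is usable on its own.
  const long needed = steps >= min_steps
                          ? 1
                          : static_cast<long>(std::ceil(min_steps / steps));
  return {length_mm, steps, needed};
}
```

With these assumed numbers, a slow 2 mm/s move gives sub-moves of 0.01 mm (about 0.8 steps), so several must accumulate before one usable block exists, while a 100 mm/s move gives 0.5 mm sub-moves (40 steps), each usable by itself.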
edited: added picture + video, look at the bottom!
Hi, first of all, thank you for your replies.
@thinkyhead: I'm sorry Scott but I think there's not so much to see in my configuration files because, as I said, they are almost standard! I downloaded the latest release and changed 4 parameters for a quick test on my delta (board, endstops, thermistors, and obviously the lcd); that's all.
What's not standard is the way I make the lcd work.
Because if I only enable the graphical lcd
#define REPRAP_DISCOUNT_FULL_GRAPHIC_SMART_CONTROLLER
everything works as expected (but I can't see anything on the display)
Then, when I change this in the ultralcd_DOGM.h file:
#elif ENABLED(U8GLIB_ST7920)
// RepRap Discount Full Graphics Smart Controller
#if DISABLED(SDSUPPORT) && (LCD_PINS_D4 == SCK_PIN) && (LCD_PINS_ENABLE == MOSI_PIN)
#define U8G_CLASS U8GLIB_ST7920_128X64_4X_HAL
#define U8G_PARAM LCD_PINS_RS // 2 stripes, HW SPI (shared with SD card, on AVR does not use standard LCD adapter)
#else
#define U8G_CLASS U8GLIB_ST7920_128X64_4X
#define U8G_PARAM LCD_PINS_D4, LCD_PINS_ENABLE, LCD_PINS_RS // Original u8glib device. 2 stripes, SW SPI
//#define U8G_CLASS U8GLIB_ST7920_128X64_RRD
//#define U8G_PARAM LCD_PINS_D4, LCD_PINS_ENABLE, LCD_PINS_RS
the lcd starts to work but the movements start to stutter!
@AnHardt I understood 20% of what you said :-) because I'm not a programmer or similar but only a passionate guy following this amazing project, sorry! Anyway, I had a glance at your commit but I noticed the code has changed since that time and I was not able to insert those lines in the latest versions. Concerning the speed, I don't think I'm moving slow because the stutter becomes worse and more visible if I increase the speed (if this was what you meant).
I am available for any test or solution you suggest me.
Here the "quality" of the print with the stutter:
... and a video showing THE stutter :-) https://youtu.be/SYVsJR98xgk
@thinkyhead
A little big step in the right direction (I suppose):
I tried more or less 30 different versions going back in the past and Marlin-604b804125571782a11ff819b29a062b23879ba0 (27sept2017) doesn't show the issue!
I'm going to find out exactly when the problem was introduced! Do you think it's a good idea or am I wasting my time?
@mignolo4: Usually if you can find the commit in which something started going bad, it's a great first step. However, over a year ago.... that's a lot of commits to look through!
It's definitely something that happened from 27th Sept to 7th Oct 2017! Going back to check, uhm, not really, going to bed now! :-)
Interesting. If you can narrow it down to a specific day, even better!
@thinkyhead
I think I found it! The last version working well with my configuration is commit: https://github.com/MarlinFirmware/Marlin/commit/604b804125571782a11ff819b29a062b23879ba0 If I understand correctly how commits work, the one that came immediately after was https://github.com/MarlinFirmware/Marlin/commit/88f9194168c120093cc271689be0578307c1edca
... and that one last commit introduced the stutter!
Now it's up to you Scott to spread the magic!
I’m glad to hear you found a point in time that works. The referenced commit would not have had any effect, so we’ll have to look at the general timeframe.
Does it help if I check one by one all of the commits of that day to see if some of them work well? So you can focus on the others?
EDIT: checked some more commits of the same day and only https://github.com/MarlinFirmware/Marlin/commit/c869dc97452ed78c6fcf4f877a40afc8eaa49c45 is SMOOTH.
To check my theory I went back and compiled some versions from 26th and 24th Sept, expecting smooth movements: surprisingly enough I found out that those versions are NOT smooth either... so now I'm confused. I think there's not much more I can do with my knowledge.
After a lot of time I'm here again to add some information. I had the possibility to try another LCD, non graphical this time (RRD Smart Display aka LCD2004) with the same configuration and it works flawlessly without any problem. Any news on that issue?
@mignolo4
This Issue Queue is for Marlin bug reports and development-related issues, and we prefer not to handle user-support questions here. (As noted on this page.) For best results getting help with configuration and troubleshooting, please use the following resources:
After seeking help from the community, if the consensus points to a bug in Marlin, then you should post a bug report.
Before posting a bug report please test with bugfix-2.0.x to check if the problem is gone
@boelle sorry, I know I'm not expert at all but honestly I thought that was (is) a BUG and I was trying to be helpful in some way
uppps... i just saw the question label
but is the bug still there in latest bugfix 2.0?
:-) really don't know, last time I tried was one month ago; I'll try again soon
maybe close this one and open a new one when you have tested?
Just tested with no good news :-( Do you think it's better to start a new issue for a better visibility?
nope, we just keep this one open
I bought a new SKR v1.3 board (LPC1768 32bit board) and the stutter is still there (again: with a delta config). So now it's quite obvious that's a display issue.
edit: I can give you guys a new information I've never noticed before: The stutter is there only in the main screen! It seems that when the screen updates with new coordinates of the axes the stutter appears. Everything is buttery-smooth if I enter the menu!
Here is a video with the slow-motion function because it was the only way to see it in a video :-) Pay attention to what happens in the lower-left corner in the left motor https://youtu.be/SgVZHnPevCo
@mignolo4 Are you still using a modification to get the display to work? If so did you try it with just the standard configuration? If you are still using that modified config, please post the details of the display you are using. Please don't just say it is a FULL_GRAPHIC_SMART_CONTROLLER because we need to know exactly which one it is, there are many different manufacturers of these displays. Please post pictures of the boards and a link to the supplier if possible. Many, many people are using the SKR board with a REPRAP_DISCOUNT_FULL_GRAPHIC_SMART_CONTROLLER with no problems (I have two sitting here in front of me working fine), so we need to identify why yours is different.
@gloomyandy no modification with the skr1.3 (only the display define) since it has everything you need onboard for the lcd, so exp1 and exp2 connected directly to the lcd and even working with 2 (two) meters flat cable :-)
Before I write the info about the lcd board... are you using a delta configuration? Because only with that one the issue occurs!
edit: pictures added
So just to be clear your display works fine with the standard configuration with the SKR board and a recent build of Marlin? You are not making any changes to the U8GLIB settings (as you reported you had made before). The only display related change you have made is to define the display type? What have you got it set to?
When did you download the version of Marlin you are using on the SKR board? I assume you are using standard Marlin not the version from Bigtreetech?
You should probably upload your current configuration files for this new board.
To be clear: a week ago I compiled the very last bugfix version of marlin (not the bigtreetech fork), I chose the delta config from the examples, I defined the "#define REPRAP_DISCOUNT_FULL_GRAPHIC_SMART_CONTROLLER" and the other settings to use the skr board. No other modified setting or files ;-)
But... as I stated before, it's not a SKR problem, my previous board was an Arduino2 !
I'm not suggesting that your problem is SKR related. But your previous report contained changes to files that may have been considered to contribute to the problem. If you are no longer making those changes then the issue will hopefully be clearer.
@gloomyandy Well, seems this is really looking similar to what we experience on our Alfawise and STM32 ( see #12403 ) !! i will go into the menu as written above and check if we get also the effect, as we get it in the main screen.
@hobiseven Yes I was thinking that! The fact that when in a menu there is no stutter is probably not surprising, I suspect that there is no screen update in that case (or at least the update is different). When in the home screen the display is being updated to show, x,y,z temperatures etc.
@gloomyandy We are now starting to replace parts of the code / LCD updates with DMA accesses. I will measure tonight the possible speed improvements. I now have a really simple logic analyser trigger, allowing me to detect planner starvation: no active clocks on X and Y for more than 2ms... We will try different DMA combinations. Our LCD is a 320x240, and to make a 2x zoom, we have to duplicate data, which also takes time. We will keep you updated on the progress. One thing is clear: the committed MKS Robin code in Marlin 2.0.X does not work properly as is for 150mm/s prints.
@gloomyandy Well, I have some bad news... We have cleaned up and sped up our display code as much as we could using DMA running in parallel to the CPU, and we have pretty much the same result. As I now have a very simple trigger condition on the logic analyser, I have a very robust test that gives me reliable results: no clocks on X+Y for more than 15ms > Trig. I have a test gcode which is a large flat washer, to avoid Z movements and have only circular motions. With that gcode, with all improvements or no improvements at all on the LCD (DMA or no DMA), we have about 60 "jumps", or clock holes, in a few minutes. Same as before, 16.25ms long.
We then removed all the display code and touchscreen code, we removed the second serial port, as well as made sure to remove all compiler debug flags > We still get 5/6 clock holes per gcode print. Please note that sometimes it is less. The printer is controlled via octopi.
Do you know any of the people using Marlin 2.0.X who could check with a logic analyser whether they have similar jumps on STM32 or other CPUs? Our board is an STM32F103VE @ 72 MHz.
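For anyone who wants to repeat this check without configuring a hardware trigger, the same condition (no step clocks for longer than a threshold) can be applied offline to timestamps exported from a logic analyser. This is a minimal sketch under assumptions: the function and struct names are made up here, the input is assumed to be a sorted list of step-pulse edge times in microseconds for X and Y combined, and the export format of any particular analyser is not covered.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical offline check, not part of Marlin or any analyser software.
struct ClockHole {
  std::uint64_t start_us;  // time of the last step edge before the hole
  std::uint64_t length_us; // duration without any step edge
};

// Returns every interval between consecutive step edges that is longer than
// threshold_us. edge_times_us must be sorted in ascending order.
std::vector<ClockHole> find_clock_holes(
    const std::vector<std::uint64_t>& edge_times_us,
    std::uint64_t threshold_us) {
  std::vector<ClockHole> holes;
  for (std::size_t i = 1; i < edge_times_us.size(); ++i) {
    assert(edge_times_us[i] >= edge_times_us[i - 1]); // input must be sorted
    const std::uint64_t gap = edge_times_us[i] - edge_times_us[i - 1];
    if (gap > threshold_us) holes.push_back({edge_times_us[i - 1], gap});
  }
  return holes;
}
```

Calling it with a threshold of 15000 reproduces the 15 ms trigger described above; counting the returned entries gives the number of clock holes per print.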
@mignolo4 This is a test you can try. @pinches You might find the information above interesting. You might want to replicate the logic analyser test.
I work with @tpruvot who really made a very clean LCD code, trying to optimize / remove un-needed code. We would need some sort of code profiling tool I think, to find out what is going on.
Unfortunately all my boards are LPC176x based so I probably can't help. What I'm trying to understand here is what is it that is stopping the main stepper interrupt from running. That should be the highest priority of all (or at least very close to), so something must either be running at a higher priority or must have disabled interrupts. You could perhaps add code into the LCD refresh routine to see if it is ever called with interrupts turned off, I don't think it should be. We may then be able to track down what is going on.
Hmmm, well, what is strange is that even with all the LCD code removed, we still see some of those clocks stopped. What I am thinking is that the ISR might still run, but there is no step command passed to the ISR. We are getting back to placing a gpio pin in the ISR routine... I will do that. One question: I was looking for a high level architecture document of Marlin code. Does this exist anywhere? Your boards are LPC176x.. Well, it would be quite interesting for you to hook a LA on that, and check if you have any of those strange "holes". You actually do not even need an LA, you can feel the machine vibration by hand. As soon as those clocks are missing, of course the machine vibrates a bit more. Ok, I will place a gpio for the ISR back on the Zmax pin. Let's see.
It may not be the LCD code that is the problem. It could be that some other part of the code is turning off interrupts but then calling code that eventually updates the LCD (but with the interrupts off), it may be that most of the time the interrupts are only disabled for a very short time (so no problem), but from time to time the code path is such that the LCD update gets called which takes a long time with the interrupts disabled (which is a problem).
Well, we will try to debug this! Let’s see where we end up with gpio
@gloomyandy Regarding this thread, it is closed for me, as we have proved that the gaps in the stepper clocks are NOT caused by the LCD code. They are simply aggravated by it. It is either a lack of CPU time, which I doubt, or an overall scheduling issue in the code. As I said, we will continue debugging, and look at why the step pulses disappear, but this is not directly an LCD code issue. But the bug remains.
Thank you for your help.
@thinkyhead Sorry but I do not understand the reason why this issue has been closed...
Yesterday I tried the most recent bugfix version but the problem is still there. I see other people on facebook and on some forum having this issue too.
EDIT - some new clue:
I had problems with bad stuttering too. Not sure if it was the root cause, but when I disabled ADAPTIVE_STEP_SMOOTHING in Configuration_adv.h the stutter went away; at least as far as I could tell. Prints well now.
It's kind of weird because before when using a MKS Gen L board I didn't have the same problems with ADAPTIVE_STEP_SMOOTHING
Already tried that and it's not my case :-(
@hobiseven regarding high level code architecture. I made a very rough code overview for my own purposes a year ago. It's probably not what you need, but here is the link anyway: https://github.com/DerAndere1/Marlin/tree/Marlin2ForPipetBot/docs
@mignolo — You can get less demanding status screen drawing code by switching off all the fancy additions to the status screen and using the single header image instead of the individual nozzle and bed bitmaps.
If you are using 32x micro-stepping on any of your axes, go down to 16x micro-stepping.
The general problem is not that the screen drawing blocks the stepper ISR. It doesn’t interfere with it. However, screen drawing temporarily blocks the G-code queue from feeding the planner fast enough, and when the queue starves we get stuttering movement as the machine stops and starts.
There are various other things you can also try:
That’s all I can think of off the top of my head….
The good news is: On the 32-bit boards the time used for 'drawing' the screen becomes less and less important.
The bad news is: Even on the AVRs, the time for 'drawing' the screen is lower than that for 'transferring' the data to the display.
The worst news is: We already send the data to the ST7920 as fast as it can read them. We will never be able to improve this. Except by reducing the amount of data, like we try to do with LIGHTWEIGHT_UI for the status screen.
Stars at the horizon: We could try to shift away the load for the display updates, away from the main process(or) to DMA or external processors (intelligent displays handling the complete UI).
U8g2 now seems to be able to transfer only 'dirty' areas of the screen. That could decrease the amount of transferred data in the status screen (not in the menus, where the complete screen content changes with every encoder event, but when editing a value).
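To give a feeling for how much a partial update could save, here is a rough byte count. It is a sketch under assumptions: it only counts raw pixel payload for a monochrome 128x64 display at 1 bit per pixel, measures the dirty area in 8x8 pixel tiles, and ignores any protocol overhead the ST7920 adds on the wire. The function names are invented for this example and are not U8g2 or Marlin API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for estimating transfer size (raw pixel data only).

// Bytes for a full monochrome frame at 1 bit per pixel.
std::uint32_t frame_bytes(std::uint32_t width_px, std::uint32_t height_px) {
  assert(width_px % 8 == 0); // whole bytes per row
  return width_px * height_px / 8;
}

// Bytes for a dirty rectangle measured in 8x8 pixel tiles (8 bytes per tile).
std::uint32_t dirty_bytes(std::uint32_t tiles_wide, std::uint32_t tiles_high) {
  return tiles_wide * tiles_high * 8;
}
```

Under these assumptions a full 128x64 frame is 1024 bytes, while a single row of tiles across the screen (for example a line showing coordinates) is 128 bytes, so a status screen that only changes one such row would move roughly an eighth of the data.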
the time for … 'transferring' the data to the display
Yes, this is not something we can do much about with the current u8g library, since each SPI bit requires a small delay, and the transfer is bound to the main loop. If it were possible to rewrite all of Marlin's SPI code and all the external libraries, then we could perhaps do all the SPI transfers in a coordinated interrupt. This would allow us to get back to managing the queues more quickly. Someday, perhaps…
Also... As a brute force fix to give the main Marlin code more processing cycles... You can change
#define LCD_UPDATE_INTERVAL 100
to 250 ms. The LCD encoder wheel will feel a tiny bit 'sluggish'.
But the main Marlin loop (with the important code executing!!!!) will have way more time to process stuff.
That's definitely going on the Troubleshooting > LCD / Controller section of the site. The amount of cycles freed up is going to depend on what's happening in the LCD code. There are some general housekeeping tasks and I'm sure the time adds up when doing them 100 times a second.
100 times a second
Of course. But it's every 100ms (100 * s/1000), so 10 times a second.
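The unit mix-up is easy to make, so here is the conversion spelled out. This is just illustrative arithmetic: the helper name is made up, and only the 100 and 250 values come from the discussion above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts an update interval in milliseconds into the
// number of whole updates that fit into one second.
std::uint32_t updates_per_second(std::uint32_t interval_ms) {
  assert(interval_ms > 0);
  return 1000 / interval_ms;
}
```

An interval of 100 ms gives 10 updates per second, and raising it to 250 ms gives 4, so the change cuts the number of LCD update calls by more than half.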
see https://github.com/MarlinFirmware/Marlin/pull/14595/files#diff-54ada0858fc4ed0c0dfc06aa12378c1dR834
It's doable to improve the menu refresh while scrolling... and leave the status screen as is... while printing
At one point @AnHardt worked out a very good throttling mechanism that would prevent the screen from updating if it looked like the planner could starve. As far as I know that is still in place. So I wonder if it might just need to be more aggressive.
It's still in place. https://github.com/MarlinFirmware/Marlin/blob/bugfix-2.0.x/Marlin/src/lcd/ultralcd.cpp#L948-L951
A bit of debug output in line 950, printing bbr2 and max_display_update_time, should give a hint where the problem is - if there is one.
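For readers who do not want to open the source, the idea behind that check can be modelled in a few lines. This is a simplified sketch and not a copy of Marlin's code: it assumes bbr2 is half of the planner's remaining buffer runtime, that both values are in milliseconds, and that drawing is allowed either when the buffer is completely empty (the exception mentioned earlier in the thread) or when enough buffered motion remains to cover the slowest display update seen so far. The function names are invented for this example.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of the display throttle (assumptions as described above).
bool display_update_allowed(std::uint32_t buffer_runtime_ms,
                            std::uint32_t max_display_update_time_ms) {
  const std::uint32_t bbr2 = buffer_runtime_ms / 2; // half the buffered runtime
  // bbr2 == 0: nothing (or almost nothing) is queued, so drawing is not delayed.
  // bbr2 > max: the buffered moves outlast the longest display update seen.
  return bbr2 == 0 || bbr2 > max_display_update_time_ms;
}

// Small usage example with made-up numbers.
void throttle_examples() {
  assert(display_update_allowed(0, 30));    // empty buffer: draw
  assert(!display_update_allowed(40, 30));  // 20 ms margin < 30 ms: hold off
  assert(display_update_allowed(100, 30));  // 50 ms margin > 30 ms: draw
}
```

In this model, a buffer that keeps falling back to empty never triggers the hold-off branch, which matches the theory that short delta segments can defeat the throttle.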
Just wanted to add this observation : When using the Fysetc 12864 mini display routines in Marlin (all builds I tested so far, no exceptions) and a Fysetc 2.1 12864 RGB display, the issue does NOT exist and movement is 100% smooth regardless of the state/menu the LCD is in. Just to confirm : yes, on a Delta (Anycubic Predator with an SKR 1.3 and Marlin 2.0) This makes me wonder :
So whatever is affecting the code routines in the full graphic smart controller and the discount variation, it's not affecting the Fysetc 12864's. (tested the 2.1 and 2.0, not sure about the 1.2 .... but as the only difference between those is the backlight color selection system so I wouldn't presume there's much difference there and the 1.2 will probably work the same way).
Maybe this can help a bit to try and trace the issue. As the issue doesn't show up using the Fysetc display code, it would seem logical to presume the problem should be found in the full graph routines ? Would it be possible to use the code for the Fysetcs and kinda backtrack from there, or even use that code as base to redo a test build for the full graph LCD code ?
Also, Bigtreetech has a "hybrid" TFT these days which allows TFT and 12864 emulation (TFT function run over the internal serial port like all other TFT's do, the 12864 emulation runs over the EXP1/EXP2 ports like a "real" 12864 would.) This screen also shows the "stutter" issue when Marlin is configured to use the 12864 reprap full graph routines (and the discount version too ... obviously).
But I find it interesting the Fysetc 12864 mini's do not show the same stutter issue using the same machine, board and Marlin build, and I don't think anyone has mentioned this so far.
Hope this can help people way smarter than me to have a new look at the issue and try to solve it.
The Fysetc 12864 mini display uses a completely different display driver chip (ST7567). Probably you have to transfer only half the amount of data and can do that at a much higher speed than with the ST7920 of the RRDFGDC. The ST7920 is really bad! I was a bit shocked when I discovered the new shiny color displays try to emulate exactly that crap.
Hi, after I found the way to make the LCD work (see issue #12294) I'm experiencing some stutters when moving x-y axes on my self-made Delta-style 3D printer (RADDS + FULL_GRAPHIC_SMART_CONTROLLER).
If I send a simple movement like
G1 X-140
I can clearly see that the movement is not smooth. edit: video added - you can find it after the picture I posted below
I tried to narrow down the problem and I found that the movement is smooth if:
I change DELTA_SEGMENTS_PER_SECOND to something lower than 40
I deactivate the LCD (the movement are smooth even with DELTA_SEGMENTS_PER_SECOND 300)
the movement is only in the Z direction
Already tried:
to restart from a fresh installation with default values and modifying only 4 essential parameters: board, endstops, thermistors, and obviously the lcd
old releases of bugfix 2.0 (I stopped trying with the 30 september release)
https://github.com/MarlinFirmware/Marlin/issues/6703
https://github.com/MarlinFirmware/Marlin/issues/7677
No problem at all with other firmwares! Am I doing something wrong?