TinkerGnome / Ultimaker2Marlin


Ultimaker 2 w/ FW V19.03.1 #104

Open TECNOTOUR4343 opened 4 years ago

TECNOTOUR4343 commented 4 years ago

After testing this version 19.03.01 on three Ultimaker 2+ printers, all of them show the same problem. They are printing the same part, which takes two and a half hours to complete, and at random they stop, showing on the screen:

"ERROR - STOPPED" "Go to: ultimaker.com/suppor"

so I had to reinstall the original firmware because it is more stable. It is a shame because I really liked this version; it had more options: screen and light timeout, scrolling names on screen, the preheat option, information on screen while printing. If this could be fixed I would be very glad, and if somebody also has information on adding a filament runout sensor, that would be fantastic. Thanks to the community for your work.

gr5 commented 4 years ago

I have 19.03.1 on my most used printer. I have tinker version of 18.X and 17.X on 2 other printers that I use a lot. I haven't had this problem.

HOWEVER

What you report is not unique. I recommend taking the printer out of geek mode, or putting it into TUNE mode early on; that seems to avoid the "error - stopped" issue. For me, I leave it in geek mode and it works fine.

My theory is that there is just a little bit too much processing going on for the screen display. If you can supply more data that would be helpful (e.g. maybe try it on just one printer initially).

TECNOTOUR4343 commented 4 years ago

Hi gr5, right now I cannot interrupt the work the printers are doing, and the original version is stable. Maybe other Ultimaker 2 owners can give their opinion. Here I attach an image of a runout sensor that I added to my printer yesterday. (It is working day and night making covid masks)

[Image: 20200408_221602_copy_1164x582]

TECNOTOUR4343 commented 3 years ago

Hello again gr5, I've reinstalled the Tinker firmware version 19.03.1 on two printers, an Ultimaker 2+ and an Ultimaker 2 Extended+. As you suggested, I am taking the printers' menu out of geek mode to lower the processor load. After running both printers for several hours I haven't seen the "error - stopped" issue again. I will continue to use the Tinker version because it is really good, and if I have more issues with it I will let you know.

ComNav-Eli commented 3 years ago

While I realize this isn't the most timely of replies, I'll throw in my 2c, as suggested by TECNOTOUR4343 last year on April 9th.

I've got a UM2+ here at work, and I've been running 19.03.1 since its release: 1630:58 hours powered on, 1309:18 printing and 2168m of filament according to the counters.

I've never turned off geek mode, and I've only ever encountered this error once.

zviratko commented 1 year ago

Another not-very-timely comment - I had this error several times, usually on longer (24h+) prints near the end (which really sucks). It is a very complicated model which I have not simplified. I wonder if too many small steps/commands in combination with geek mode is the issue here: the printer simply can't process the data fast enough, this triggers some sort of watchdog timer, and it crashes...

gr5 commented 1 year ago

I don't think that's it. There is quite a bit of cpu intensive processing but if that runs slow the printer just comes to a stop, usually very briefly. The planner can plan up to 16 moves in advance and they are always set so that it goes as fast as possible (tries to reach the goal speed) but always has enough moves in the queue so that it can slow down and stop without jerking hard enough to skip steps.

This intensive routine runs all the time but has a lower priority and if the queue gets short it actually has less processing to do anyway and will just print slow if there are lots of short (say < 2mm) moves.

The interrupt service routine - the timer one - however is required to finish before the next timer interrupt. If not, I think it goes bad fast, fills the stack with return addresses and crashes in unexpected ways. This runs, I believe, 10,000 times per second. One of my theories is that sometimes this takes too long. I don't know. I'm not happy with this theory.
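To put a rough number on that timing constraint, here is a small sketch of the cycle budget per interrupt. The 16 MHz clock is an assumption about the board, and the 10,000 Hz rate is only the figure quoted above, not something read from the firmware source; `cycles_per_tick` and `isr_fits` are hypothetical helper names.

```cpp
#include <cstdint>

// Assumed values, not taken from the firmware source: a 16 MHz AVR clock and
// the roughly 10,000 interrupts per second mentioned in the thread.
constexpr std::uint32_t kCpuHz = 16000000UL;
constexpr std::uint32_t kIsrRateHz = 10000UL;

// CPU cycles available between two consecutive timer interrupts.
constexpr std::uint32_t cycles_per_tick(std::uint32_t cpu_hz, std::uint32_t isr_hz) {
    return cpu_hz / isr_hz;
}

// True when one run of the handler (measured in cycles) finishes before the
// next interrupt is due.
constexpr bool isr_fits(std::uint32_t isr_cycles, std::uint32_t cpu_hz, std::uint32_t isr_hz) {
    return isr_cycles < cycles_per_tick(cpu_hz, isr_hz);
}
```

Under those assumptions the handler has about 1600 cycles (100 microseconds) per tick, and everything else in the firmware only gets whatever is left over.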

But I don't think short moves will crash anything - more likely the printer will just slow down when you have more than say 16 moves within say 4mm.
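The slow-down can be illustrated with the usual constant-deceleration relation v^2 = 2 * a * d: the shorter the distance covered by the queued moves, the lower the speed from which the head can still stop cleanly. This is a simplified sketch, not the firmware's planner; the function names are hypothetical and any acceleration value passed in is a placeholder rather than a setting read from the printer.

```cpp
#include <cmath>

// Highest speed (mm/s) from which the head can still decelerate to a stop
// within `queued_mm` of planned travel, at constant deceleration `accel`
// (mm/s^2). Derived from v^2 = 2 * a * d.
double max_safe_speed(double queued_mm, double accel) {
    if (queued_mm <= 0.0 || accel <= 0.0) return 0.0;
    return std::sqrt(2.0 * accel * queued_mm);
}

// Distance covered by a queue of `moves` equally long moves.
double queued_distance(int moves, double move_len_mm) {
    return moves * move_len_mm;
}
```

For example, with a placeholder acceleration of 3000 mm/s^2, `max_safe_speed(queued_distance(16, 0.05), 3000.0)` is lower than `max_safe_speed(queued_distance(16, 0.25), 3000.0)`: sixteen very short moves give the planner less room to brake, so it has to run slower.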

zviratko commented 1 year ago

I can send you the latest gcode that I ran (and which crashed today) and the approximate location where it crashed if that helps. (Can we print/dump the line number or at least Z height somewhere when this happens?) I have to wait until my current print finishes though. I just tried reslicing the same model and I don't think >16 commands per 4mm is unrealistic with it at all.

My guess was too many commands because this print also had lots of errors that happen when the printer is overwhelmed and pauses (blobs and overextrusions on the outside perimeter), not sure if that happens to UM2+. It was one of many prints with mostly the same settings and while all of them had some errors, this one had even more (and it was also the largest job). I also tinkered with it numerous times (temperature, acceleration) and it also uses gyroid infill which I think can cause this with high enough resolution on Prusa printers etc...

Is the "ERROR - STOPPED" error caused by this timer or what does actually trigger it?

Btw I wanted to try the "recover print" option for the first time, and it took minutes to get to just 1mm (and I wanted to recover at something like 200mm). Is it replaying in realtime or is it even supposed to work?

gr5 commented 1 year ago

Yeah it slowly goes through the whole file in order until it finds the right layer (z value). It can take 5 minutes or I suppose quite a bit longer but it is MUCH faster than printing. At least in my experience.
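A minimal desktop sketch of that kind of scan, assuming it simply reads the file from the top and stops at the first line whose Z word reaches the target height. `parse_z` and `find_resume_line` are hypothetical names, and the real firmware reads from the SD card rather than a `std::istream`.

```cpp
#include <cstddef>
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>

// Extract the Z value from a G-code line such as "G1 X10 Y5 Z0.30 F3000".
// Returns false when the line has no Z word. Text after ';' is ignored.
bool parse_z(const std::string& line, double& z_out) {
    const std::string code = line.substr(0, line.find(';'));
    const std::size_t pos = code.find('Z');
    if (pos == std::string::npos) return false;
    const char* start = code.c_str() + pos + 1;
    char* end = nullptr;
    const double z = std::strtod(start, &end);
    if (end == start) return false;  // 'Z' not followed by a number
    z_out = z;
    return true;
}

// Walk the stream from the top and return the 0-based index of the first
// line whose Z reaches target_z, or -1 if no such line exists.
long find_resume_line(std::istream& gcode, double target_z) {
    std::string line;
    long index = 0;
    while (std::getline(gcode, line)) {
        double z = 0.0;
        if (parse_z(line, z) && z >= target_z) return index;
        ++index;
    }
    return -1;
}
```

Because every line before the target has to be read and parsed, the time grows with the amount of gcode below the resume height, which fits a very large file taking a long time to recover near the top of the print.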

Cura does remove some of the points when it slices. The parameter I know about is called "maximum deviation". The larger you make that value, the fewer gcodes you end up with. Although using a tool like MeshLab's decimation feature (I'm not certain that is what it is called) will probably do a much nicer job.

Yes the printer can stop a lot and give you blobs if you have too many points/gcodes. But that shouldn't make it crash. I don't know what creates "error stopped". I just grepped through the code and it looks like if there is an error it always tells you the error so I think the only way to get that error is if the variable "Stopped" is set to a random value (and not the allowed values). So I think memory is getting clobbered including this variable and then the code sees an illegal value almost immediately and displays this error.

The only things I see that might possibly clobber this part of memory are some arrays declared a bit earlier: cmdbuffer and cmd_line_buffer. These 2 buffers have well-known sizes and the code that writes to them normally knows the size - I don't understand how that code might get messed up and write beyond the buffer. Unless some index on the stack gets clobbered. So that would be one error cascading to another, which is even harder to debug.
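The suspected mechanism can be modelled on a desktop without actually overflowing anything. In this toy sketch a byte array stands in for a slice of RAM, with a buffer immediately followed by a one-byte flag; the sizes, the layout and all names are invented for illustration and are not the firmware's.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Invented layout: an 8-byte buffer directly followed by a 1-byte flag.
constexpr std::size_t kBufSize = 8;
constexpr std::size_t kStoppedAt = kBufSize;

struct ToyRam {
    std::array<std::uint8_t, kBufSize + 1> bytes{};  // zero-initialised

    // Write with no check against kBufSize, like code whose index has been
    // corrupted. It is still bounded by the toy RAM so the demo stays safe.
    void unchecked_write(std::size_t index, std::uint8_t value) {
        if (index < bytes.size()) bytes[index] = value;
    }

    std::uint8_t stopped() const { return bytes[kStoppedAt]; }
};

// Write `count` bytes of ASCII 'G' starting at index 0, then report the flag.
std::uint8_t flag_after_writing(std::size_t count) {
    ToyRam ram;
    for (std::size_t i = 0; i < count; ++i) ram.unchecked_write(i, 'G');
    return ram.stopped();
}
```

Writing exactly `kBufSize` bytes leaves the flag at zero; one byte more and the flag holds 0x47 ('G'), a non-zero value that matches none of the expected states.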

zviratko commented 1 year ago

Hmm, I think you underestimate the gcode size I was printing :-D (my guess is 300MB+), and this was like 95% done. I waited for a while, each layer took maybe half a minute so I gave up.

There are some overflows in the code for sure. I power cycle the printer every other day or whenever I do major maintenance, as I have seen it getting confused a few times. For example, if I manually position the print head somewhere near the center, extrude and then try printing, it starts going to the extreme left, ignores the stop and makes a lot of very worrying noise. I then either have to power cycle it or just move the head home manually (or via command). Sometimes the Z offset isn't stored when adjusted during printing (or rather gets reset later, not sure when or why). And there were probably a lot of other issues that I just resolved by power cycling.

The "ERROR - STOPPED" looks like some sort of watchdog or sanity check, and it doesn't throw any code (I don't think there's any space for it) - it's not like the other error codes... that actually contain error codes you can look up :-)

zviratko commented 1 year ago

Ah, I see it in the code now

    if (IsStopped())
    {
        char buffer[24] = {0};
        strcpy_P(buffer, PSTR("ultimaker.com/"));
        lcd_lib_clear();
        lcd_lib_draw_string_centerP(10, PSTR("ERROR - STOPPED"));
        switch(StoppedReason())
        {
        ...
        ...
        ...
            default:
                strcat_P(buffer, PSTR("support"));
        }

So something just crashed... It would be great to get the gcode line in there, or anything else that might make debugging at least possible.

gr5 commented 1 year ago

No, "stopped" is a normal occurrence if a "real" error occurs, like a bad temp sensor reading or a failed homing. Instead, the variable here:

    uint8_t Stopped = false;

is set to a non-zero value but also not any of the standard error values. Which is impossible. I'm pretty sure. So somehow code is writing to this variable that shouldn't. Most likely it is writing to one of those 2 arrays I mentioned and then keeps blasting through memory another dozen or so bytes until it clobbers "Stopped". The code you show above checks that variable (Stopped) hundreds of times per second, so it will quickly notice that it isn't zero and display the error. I think it maybe keeps going though? I don't know if it actually, truly stops, unless the clobbering breaks something else at the same time.

That's the only thing I can think of. One could edit that code above so that, if none of the default reasons apply, it displays more valuable information on the LCD, such as starttime and stoptime, to see if those values look like they got clobbered as well, along with some other key variables that might hint at what the issue is. It could be that if you look at those 3 variables (1 + 4 + 4 or 9 bytes) they might be the ascii characters of gcodes, or they might be some other hint at what the data is that is clobbering things. If my theory is correct.
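As a rough idea of what such a diagnostic could format, here is a desktop C++ sketch that renders a few bytes as hex plus their printable-ASCII view, so stray gcode text would be easy to spot. `hex_ascii` is a hypothetical helper, not an existing firmware function; on the printer the output would have to go through the firmware's own LCD routines and fit the display.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Render bytes as hex followed by their printable-ASCII view, e.g. the three
// bytes 0x47 0x31 0x20 become "47 31 20 |G1 |". Non-printable bytes are
// shown as '.'.
std::string hex_ascii(const std::uint8_t* data, std::size_t len) {
    std::string hex, text;
    char pair[4];  // two hex digits, a space, and the terminating NUL
    for (std::size_t i = 0; i < len; ++i) {
        std::snprintf(pair, sizeof pair, "%02X ", static_cast<unsigned>(data[i]));
        hex += pair;
        text += (data[i] >= 0x20 && data[i] < 0x7F) ? static_cast<char>(data[i]) : '.';
    }
    return hex + "|" + text + "|";
}
```

Passing it the raw bytes of the flag and the two neighbouring timestamps would show at a glance whether they look like plausible values or like fragments of a gcode line.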

gr5 commented 1 year ago

@TinkerGnome - I'd love to hear your thoughts.

gr5 commented 1 year ago

Actually I think it loops that code a mere 20 times per second.

gr5 commented 1 year ago

Wait - does it say something about "support" below where it does the "ERROR - STOPPED" message?

zviratko commented 1 year ago

Yes it does. Sorry if I didn't make that clear. There's no real error code though which is why I think it's the "default" case in that function I pasted.

When I came to the machine it was missing the top few layers and there was a blob of ooze where it stopped. Not sure where the head was, it might have been home but I'm not sure.

gr5 commented 1 year ago

Yeah so my theory still stands. An illegal value ends up in that variable somehow. Which should be impossible. So there is a corruption of memory somehow. And my bet is one of those 2 arrays defined before that variable.