MarlinFirmware / Marlin

Marlin is an optimized firmware for RepRap 3D printers based on the Arduino platform. Many commercial 3D printers come with Marlin installed. Check with your vendor if you need source code for your specific machine.
https://marlinfw.org
GNU General Public License v3.0

1.1.0-RC10 hotbed thermal runaway during G29 #5702

Closed Angel996 closed 7 years ago

Angel996 commented 7 years ago

I've had this start up script for ages. Printer is working fine, both temp sensors are ok.

I flashed 1.1.0-RC10 and now I'm constantly getting a thermal runaway for the bed exactly AFTER G29 is complete. It looks like the thermal runaway routine is not aware of G29 running and thinks it's an emergency. I've set WATCH_BED_TEMP_PERIOD to 300, but it did not help. The hotbed is in bang-bang mode, using a relay.

Here is the error msg:

> Error:Thermal Runaway, system stopped! Heater_ID: bed
> [ERROR] Error:Thermal Runaway, system stopped! Heater_ID: bed
> 
> Error:Printer halted. kill() called!
> [ERROR] Error:Printer halted. kill() called!

Here is the script:

M80; on
M84 ; disable motors
M190 S105 ; set/wait for bed temperature
M104 S180 ; set extruder temperature
M190 S110 ; set/wait bed temperature
M104 S200 ; set extruder temperature and start calibrating right away
G21; metric
G91 ; relative
G1 F5000 Z5; raise Z
G90; absolute
G28
G29; run auto leveling
G1 Z0; // go to actual zero
G92 Z0.3; the bigger the value, the closer to the bed
G1 F200 Z10; raise Z
G1 F6000 X60 Y100; move to almost center
G1 F200 Z5; lower Z
M84; disable motors
M109 S230 ; set/wait extruder temperature (prevents bleeding)

Thanks.

lukeskymuh commented 7 years ago

It is the same as in https://github.com/MarlinFirmware/Marlin/issues/5698

Angel996 commented 7 years ago

I've seen that thread. No, I don't think it's the same, since I am not getting an ERR: MINTEMP BED error message. I am getting a thermal runaway error.

At the moment I don't have an LED on my hotbed. I will install one and update the thread with info on when exactly it turns off.

ghost commented 7 years ago

That's funny....there is no RC10...yet.

lukeskymuh commented 7 years ago

ERR: MINTEMP BED is what the display says. Your error message probably comes from the host software. As a workaround, you might try moving G28 and G29 before the M190.

Angel996 commented 7 years ago

lukeskymuh, it also should report MINTEMP error in Pronterface log, it does not.

Now, here's the weird part. It turns out thermal runaway is not the CAUSE, it's a consequence. I disabled #define THERMAL_PROTECTION_BED. I get absolutely the SAME effect: while G29 is running, the bed turns off. However, it turns back on once the print is started, since the printer can resume now that the thermal runaway error is no longer thrown.

Somehow G29 is shutting the bed down. I will keep experimenting and update the thread once I find out the cause.

manianac commented 7 years ago

You should probably upload your configuration files. I can confirm bed temp holds fine with running a 10x10 grid with G29 on a mega2560.

First guess is a memory overflow condition, since disabling thermal protection still triggers it.

Angel996 commented 7 years ago

Now I'm sure this is the same issue. I have another 3D printer. Basically, the electronic setup is the same, but the endstop polarity and stepper coefficients are different.

With the newer version, RC8, I am not able to print at all. The print is aborted at about 1-3% completion. Three times in a row the printer just stopped. In one case it restarted from the very beginning (on top of the already printed part); the other two times it got into a weird state where it would stop without being killed, but the Pronterface log window would enter an infinite scroll loop and I had to terminate it via Task Manager (Windows 10).

I disabled bed and hot end thermal runaway protection. Now, the print got aborted again with:

Error:Printer halted. kill() called!
[ERROR] Error:Printer halted. kill() called!
Error:MAXTEMP triggered, system stopped! Heater_ID: 0
[ERROR] Error:MAXTEMP triggered, system stopped! Heater_ID: 0

Error:Printer halted. kill() called!
[ERROR] Error:Printer halted. kill() called!

I reset the printer immediately and checked extruder temperature. It was 219 C. MAXTEMP was never reached. HEATER_0_MAXTEMP is set to 250.

I would really like to emphasize the fact that this printer is 100% ok. It's been running fine for ages too. It was running version 1.0.2.

Blue-Marlin commented 7 years ago

If you use any kind of bed levelling - try to use less than 5x5 points to probe, to make memory problems unlikely. If that works you can increase the number of points again - step by step.

Angel996 commented 7 years ago

I am using 3x3 on this printer. And using 2x2 on previous printer which works. I will try right now, thanks.

Angel996 commented 7 years ago

Modified to 2 x 2 bed leveling. Got killed again. This time the reported reason was "kill button", which I never pressed. (I have changed the kill() reason strings in the source code so I can tell the cause of each kill.)

Is the board picking up noise from steppers/heaters? But it never used to! Never had such probs with ver 1.0.2.

ghost commented 7 years ago

How about removing the display and trying again?

Angel996 commented 7 years ago

I have downloaded the RCBugFix version, and now the printer runs fine. No sudden reboots, no "KILL BUTTON" occurrences, no Pronterface freezes. I used the same config file and just had to update some variable names as the compile-time error messages suggested.

Angel996 commented 7 years ago

Ok, I've used the RCBugFix version for about 4 hours now. There is still a problem with temperature monitoring. Once I got a bed MAXTEMP error (although my actual bed is aluminum and can't possibly reach that temperature). Also, sometimes I get "cold extrusion prevented" errors. I had the cold extrusion temperature at 200C and changed it to 170C (the default). Now it's much better, but after 3 hours of printing I am still getting it from time to time. The actual extruder temperature is around 220C.

lukeskymuh commented 7 years ago

I also get MAXTEMP and MINTEMP all the time, especially when calling G29. Even when commenting out: //#define THERMAL_PROTECTION_BED // Enable thermal protection for the heated bed

Angel996 commented 7 years ago

Thermal protection is runaway protection, not min/max temperature check. That's a different feature, evidently.

billyd60 commented 7 years ago

I am using a taz 5 with an e3dv6 extruder and I am having similar issues to Angel996. I am not using any sort of bed leveling but I am using lin_advance and firmware retract. Using rcbugfix.

What happens is my extruder will work fine for a few hours of a print and then it will randomly drop several degrees suddenly during a print and then slowly heat back up to temp. Then after a few more hours of this behavior the printer will simply stop printing without an error code. The display appears normal, the bed and extruder remain heated, but the printer will not accept commands. A cold boot is required to get it to respond again.

It almost seems like the code is losing track of the extruder temperature for a period of time while reporting an old (good) temperature value, then suddenly sees that the actual temperature has dropped way down, reports the correct temp on the display, and takes action to heat the extruder again. I am not a code guy, and I am certain it is not my hardware since this behavior does not occur with prior versions of Marlin. It's almost as though the subroutine controlling the extruder temperature is getting lost or ignored. Eventually it just crashes the printer. Just my theory.

Angel996 commented 7 years ago

Yep, exactly the same here. Also had this once when the printer just stalled w/o the error code with bed and extruder heated, Marlin running (heater leds flashing as usual) but host software not being able to send any more commands to it.

Have you tried enabling/disabling #define ENDSTOP_INTERRUPTS_FEATURE? I think it has some effect on these errors.

billyd60 commented 7 years ago

No but I will give it a try. I wonder why that has a positive effect?

The really weird part of this is it doesn't seem to happen when I print with PLA and have the part fan on. It's only happening in ABS prints with the part fan off. Which is crazy and makes no sense to me at all.

The other strange part is it sounds like it happens to you early. For me it takes hours of printing before this behavior shows up. Have you redone your pid optimization? Mine were changed quite a bit after installing the latest firmware, which surprised me. After I redid the pids I got fewer thermal runaway problems (it almost never happens now), but then the odd extruder issue I described showed up.

ghost commented 7 years ago

Are you guys sure you have the correct thermistor(s) selected in Marlin?

Angel996 commented 7 years ago

Tannoo, well, it's the same type that I used with older versions of Marlin. It's -1. I have makergear hotends on both printers, and they came with thermistors.

billyd60, I haven't done PID optimization with new versions of Marlin, however. I'll try it.

billyd60 commented 7 years ago

I used the same number I've used in previous versions. Did they change the thermistor numbering?

Angel996 commented 7 years ago

Still having the thermal runaway problem for the bed, but only on the printer that uses bang-bang mode for the bed. The other one, which uses PID mode for the bed, runs fine. I have noticed that enabling BED_LIMIT_SWITCHING reduces the probability of this error to some extent.

Both printers now run RCBugFix version dated Jan 14 2017.
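For readers unfamiliar with the two bang-bang variants mentioned above, here is a minimal Python sketch of the general idea, not Marlin's actual code. It assumes that limit switching means the heater only changes state outside a band around the target. The function names, the sample readings, and the 2-degree band are all made up for illustration.

```python
def bang_bang(temps, target, hysteresis=0.0):
    """Return the heater state (True = on) after each temperature sample.

    With hysteresis == 0 the heater simply follows `temp < target`.
    With hysteresis > 0 the state only flips once the reading leaves the
    band [target - hysteresis, target + hysteresis].
    """
    heater_on = False
    states = []
    for temp in temps:
        if temp >= target + hysteresis:
            heater_on = False
        elif temp <= target - hysteresis:
            heater_on = True
        # Inside the band: keep the previous state.
        states.append(heater_on)
    return states


def count_switches(states):
    """Count how many times the heater state changes between samples."""
    return sum(1 for a, b in zip(states, states[1:]) if a != b)


# Hypothetical bed readings jittering around a 110 C target.
readings = [108, 109.5, 110.2, 109.8, 110.1, 109.9, 110.3, 109.7, 112.5, 109.9, 107.5]

plain = count_switches(bang_bang(readings, 110))
banded = count_switches(bang_bang(readings, 110, hysteresis=2))
print(plain, banded)
```

In this toy data the plain variant toggles on almost every sample, while the banded variant toggles only twice. That matters for a relay-driven bed like the one described here, although it does not by itself explain the runaway error.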

Grogyan commented 7 years ago

I suggest checking your thermistors with a multimeter. It is quite possible that they are damaged.

Angel996 commented 7 years ago

Grogyan, do you mean the thermistor goes crazy exactly while G29 is running? :)

Grogyan commented 7 years ago

Yes. It is possible the thermistor is partially damaged, and the vibrations during axis movement cause the internal structure of the thermistor to make intermittent contact or disconnect. Typical glass bead thermistors are notorious for breaking.

billyd60 commented 7 years ago

Did you ever redo the pid optimization? That got rid of my thermal runaway problems. For some reason this newer version changed my pids quite a bit, for the bed and extruder. I didn't think that should happen but it did.

I am going to swap out my extruder thermistor because it must be the problem. Just a coincidence I guess that the problem started up when I switched to the newer firmware.

Where do you get the January version of rcbugfix? Or is that just not public?

Angel996 commented 7 years ago

Grogyan, it would really be a good idea to read through the entire thing before answering. I have made it quite clear the problem came about ONLY while running the G29 command, not during axis movement or printing. Also, disconnection/shorting of the thermistor would result in a BED_MINTEMP/BED_MAXTEMP error, not in a thermal runaway error. Thermal runaway means the actual temperature is NOT changing while it should. Thermal runaway is meant to catch either a failure of the FET running the hotbed or the thermistor falling off and not reading the temperature correctly.

billyd60, I am not using PID mode on that printer. I use bang-bang mode. So, it does not make sense to do PID optimization.
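To make the distinction between the two kinds of error concrete, here is a simplified Python model. It is a sketch under stated assumptions, not Marlin's implementation: the range check looks at a single reading, while the runaway check looks at how long the reading has stayed below the target by more than some hysteresis. The function names, limits, and sample data are hypothetical.

```python
def range_check(temp, min_temp, max_temp):
    """Return 'MINTEMP', 'MAXTEMP', or None for a single reading."""
    if temp <= min_temp:
        return "MINTEMP"
    if temp >= max_temp:
        return "MAXTEMP"
    return None


def runaway_check(samples, target, hysteresis, period_s):
    """Return the time at which a runaway would be flagged, or None.

    `samples` is a list of (time_s, temp) pairs taken after the target was
    first reached. The timer restarts whenever the reading is within
    `hysteresis` of the target (or above it); an error is flagged once the
    reading has stayed below that band for longer than `period_s`.
    """
    last_ok = None
    for time_s, temp in samples:
        if last_ok is None or temp >= target - hysteresis:
            last_ok = time_s
        elif time_s - last_ok > period_s:
            return time_s
    return None


# Hypothetical log: bed held at 110 C, then the heater stays off and it cools.
log = [(0, 110), (5, 109), (10, 107), (15, 105), (20, 103), (25, 101), (30, 99), (35, 97)]

print(runaway_check(log, target=110, hysteresis=2, period_s=20))
print([range_check(temp, min_temp=5, max_temp=150) for _, temp in log])
```

In this model the slowly cooling bed trips the runaway check while every individual reading is still well inside the MINTEMP/MAXTEMP limits. An open or shorted thermistor would instead produce an extreme reading and trip the range check immediately.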

billyd60 commented 7 years ago

Sorry missed that

Grogyan commented 7 years ago

You made the assumption that I have not read the entire thread, whereas I have. Only two things could be at fault:

- Poor PID, extruder PID, as you aren't using PID on bed
- Damaged thermistor

Having already had experience with thermistors breaking, even with intermittent faults, my attention steers towards this in this case. After that, if redoing the extruder PID doesn't fix it, and the thermistor has been replaced on the extruder and bed, there could be a problem with the ISR. That is unlikely, but possible. If no problem is identified in the ISR, then the AVR might be defective.

JohnOCFII commented 7 years ago

> Only two things could be at fault Poor PID, extruder PID, as you aren't using PID on bed Damaged thermistor

I'm having a similar issue with RC8-Bugfix as well. I assumed it was a physical problem, but I decided to roll back to my previous working firmware, which is RC3. No problems when using RC3. I will try to test again and gather more information.

Angel996 commented 7 years ago

Grogyan, I have drawn a conclusion that you have not read the entire thread, because I stated in my first post, that it was BED thermal runaway error, not the hot end. Why are you mentioning extruder PID then?

I have also stated the printer ran perfectly fine with older firmware. In fact, now, after a thermal runaway error if I restart the printer and restart the print job with bed already heated, it runs fine thru the G29 stage and also up to the very end of print without any errors.

It is clearly a software problem. It is related to the latest firmware. It was fine with the previous version. Why are you trying to persuade me otherwise?

Grogyan commented 7 years ago

As stated, and literally putting my 2c in, I have read the whole thread. A 3D printer is a complex electro-mechanical system, and my general experience is with glass bead thermistors failing. I cannot count the literally hundreds of threads I've read about heating problems due to a 50c thermistor on the hot end and/or bed, plus a few due to a defective SSR or FET. I haven't noticed much change in the ISR routine of late, though I might have missed something. I have had similar problems in the past, and currently, due to EMI induced into the circuitry.

If you are certain that it is firmware, then that is a good thing. I'll shut up :-)

Angel996 commented 7 years ago

See, it does not make sense to blame the thermistor if the printer runs fine in other scenarios. The problem only occurs while G29 is running. And ONLY if the bed has to heat all the way from cold. Never had a problem while printing. Never had a problem after a previous print (if bed already hot). Being a programmer myself, I cannot explain this by a thermistor failure.

I have not looked into code (maybe I should), but I think what happens is the thermal runaway routine is not aware time is passing while G29 is running. So it treats these 7-10 secs as a 0 sec interval. That's what I think is happening. Thus, the thermal runaway algorithm malfunctions because its idea is monitoring of temperature change vs time. And time is "warped" in this case.
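The "warped time" idea above can be pictured with a small, purely hypothetical Python sketch. It compares a monitor that estimates elapsed time by counting its own calls with one that reads a clock. Nothing here is taken from Marlin's source; the function names and the call times are invented only to show what such a bug would look like if it existed.

```python
def elapsed_by_ticks(call_times, tick_s):
    """Elapsed time as seen by a monitor that adds `tick_s` per call."""
    return (len(call_times) - 1) * tick_s


def elapsed_by_clock(call_times):
    """Elapsed time as seen by a monitor that reads a real clock."""
    return call_times[-1] - call_times[0]


# Hypothetical call times in seconds: the monitor runs once per second,
# except for a 10 s gap (3 -> 13) while a blocking command is executing.
calls = [0, 1, 2, 3, 13, 14, 15]

print(elapsed_by_ticks(calls, tick_s=1))  # the gap counts as a single tick
print(elapsed_by_clock(calls))            # the gap counts as 10 real seconds
```

A call-counting monitor would see the whole gap as one tick, so any temperature change that happened during the gap would look far more abrupt than it really was. Whether anything like this happens in the firmware can only be confirmed by reading the code.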

lukeskymuh commented 7 years ago

You have already explained that the error is not the same. But maybe the root cause is linked to the temperature spike I have observed on mine. I added some debug outputs to the thermistor file, which might help to find your problem. I could imagine that this is caused by the quite complex oversampling code. I have also posted a workaround for my problem, which is working (it is not a real solution). I hope this helps you. All files and information are here: https://github.com/MarlinFirmware/Marlin/issues/5698

Angel996 commented 7 years ago

I just disabled thermal runaway feature for the bed, that's my workaround. :)

thinkyhead commented 7 years ago

Is anyone using LIN_ADVANCE (other than @billyd60)? Try turning it off and testing again.

JohnOCFII commented 7 years ago

I'm not using LIN_ADVANCE

thinkyhead commented 7 years ago

This may yet be fixed by #5829, which deals with spurious temperature-related issues.

JohnOCFII commented 7 years ago

Thanks @thinkyhead. I'll pull down a fresh RCBugFix and give this a test. It might not happen until the weekend. Darn life interfering with the fun stuff again.

TAz00 commented 7 years ago

I'm having the exact same issue. I can set a temperature for the bed, and hold it fine, thermistor working great. Until I start printing. At some point after brimming, the bed heater led comes on, and stays on, so the bed just climbs way above the target temp. Currently just setting it at 32.

I've done bed pid tuning and tried BED_LIMIT_SWITCHING as well.

Now I've tried the RCBugFix with the same config, and it's the exact same result.

[Screenshot: bedtempclimb]

Beware this is also my first printer build, but other than the bed issue it prints superb. (So I also fried the 5v regulator with incorrectly placed endstops and fixed it with the 5v rail from the atx)

edit

Not using LIN_ADVANCE. Bed is 300x200mm.

Sebastianv650 commented 7 years ago

I'm not sure if your error is related to the one described here earlier. The original problem here was an error message / printer shutdown that happened more or less instantly while the temperature had not significantly changed in reality. Your problem is a hanging (always 100%) bed heater that really heats up your bed while (and that's interesting) the hotend temperature is still maintained by Marlin.

As far as I know, Marlin doesn't make a real difference between hotend and heated bed. Do any of the coders here who know the temperature code have an idea what could be happening here, other than a very strange hardware failure?

@TAz00 can you repeat that issue with an older RC of Marlin, for example RC7?

TAz00 commented 7 years ago

I think the difference is that my big bed can't get hot enough to actually trigger the overheat warning. It has never actually stopped; I've just noticed it going up during my initial prints.

Here's a screencap where I told it to hold 45 for the extruder and 40 for the bed, which it does nicely.

[Screenshot: bedtempholdingfine]

*edit I've also found that disabling the bed initially, and then turning it on manually midway through the print, leads to the same constantly-on runaway.

Configuration.zip

Sebastianv650 commented 7 years ago

The overheat/mintemp errors and printer resets were triggered by wrong temperature readings and by the watchdog not being called (inside the temperature loop); there was no real heat up / cool down involved in the issue described above. But let's see if someone has an idea what could be going on.

TAz00 commented 7 years ago

Ahh right, yeah not the same problem as described. Tried RC7 and it's the same, holds temp fine while brimming, then going to constant on when starting the object. I'll try some older version and consider making a new issue.

Angel996 commented 7 years ago

Hi, I'm the original poster.

Indeed, the problem is not exactly the same as mine: I really doubt my bed could climb significantly in temperature while G29 is running, since at 110 degrees it heats pretty slowly. But it could also be the opposite: I think that while G29 is running, the bed thermal routine is not running, so the bed heater could be stuck in either state, ON or OFF. So I think in my case it was OFF, and in the previous poster's case it was ON.

TAz00 commented 7 years ago

Turns out, my heatsinks for mosfet 9&10 were touching. Sorry for cluttering.

Anxles commented 7 years ago

I see there are some doubts about whether it is a software or a hardware problem. I am by no means a programmer; I even struggle a little with Configuration.h to understand all the options. BUT I have the same problem with RC8. I didn't try M29, but I have the problem as soon as I start the print. I have CORE XY enabled, on a printer of my own construction with a Megatronics V2 board. Everything was OK before, but I had to add a dual Z stepper for torque on the Z axis, and I had to update Marlin for that because the old version didn't support that option for my board. I changed all the relevant settings, uploaded the firmware, and tested all the axes, the LCD, etc. (everything looked fine).

The problem started when I tried to make a test print. As soon as it started, the BED temperature became ERRATIC, jumping between 400+ C and some LOW value (I didn't notice how low); the jumps were sudden, a few hundred degrees. Long story short, I checked the thermistor and connections and changed to a new thermistor, and the result is the same. After looking here and finding people with the same problem, I configured RC6 with the same parameters as RC8, and so far it has been printing for a few hours, but I can still see some small and very short temperature jumps (my bed is 40x40 cm and 10 mm thick, so its thermal capacity is huge and sudden temperature jumps are by no means possible). I hope this will be resolved soon, as the options available in RC8 are very interesting and somehow I don't like Repetier firmware :)

Sebastianv650 commented 7 years ago

This is already solved in the current RCBugFix branch. Please use this and try again.

Anxles commented 7 years ago

I was trying to find that fix as a whole package that I can just download and configure, but without success. I tried the Github search engine and Google. Can you provide a link to the complete set of files?

Sebastianv650 commented 7 years ago

I'm not sure if I understand your problem. To download the latest code, just click "Code" to the left of the "Issues" tab and use the drop-down menu to switch to "RCBugFix".