MarlinFirmware / Marlin

Marlin is an optimized firmware for RepRap 3D printers based on the Arduino platform. Many commercial 3D printers come with Marlin installed. Check with your vendor if you need source code for your specific machine.
https://marlinfw.org
GNU General Public License v3.0

Temperature Spikes (Heatbed) #5805

Closed bamkrs closed 6 years ago

bamkrs commented 7 years ago

I recently switched from 1.0.2-2 to 1.1.0-RC8 and confirmed the problem on RCBugfix 1e4d4e5. Repetier Host and Server report temperature spikes to 1000-7000°C on 1.1.0-X. When I switch back to 1.0.2-2 the spikes are gone. Any clues? (Yes, I double-checked my Configuration.h.)

[screenshot 2017-02-11 18 33 51]

System: RAMPS 1.4 with the 100k beta 3950 1% thermistor (4.7k pullup) connected to T1.

Blue-Marlin commented 7 years ago

Does switching off ADVANCE and LIN_ADVANCE improve your experience?

bamkrs commented 7 years ago

@Blue-Marlin Why should ADVANCE and LIN_ADVANCE (extruder settings) have anything to do with my Heatbed-Temperature-Reading? I'm confused. But: they were disabled the whole time. Anyhow, thanks for the fast reply.

Blue-Marlin commented 7 years ago

with my Heatbed-Temperature-Reading?

Because I'm looking for proof that multiple instances of the temperature interrupt can run at the same time when one of the ADVANCE options is defined, which should not happen. Temperature spikes could be one of the symptoms. Thanks for your answer, even if I now have to search for another reason for your problem.

bamkrs commented 7 years ago

@Blue-Marlin Sounds reasonable. I've activated ISRs for my endstops, but they never get hit mid-print, so that won't be an issue. I'm an embedded dev myself. ISRs are as good as they can be bad if not programmed carefully. I know the pain.

Bob-the-Kuhn commented 7 years ago

Just some ramblings about possible causes ...

I wonder if 1.0.2-2 is more sensitive to noise than 1.1.0-RC8. They both use the same oversampling rate of 16. I can't think of anything else that might be different.

I suggest moving your thermal probe wires to see if that helps.

Looks like the only spikes are in blue. Is blue your heat bed?

The spikes are from 10 seconds to 2 minutes apart. The heat bed power is the only thing I can think of on a print that could change in the same time range. Everything else is much faster, except maybe the Z axis on a really wide print.

1000-7000°C spikes - IMPRESSIVE
I'm assuming you're using thermistor table 10. To get a reading in that range the input voltage to the ADC would have to be within a few millivolts of ground.
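
The "few millivolts" estimate can be checked with a rough calculation. The sketch below assumes the simple beta model for the 100k / beta 3950 thermistor and the 4.7k pull-up named in the original post, a 5 V reference, and the thermistor wired between the ADC pin and ground. Marlin itself converts readings with lookup tables, so this is only an order-of-magnitude check, and the beta model is not accurate at such extreme temperatures.

```python
import math

# Assumed values, taken from the setup described in this thread.
R25 = 100_000.0      # thermistor resistance at 25 °C, in ohms
BETA = 3950.0        # beta coefficient, in kelvin
R_PULLUP = 4_700.0   # pull-up resistor, in ohms
V_REF = 5.0          # assumption: 5 V ADC reference


def thermistor_resistance(temp_c):
    """Resistance predicted by the beta model at temp_c (°C)."""
    t_kelvin = temp_c + 273.15
    t25_kelvin = 25.0 + 273.15
    return R25 * math.exp(BETA * (1.0 / t_kelvin - 1.0 / t25_kelvin))


def adc_voltage(temp_c):
    """Voltage at the ADC pin, thermistor on the low side of the divider."""
    r = thermistor_resistance(temp_c)
    return V_REF * r / (R_PULLUP + r)


if __name__ == "__main__":
    for temp in (65, 1000, 7000):
        print(f"{temp:>5} °C -> {adc_voltage(temp) * 1000:.3f} mV")
```

Under these assumptions a normal 65°C bed reading sits at several volts, while a reading in the 1000-7000°C range corresponds to a few millivolts or less at the pin.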

Are these spikes killing off your print jobs? As best I can tell, once the 16 samples are taken then if the average is above your MAXTEMP setting (just once) then it'll kill the print (and maybe the entire printer).

If it is killing off your print jobs then we must assume that there is a real problem.

These 16 samples occur over a period of approximately 1/2 second (assuming two thermal inputs) so that pretty much rules out cross talk from other cables.
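
To illustrate why a brief glitch is an unlikely explanation, here is a minimal sketch of what averaging does to one bad sample. It assumes a 10-bit ADC, a plain arithmetic mean over the 16 samples, and the beta model for the thermistor from the original post. None of this is taken from Marlin's source, which uses lookup tables, so treat the numbers as rough.

```python
import math

# Assumed values, taken from the setup described in this thread.
R25 = 100_000.0       # thermistor resistance at 25 °C, in ohms
BETA = 3950.0         # beta coefficient, in kelvin
R_PULLUP = 4_700.0    # pull-up resistor, in ohms
ADC_MAX = 1023        # assumption: 10-bit ADC
OVERSAMPLE = 16       # oversampling count mentioned above
T25_K = 298.15        # 25 °C in kelvin


def temp_to_adc(temp_c):
    """Ideal ADC count for a temperature, using the beta model."""
    r = R25 * math.exp(BETA * (1.0 / (temp_c + 273.15) - 1.0 / T25_K))
    return ADC_MAX * r / (R_PULLUP + r)


def adc_to_temp(adc):
    """Inverse of temp_to_adc; only valid for 0 < adc < ADC_MAX."""
    r = R_PULLUP * adc / (ADC_MAX - adc)
    return 1.0 / (math.log(r / R25) / BETA + 1.0 / T25_K) - 273.15


def averaged_temp(samples):
    """Temperature implied by the plain mean of the raw samples."""
    return adc_to_temp(sum(samples) / len(samples))


normal = temp_to_adc(65.0)

# One sample shorted to ground, the other 15 normal.
one_glitch = averaged_temp([0.0] + [normal] * (OVERSAMPLE - 1))

# Every sample sitting one count above ground.
all_low = averaged_temp([1.0] * OVERSAMPLE)

if __name__ == "__main__":
    print(f"one glitched sample:     {one_glitch:.1f} °C")
    print(f"all samples near ground: {all_low:.1f} °C")
```

With these assumptions a single grounded sample moves the averaged reading by roughly ten degrees, whereas a reading near 1000°C needs essentially every sample in the window to sit at ground.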

I'm wondering if there is an intermittent short of the signal to ground. It could be the sensor itself, it could be a thermal thing (does it happen at other target temperatures?), or it could be a mechanical thing where it only happens if the X carriage is at xxx and the Y at yyy.

bamkrs commented 7 years ago

@Bob-the-Kuhn I already tried to narrow it down, but without success. I can't determine a condition where it begins to spike. My first guess was bad wiring. I pulled on my thermistor wire, separated it from the heat bed wiring and even tried a different thermistor. No luck. The only thing I can tell is that it needs some time before it begins spiking. If I print a 20x20x20mm calibration block at 0.2mm layer height, it starts somewhere around layer 40-50.

It has never killed my print so far, but I don't want to wait for it to kill a long print.

[image: Spike over 5000] Here you can see a spike up to 5220°C.

[image: Periodically] And this one is from yesterday, where it spiked periodically to exactly 777°C (according to the log).

Bob-the-Kuhn commented 7 years ago

I believe your controller has a demon inside it. Take it to a Catholic church and sprinkle it with holy water.

If that doesn't work then please post/attach excerpts from the log showing the spikes. Include some of the before-spike and after-spike portions.

Long shot - try putting a fan near your controller.

bamkrs commented 7 years ago

@bob-the-kuhn It was a bit weird to ask the local church for sainted distilled water, but I got it, tried it, no luck. My board is in a case. The case gets its airflow from an 80mm 12V fan at 100% all the time. And with 1.0.2-2 there aren't any spikes.

Any special Debug-Flags I should activate for the Log?

Bob-the-Kuhn commented 7 years ago

Try issuing a M111 S2 command. Sometimes it'll report something unusual.

You might also try enabling in Configuration.h the following. It might show something.

  #define PID_DEBUG       // Sends debug data to the serial port.
  #define PID_BED_DEBUG   // Sends debug data to the serial port.

Above is the little I know of logging. You'll have to play with them to see if any are of any value.

Long shot - at the very end of issue #5803 is some code meant to clear up some weird temperature problems being reported by users. This code is meant for the latest version of RCBugFix but most likely is OK with RC8 and later.

thinkyhead commented 7 years ago

Test again with the current RCBugFix when you get a chance, as there have been some changes that could affect temperature readings.

bamkrs commented 7 years ago

@Bob-the-Kuhn I've run different prints now; PID_BED_DEBUG and PID_DEBUG are OK! They don't mention the spikes (I've scripted a quick log parser to analyze the output...). But it killed a print today! Thermal runaway protection kicked in. I don't know why. Thermal runaway protection only affects temperatures that are too low, or am I wrong?

@thinkyhead Tested 3 prints. Spiking continues. It even killed a print today. I've flashed a completely clean RCBugFix clone, hammered in my Configuration.h and tested my usual schedule. Switched the bed temperature sensor and wiring another time. No luck.

It only happens while printing. If I set the bed temp to 65, it happily stays around 65°C +/-0.3

emartinez167 commented 7 years ago

It could be a short caused by a cable that only happens when the bed is moving. Been there.

Regards,

Ernesto.

bamkrs commented 7 years ago

@emartinez167 Sorry to disappoint you. Tried three different wires, 5 thermistors (and still have like 45 to test). Resoldered everything. Funniest thing is, in 1.0.2-2 everything is fine.

emartinez167 commented 7 years ago

I am starting to think @bob-the-Kuhn is right... Have you seen your controller throw up pea soup and turn its head around in circles while speaking in tongues?

Regards,

Ernesto.

lrpirlet commented 7 years ago

@bhorn. Note: this is a shot in the dark based on quite a few years of computer hardware support... The ANALOG power supply could be noisy, perturbing the DIGITAL logic...

I would have a close look at the power supply while under load... (scope...) The periodicity seen COULD be explained by noise insufficiently filtered in the PS, or by a slightly overloaded PS...

For a test without scope, and if possible, adjust the PS DOWN to 11.8 to 12.0 volt AT THE POWER SUPPLY outlet (not at the bed input). The bed will take forever to heat, note, so use this approach as a test only... Or swap the PS for another with plenty of power available.

Last but not least, forget about this note if the bed is not powered from the same PS as the control logic... Good luck.

thinkyhead commented 7 years ago

The regular period of these spikes is curious. Do you see the same timing of spikes when printing faster or slower?

FHeilmann commented 7 years ago

I have some similar looking outputs:

[image: spikes]

With LIN_ADVANCE compiled in (but k set to 0) my prints would abort due to mintemp errors. Without LIN_ADVANCE my prints go through, but the bed temperature graph looks like this.

This is RCBugFix@9b5515926a704bff1edb74772b0800cadf8c323d

I should also mention that I have MINIMUM_STEPPER_PULSE set to 1

lukeskymuh commented 7 years ago

I have similar spikes. I don't know if it is the same cause.

[image: grafik]

Check these: https://github.com/MarlinFirmware/Marlin/issues/5698 and https://github.com/MarlinFirmware/Marlin/pull/5829

LIN_ADVANCE was always disabled.

emartinez167 commented 7 years ago

Interestingly enough, I had a print die on me yesterday due to RUNAWAY...

Are we onto something here?

Regards,

Ernesto

Bob-the-Kuhn commented 7 years ago

@emartinez167 - I'm 90% sure you've already done this but ...

Have you played with the following to see if that solves the runaway problem?

  #define THERMAL_PROTECTION_PERIOD 40        // Seconds
  #define THERMAL_PROTECTION_HYSTERESIS 4     // Degrees Celsius

Also, I just saw in another forum where a guy thinks that having bed leveling enabled was the cause of his runaway issues. That would be a really strange interaction.

Bob-the-Kuhn commented 7 years ago

@lukeskymuh - I'm sure that the spike on the bed is not real. The bed's temperature just can't change that fast. The one on the extruder is also suspect. The slopes on the spikes are steeper than during the heat up period.

I'm wondering if this could be a host <-> controller communications problem.

The graph looks like one from Repetier Host. Do any of the other host interfaces also report the spikes?

emartinez167 commented 7 years ago

I did indeed! In fact, I had to do it because I had glued a large sheet of kapton to my bed and it took ages to heat up so I was having the printer shut down. I ended up removing the kapton because I read that the 3M double-sided tape I was using to stick it was the root cause as it is a great insulator!!!

My current problem seems to be related to the fact that I was trying to print at 275C using a Taulman Tritan filament. I managed to get a couple of prints and then I had a RUNAWAY shut down... I double checked thinking it was perhaps a loose cable but found nothing. Then I swapped to ABS and did two successful prints so I suspect I am hitting the E3Dv6 limits...

I am not doing any levelling (it's not even enabled), so that's not my case.

Any suggestions?

Regards,

Ernesto.

Bob-the-Kuhn commented 7 years ago

I've seen recommendations that a PT100 system be used when printing above ABS temps because thermistors tend to be inaccurate at higher temperatures.

I'm running one. You need to run the amplifier into an analog input that DOESN'T have the pull-up resistor on it. The amp is just too wimpy to drive the resistor, which results in a reading that's too high by 15-20C.

Was the extruder heater on most of the time when running 275C? If yes, then maybe the extruder fan is overpowering the heater.

emartinez167 commented 7 years ago

Understood. Yes, the fan MUST run all the time or the plastic that holds the fan will melt and stick to the heat dissipation blades on the E3D (been there :(...)

The strange thing is that I managed to print two parts before it started so I am thinking the thermistor must have deteriorated during those two prints.

I have a couple of Stacker hotends I got from their kickstarter campaign that I might use to rebuild a RB regular I have. I think they can take more heat; if that is the case I would use that second RB to print with those exotic materials. Just don't want to spend too much time tinkering with the RB Big now that it is fine tuned.

Regards,

Ernesto.

Bob-the-Kuhn commented 7 years ago

It's definitely not a com problem. I did some playing with the M105 command and found Repetier Host did the following:

flogger8 commented 7 years ago

I'm having the same issue as the original poster in Repetier Host. Take a look at the screenshot below. I think what's happening (perhaps... I could be wrong) is that when Repetier Host gets the return value line that the cursor is on in the screenshot, it's interpreting that "B:6997" as a bed temperature. What is that return line trying to show? I'm guessing position in the first part and then a count of something?

image

Bob-the-Kuhn commented 7 years ago

On a CORExx or a delta printer Marlin uses the A & B axis to control the steppers. The X & Y values are translated into A and B values and then the A & B values are used to control the steppers.

What version of Marlin are you running? I hope it's NOT RCBugFix. That's the latest & greatest and what I want you to switch to.

flogger8 commented 7 years ago

Got it. I'm running a corexy so that makes sense.

Running RC8, not RCBugFix. BTW, OctoPrint has the same issue because it sees the B: as a bed temp coming back even though it's not...

Do you think RCBugFix has a fix in for this?

Bob-the-Kuhn commented 7 years ago

I can't guarantee it but there have been efforts made to fix this and I don't see reports of temp spikes from users of RCBugFix in the last 2 months.

The only real nasty is transferring your machine specific settings into the new config files.

FHeilmann commented 7 years ago

I have a CoreXY and octoprint definitely does NOT show that behavior for me! The B: values are not interpreted as temperature values!

Bob-the-Kuhn commented 7 years ago

I couldn't reproduce your issue. I enabled COREXY and then did some moves followed by M114 commands. The Repetier Host temperature graph/log remained steady.

If this was Repetier Host responding to the B: then it would have shown up in the graph/log.

Also, the B: only shows up after a few commands. I expect that the graph in your screenshot has way more spikes in it than it would if this were caused by B:

I believe that switching to RCBugFix will remove these spikes.

I've never played with Octoprint.

flogger8 commented 7 years ago

OK, I'll try RCBugFix when I get a chance between prints to configure and upload it. I've been able to map the 'spikes' to exactly these recv messages. The log/chart went from 51.3 up to 2173 and then back down, and it appears that they happen repeatably after a G92 command, which is very common in my G-code because it keeps resetting the extruded length. I use S3D as my slicer.

If I look at one of my older cura sliced files, I have 6-10 G92 commands. If I look at an S3D sliced file, it has 1500-2000 G92 commands to zero out the extruded length

Recv: ok
Send: N1660 G1 E-3.5000 F3600*42
Recv: ok
Send: N1661 M105*23
Recv: ok T:200.0 /200.0 B:51.3 /50.0 @:58 B@:0   <-------
Send: N1662 G1 X146.084 Y120.968 F9000*126
Recv: ok
Send: N1663 G1 E0.0000 F3600*2
Recv: ok
Send: N1664 G92 E0*114     <-----------
Recv: X:146.08 Y:120.97 Z:0.18 E:0.00 Count A: 26658 B:2173 Z:72   <---------

Another example from right after the G92 again

Send: N1676 G92 E0*113
Recv: X:148.14 Y:121.67 Z:0.18 E:0.00 Count A: 26712 B:2376 Z:72

thinkyhead commented 7 years ago

Since G92 changes the current position, and the host may not always catch this, G92 concludes with a position report. My guess is that Repetier Host (and maybe some others) are unfamiliar with the Count: suffix added to core and other kinematic systems, and are liberally interpreting it as bed temperature, even though the rest of the line refutes that context.
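
The suspected misreading is easy to reproduce with a loose pattern. The sketch below is a hypothetical host-side parser, not the actual code of Repetier Host or OctoPrint. It applies a context-free search for `B:` to the two log lines quoted earlier in this thread, then shows one possible guard that skips lines containing `Count`.

```python
import re

# Hypothetical pattern: take the first "B:<number>" anywhere in a line.
NAIVE_BED = re.compile(r"B:\s*(-?\d+(?:\.\d+)?)")


def naive_bed_temp(line):
    """Return whatever follows the first 'B:', regardless of context."""
    match = NAIVE_BED.search(line)
    return float(match.group(1)) if match else None


def guarded_bed_temp(line):
    """Same search, but ignore position reports that carry a Count suffix."""
    if "Count" in line:
        return None
    return naive_bed_temp(line)


# Lines copied from the log earlier in this thread.
temp_report = "ok T:200.0 /200.0 B:51.3 /50.0 @:58 B@:0"
position_report = "X:146.08 Y:120.97 Z:0.18 E:0.00 Count A: 26658 B:2173 Z:72"

if __name__ == "__main__":
    print(naive_bed_temp(temp_report))        # 51.3
    print(naive_bed_temp(position_report))    # 2173.0, the "spike"
    print(guarded_bed_temp(position_report))  # None
```

The guard is only one workaround on the host side. It depends on the word `Count` staying in the report, so it is fragile compared with changing the labels in the firmware output.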

So this looks like something @repetier may want to comment upon…

repetier commented 7 years ago

For this reason I had to remove the position update from reported coordinates. After G92 it now seems to return coordinates corrected by G92, which confuses the host, which also adds them. This must be a recent change. So now the host relies only on what you send. The host always shows the real position, not the G92-corrected one; it also knows about the G92 offset.

B: is already the response for bed temperature. Expressions search for this. Since we do not know what the context of an answer is, these labels should always be unique. I'm quite sure the host will detect this as bed temperature as well. Why not make it unique by using Count_A: and Count_B:? Also, Z: appears twice here, which makes parsing difficult. Which Z: is correct, since both have different meanings but the same ID?
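
A short sketch of the uniqueness argument: it splits a report into `key:value` fields and lists any key that occurs more than once. The `proposed` line is hypothetical. It follows the `Count_A:` / `Count_B:` suggestion above, and the `Count_Z:` label is an extra assumption added here for illustration, not an actual Marlin output format.

```python
import re
from collections import Counter

# Matches "<key>:<number>", allowing optional spaces after the colon.
FIELD = re.compile(r"(\w+):\s*(-?\d+(?:\.\d+)?)")


def field_keys(line):
    """All keys found in the line, in order of appearance."""
    return [key for key, _value in FIELD.findall(line)]


def duplicate_keys(line):
    """Keys that occur more than once and are therefore ambiguous."""
    counts = Counter(field_keys(line))
    return sorted(key for key, n in counts.items() if n > 1)


# Report as logged earlier in this thread.
current = "X:146.08 Y:120.97 Z:0.18 E:0.00 Count A: 26658 B:2173 Z:72"
# Hypothetical report with unique labels.
proposed = "X:146.08 Y:120.97 Z:0.18 E:0.00 Count_A:26658 Count_B:2173 Count_Z:72"

if __name__ == "__main__":
    print(duplicate_keys(current), "B" in field_keys(current))
    print(duplicate_keys(proposed), "B" in field_keys(proposed))
```

In the logged format the parser sees `Z` twice and a bare `B` key. With unique labels neither problem occurs, so a host no longer needs to guess the context of a line.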
