MarlinFirmware / Marlin

Marlin is an optimized firmware for RepRap 3D printers based on the Arduino platform. Many commercial 3D printers come with Marlin installed. Check with your vendor if you need source code for your specific machine.
https://marlinfw.org
GNU General Public License v3.0

HB not heating during G29 #5698

Closed lukeskymuh closed 6 years ago

lukeskymuh commented 7 years ago

I tried to perform a G29 with a 10x10 matrix with the HB at 70°C to measure the distortion of my HB over temperature. After some points I get an "ERR: MINTEMP BED". I assume that the HB thermal control is inactive during G29. Is this on purpose or a bug? (Is there a workaround?)

lukeskymuh commented 7 years ago

RC8 Bug fix 1.1.0

ghost commented 7 years ago

I would check your thermistor wiring. It is likely broken somewhere. I'll bet, for the short term, that it doesn't throw that error at the same point every time.

chorca commented 7 years ago

I'm getting the same issue on my printer. Randomly during printing I get MINTEMP errors. I set up my oscilloscope on the pin with a 2 GS/s capture rate, triggering at 4.65 V, and didn't see a blip during the MINTEMP error. I didn't have this issue with RC6, and I even rewired my thermistor. It seems it may be in software.

ghost commented 7 years ago

Hmmm. Well, I have a Prusa i3, which has a RAMPS 1.4 on an Arduino Mega 2560. I've had that issue a few times and it was a broken thermistor wire in both cases. I have run RC6, RC7, RC8 and RCBugFix. No issues with the firmware for the board I have.

So...IDK.

Blue-Marlin commented 7 years ago

ERR: MINTEMP BED is thrown in the temperature interrupt. A relation to G29 is unlikely, except that G29 moves the bed and the wires connected to it. Even if the bed were not heated during G29, it seems unlikely that your bed could suddenly cool down below MINTEMP, unless you are printing in an environment far below MINTEMP, but then it's unlikely you could start heating the bed at all. What kind of levelling do you use, and how many points do you measure? If you run out of RAM, every error is possible. ERR: MINTEMP BED is meant to detect broken thermistor wires. I'd take that seriously.

ghost commented 7 years ago

Swap the hotend and bed thermistor plugs at the board. See if you get a MINTEMP instead of MINTEMP BED error.

lukeskymuh commented 7 years ago

My thermistor is working correctly and the temperature is appearing on the display. I also don't see the heat LED lighting up when performing G29. So I really think that when performing auto-level the heat bed control is stopped. Another user has just posted the same issue: https://github.com/MarlinFirmware/Marlin/issues/5702

lukeskymuh commented 7 years ago

@Blue-Marlin I have used 10x10 points, so it takes a long while, to get a profile like the one attached (flatness of the aluminium HB). For more details see: https://3dprint.wiki/reprap/anet/a8/improvement/autobedleveling#sensor_and_sensor_support

lukeskymuh commented 7 years ago

If I disable the HB before the G29 everything works, but the HB drops from 75°C to below 50°C.

ghost commented 7 years ago

I normally don't heat anything before a G29 and everything works fine. My printer sits in a room that drops the temps as cold as 4°C and it still works.

But, I'll heat my bed to 50°C (my normal running temp for PLA) and see what happens.

lukeskymuh commented 7 years ago

I checked it again with the current BugFix (downloaded 22.1.2017). I have to correct myself: the HB switches on, so the HB is heating. But when setting the HB to 60°C and starting G29 (3x3), after the 3rd point I get the following error:

00:22:25.417 : G29 Auto Bed Leveling
00:22:36.284 : Error:MINTEMP triggered, system stopped! Heater_ID: bed
00:22:36.292 : Error:Printer halted. kill() called!

It should be easy to reproduce. Can anyone test it on another printer?

lukeskymuh commented 7 years ago

Up to now I know that the error is triggered because current_temperature_bed_raw (28137, 31846) is above bed_minttemp_raw (16079).
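A quick plausibility check on those numbers may help. The sketch below assumes that each raw value is the sum of 16 samples (oversampling) from a 10-bit ADC. That is my understanding of how Marlin builds its raw temperature values, not something confirmed in this thread, and all names in the code are hypothetical. Under that assumption a genuine raw sum can never exceed 16 × 1023 = 16368, so 28137 and 31846 cannot be real readings of any thermistor state.

```cpp
#include <cassert>
#include <cstdint>

// Assumptions (not confirmed in this thread): 10-bit ADC, 16x oversampling.
constexpr uint16_t kAdcMax         = 1023;
constexpr uint16_t kOversample     = 16;
constexpr uint16_t kMaxValidRawSum = kAdcMax * kOversample;  // 16368

// Hypothetical helper: could this sum come from exactly kOversample samples?
constexpr bool raw_sum_is_plausible(uint16_t raw_sum) {
  return raw_sum <= kMaxValidRawSum;
}

// Hypothetical helper: fewest full-scale samples needed to reach raw_sum.
constexpr uint16_t min_samples_needed(uint16_t raw_sum) {
  return static_cast<uint16_t>((raw_sum + kAdcMax - 1) / kAdcMax);  // ceiling division
}

// Checks the values reported in this thread against the assumption.
void check_values_from_this_thread() {
  assert(raw_sum_is_plausible(16079));    // bed_minttemp_raw is a valid sum
  assert(!raw_sum_is_plausible(28137));   // first reported raw value
  assert(!raw_sum_is_plausible(31846));   // second reported raw value
  assert(min_samples_needed(28137) == 28);
  assert(min_samples_needed(31846) == 32);
}
```

If the assumption holds, both reported values need well over 16 samples, which would point at the accumulation logic rather than at the thermistor or its wiring.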

lukeskymuh commented 7 years ago

I deactivated the triggered alarm (see zip file). Now it is working. I see that there are random temperature spikes (see screenshot). For some reason "current_temperature_bed_raw" sometimes has random values. A workaround would be to extend MAX_CONSECUTIVE_LOW_TEMPERATURE_ERROR_ALLOWED to the HB temperature. But it would be nice to solve the problem. Thanks in advance.

The main question is: is it my hardware (Anet A8) or the software? If someone could reproduce the error or show that it works on another printer, that would be great.

The attached temperature.cpp file with deactivated alarms is for test and debug only. I do not recommend deactivating the temperature alarms.
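The workaround idea, tolerating a few consecutive bad readings before raising MINTEMP for the bed, can be sketched as a small counter. This is a host-side illustration and not Marlin's code: the struct, its fields and the threshold value are hypothetical, and only the name MAX_CONSECUTIVE_LOW_TEMPERATURE_ERROR_ALLOWED comes from the firmware. It assumes a thermistor setup where a higher raw value means a lower temperature, which matches the comparison reported above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical debounce for a bed MINTEMP check (illustration only).
struct BedMintempDebounce {
  uint16_t min_raw;      // readings at or above this count as "too cold"
  uint8_t  allowed;      // consecutive bad readings tolerated before tripping
  uint8_t  consecutive;  // current streak of bad readings

  BedMintempDebounce(uint16_t min_raw_, uint8_t allowed_)
    : min_raw(min_raw_), allowed(allowed_), consecutive(0) {}

  // Feed one raw reading; returns true when the error should be raised.
  bool update(uint16_t raw) {
    if (raw >= min_raw) {
      if (consecutive < 255) ++consecutive;  // saturate instead of wrapping
      return consecutive > allowed;
    }
    consecutive = 0;  // a good reading resets the streak
    return false;
  }
};

// Usage: a single spike is ignored, a sustained fault still trips.
void demo_debounce() {
  BedMintempDebounce d(16079, 2);
  assert(!d.update(15600));  // normal reading
  assert(!d.update(28137));  // isolated spike, tolerated
  assert(!d.update(15600));  // streak reset
  assert(!d.update(28137));
  assert(!d.update(28137));
  assert(d.update(28137));   // third bad reading in a row trips the error
}
```

The trade-off is that a real broken wire is detected a few readings later, and the spikes themselves remain unexplained.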

temperature.zip

screenovertemp

chorca commented 7 years ago

My issue was similar, though while I was printing instead of while I was running a G29. I performed analysis of the pin and while I could not see any voltage change with my oscilloscope, the A/D value jumped up which seemed to indicate the temperature dropping very low suddenly. I haven't been able to resolve it and have switched to Smoothie for the time being.

JohnOCFII commented 7 years ago

If folks are getting around this by disabling the thermal protections -- I wonder if the thermal protection logic is faulty, or if that many of us are running faulty printers. I noticed that the older RC3 version I was running had the thermal protection commented out. Mine is dying either during G29 or after the print starts. I did a few prints without G29 on RC8 (non-bugfix) that worked fine. I'm not sure what my issue is. Mine looks like extruder 1 (my active extruder) having the issue: READ: Error: Heating falied, system stopped! Heater_ID: 1 READ: Error: Printer halted. kill() called!

schustercp commented 7 years ago

I have recently upgraded 2 printers to RC8 and both printers are exhibiting the same issue. To reproduce, I use Pronterface to set the bed temp to 110 and then click the home button while the bed is heating up. The AirWolf 3D homes all axes and then displays the error and stops everything. The Prusa i3 homes X and Y and then moves the Z axis to deploy the Z probe using the servo; then the MINTEMP error is shown and everything stops. If I let the bed heat to temp first, the error is not shown on either printer. If I roll back to RC6 then it works correctly again. I have not updated my Delta due to this issue.

ghost commented 7 years ago

> Up to now I know that the error is triggered because current_temperature_bed_raw (28137, 31846) is above bed_minttemp_raw (16079)

The bed temp is supposed to be above the mintemp; if not, the mintemp error will stop the printer.

Has anyone considered faulty thermistors, or faulty thermistor wiring?

chorca commented 7 years ago

It would be surprising if suddenly all these thermistors started having issues after the update, but not impossible. I hooked an oscilloscope to mine but couldn't see any change in the voltage when the MINTEMP was hit.

lukeskymuh commented 7 years ago

Anything is possible; the question is what is likely:

Any other suggestions? Or input?

ghost commented 7 years ago

> I have recently upgraded 2 printers to RC8 and both printers are exhibiting the same issue. To reproduce, I use Pronterface to set the bed temp to 110 and then click the home button while the bed is heating up. The AirWolf 3D homes all axes and then displays the error and stops everything. The Prusa i3 homes X and Y and then moves the Z axis to deploy the Z probe using the servo; then the MINTEMP error is shown and everything stops. If I let the bed heat to temp first, the error is not shown on either printer. If I roll back to RC6 then it works correctly again. I have not updated my Delta due to this issue.

I cannot reproduce this issue on my Prusa i3 running the latest RCBugFix + some other unrelated PR's.

ghost commented 7 years ago

Although I do not use ABL anymore, I did try heating the bed to 110C and while it was heating, initiated a G29. Points were accepted with no issues. I did not have any mintemp issues. All is ok for me.

With that, how can anyone say that this issue is a firmware issue?

What I CAN say is a firmware issue, is that when I was on my 9th probing point, I issued a M84 then a G28. The steppers turned off then everything homed. But, my LCD was still on the screen for probing the 9th point. I had to reset the printer to get it out of the MBL mode.

ghost commented 7 years ago

Anyone using ABL....try the devel-ubl branch and see if it does it with that version.

ghost commented 7 years ago

> It would be surprising if suddenly all these thermistors started having issues after the update, but not impossible. I hooked an oscilloscope to mine but couldn't see any change in the voltage when the MINTEMP was hit.

All 5 thermistors started having issues after the update? What about the HUNDREDS running the latest update that don't?

BTW, I am using a type 6 for the hotend and type 1 for the bed. Is it possible that the wrong thermistor is being set?

chorca commented 7 years ago

> All 5 thermistors started having issues after the update? What about the HUNDREDS running the latest update that don't?

I'm just saying it's possible that it's my board. I did everything I could to determine that it wasn't my wiring, hooking a 2 GS/s oscilloscope up to it and triggering on a voltage change, and even when the error was triggered I validated that that line on the IC was still receiving the correct voltage. Unless my ATMEGA chip just decided to up and die or become intermittent when I loaded RC8, I'm not sure where else to look.

The fact that there's more than one or two people having this issue seems to indicate we have some common issue.

ghost commented 7 years ago

@chorca: I have seen weird stuff that cannot be explained, in my days. Change the thermistor and see if that changes things.

schustercp commented 7 years ago

I put

      SERIAL_ECHO("min: ");
      SERIAL_ECHO(bed_minttemp_raw);
      SERIAL_ECHO(" Raw: ");
      SERIAL_ECHOLN(current_temperature_bed_raw);

above the line for the min temp test

      if (bed_minttemp_raw GEBED current_temperature_bed_raw && target_temperature_bed > 0.0f)      min_temp_error(-1);

In the Temperature::isr

I know serial port prints here are changing timing but I believe the HOME Function is either calling the ISR or causing the interrupt to signal too often.

In the result below you can see where the home function causes the glitch. If the bed is heating when this glitch happens too many ADC results get accumulated in the "current_temperature_bed_raw" and then the min test traps the error. (In the test below the bed is not heating so this is the readout of ambient temp and when the glitch happens the bed is not moving.)

min: 16063 Raw: 15600
min: 16063 Raw: 15601
min: 16063 Raw: 15599
min: 16063 Raw: 15601
min: 16063 Raw: 15600
min: 16063 Raw: 15597
min: 16063 Raw: 15601
min: 16063 Raw: 15599
min: 16063 Raw: 15601
min: 16063 Raw: 15601
min: 16063 Raw: 15601
min: 16063 Raw: 15601
echo:busy: processing
min: 16063 Raw: 15601
min: 16063 Raw: 15602
min: 16063 Raw: 15595
min: 16063 Raw: 15600
min: 16063 Raw: 15599
min: 16063 Raw: 15600
min: 16063 Raw: 15600
min: 16063 Raw: 15599
min: 16063 Raw: 15599
min: 16063 Raw: 15599
min: 16063 Raw: 15605
echo:busy: processing
min: 16063 Raw: 15601
min: 16063 Raw: min: 16063 Raw: min: min: 16063 Raw: min: 16063 Raw: 27304
27304
16063 Raw: 27304
27304
27304
min: 16063 Raw: 15598
min: 16063 Raw: 15598
min: 16063 Raw: 15600
echo:busy: processing
min: 16063 Raw: 15600
min: 16063 Raw: 15599

Blue-Marlin commented 7 years ago

Looks as if the temperature ISR is biting its tail. Try

 ISR(TIMER0_COMPB_vect) { Temperature::isr(); }

 void Temperature::isr() {
   //Allow UART and stepper ISRs
-  CBI(TIMSK0, OCIE0B); //Disable Temperature ISR
-  sei();
+  //CBI(TIMSK0, OCIE0B); //Disable Temperature ISR
+  //sei();

   static uint8_t temp_count = 0;
   static TempState temp_state = StartupDelay;
   static uint8_t pwm_count = _BV(SOFT_PWM_SCALE);

(Serial output during an interrupt alters the interrupt timing, so the result we see here may be caused by the debug output you added. On the other hand, I warned about re-entering an interrupt before and asked for extra protection against that.)

Blue-Marlin commented 7 years ago

Alternatively you could try to deactivate ADVANCE or LIN_ADVANCE.

In the advance_isr_scheduler()

    // Restore original ISR settings
    cli();
    SBI(TIMSK0, OCIE0B);
    ENABLE_STEPPER_DRIVER_INTERRUPT();

while the temperature interrupt is running, is NOT restoring the original ISR settings, but alters them.

Blue-Marlin commented 7 years ago

To fool this

@@ -1486,11 +1486,15 @@ void Temperature::set_current_temp_raw() {
  *  - Check new temperature values for MIN/MAX errors
  *  - Step the babysteps value for each axis towards 0
  */
 ISR(TIMER0_COMPB_vect) { Temperature::isr(); }

+bool in_temp_isr = false;
+
 void Temperature::isr() {
+  if (in_temp_isr) return;
+  in_temp_isr = true;
   //Allow UART and stepper ISRs
   CBI(TIMSK0, OCIE0B); //Disable Temperature ISR
   sei();

   static uint8_t temp_count = 0;
@@ -1942,7 +1946,9 @@ void Temperature::isr() {
       endstop_monitor_count &= 0x7F;
       if (!endstop_monitor_count) endstop_monitor();  // report changes in endstop status
     }
   #endif

+  in_temp_isr = false;
   SBI(TIMSK0, OCIE0B); //re-enable Temperature ISR
+
 }

should be a bit more difficult.
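To make the effect of the `in_temp_isr` guard concrete, here is a host-side model of one way re-entry could inflate the accumulated raw value. It is only a sketch under assumptions: the struct and function names are hypothetical, the "lost update" of the sample counter is an illustrative failure mode rather than the mechanism diagnosed in this thread, and 16 samples per reading is my assumption about the oversampling. The sample value 975 is chosen so that 16 samples give 15600, the ambient raw value in the log above.

```cpp
#include <cassert>
#include <cstdint>

// Host-side model of an oversampling ISR (hypothetical; real ISRs are fired by hardware).
struct TempIsrModel {
  bool     use_guard;
  bool     in_temp_isr  = false;
  uint32_t raw_sum      = 0;
  uint8_t  sample_count = 0;

  explicit TempIsrModel(bool guard) : use_guard(guard) {}

  // One ISR pass adds one ADC sample. If nested_fires is true, a second pass
  // starts in the middle of this one, as if the interrupt was re-enabled too early.
  void isr(uint16_t adc_sample, bool nested_fires = false) {
    if (use_guard && in_temp_isr) return;         // guard as in the patch above
    in_temp_isr = true;
    const uint8_t count_snapshot = sample_count;  // read
    raw_sum += adc_sample;                        // accumulate
    if (nested_fires) isr(adc_sample);            // simulated re-entry
    sample_count = static_cast<uint8_t>(count_snapshot + 1);  // write back a stale count
    in_temp_isr = false;
  }
};

// Runs 16 ISR passes with one simulated re-entry and returns the accumulated sum.
uint32_t accumulate_16_with_one_reentry(bool use_guard, uint16_t adc_sample) {
  TempIsrModel m(use_guard);
  for (int i = 0; i < 16; ++i) m.isr(adc_sample, i == 7);
  assert(m.sample_count == 16);  // the counter looks fine in both cases
  return m.raw_sum;
}
```

In this model the unguarded run ends with 17 samples in the sum (16575) while the counter still reads 16, which is already above the `min: 16063` threshold from the log. The guarded run stays at 15600.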

lukeskymuh commented 7 years ago

Wow, thank you @schustercp and @Blue-Marlin. I didn't understand everything, but as far as I understood, this is caused by interrupt interference. Is this a workaround or a solution for the next RC?

Blue-Marlin commented 7 years ago

1 and 2 are things to experiment with. If that works, 3 is possibly one of several solutions to your problem. But as said before, @schustercp's test code itself may be the reason for the failure it detects.

FHeilmann commented 7 years ago

Just chiming in, I had a Mintemp halt today on a bed that never had any issues before. I occasionally see spikes in temperature readings which are otherwise rock-solid. I use Bilinear leveling, and just compiled LIN_Advance in during my last flash, however I did not use its "effects" (K=0).

brainscan commented 7 years ago

I've also had MINTEMP errors on hardware that's always been fine. It seems to happen if the bed is at or below 17°C when I hit print: the LCD briefly shows the temp drop to below 5°C, which triggers the MINTEMP error. I need to load my previous version to see if it still happens.

ghost commented 7 years ago

I've started prints @ 3 degs and no issue. I dropped my mintemp to -10 because I didn't want to have the mintemp error if I started it too cold. ATM, my printer is 15 degs. I know I can start it up with no issues.

ghost commented 7 years ago

> LCD briefly shows the temp drop to below 5° which triggers the mintemp

This tells me that you have a broken thermistor wire or a bad/weak connection.

FHeilmann commented 7 years ago

So I recompiled the RCBugFix branch I had, and did nothing but switch off LIN_ADVANCE, and the print that previously failed consistently with MINTEMP errors went through just fine.

Seems like @Blue-Marlin is on to something

thinkyhead commented 7 years ago

Could it be that LIN_ADVANCE is eating up too much CPU and preventing the temperature ISR from getting readings?

CC: @Sebastianv650

Sebastianv650 commented 7 years ago

I read through the issue, and I think we should separate two possible causes. In more than the first half of the issue there is no mention of advance being active, so I guess it wasn't enabled by the people reporting there. If I'm wrong, please raise a hand.

Therefore I see maybe two issues:

But I can think of another scenario where the raw value might get garbled: the temperature ISR might be delayed, either by serial events inside the stepper ISR or by stepper and serial events inside the temperature ISR, by so much that the time between the first (delayed) loop, where an ADC conversion is started, and the next (maybe not delayed) loop, where the result is captured, gets too short. Does anybody know how much time margin Marlin currently has between two ADC conversions versus the time it needs to do one conversion? At first glance, Marlin isn't checking whether the ADC conversion is finished when grabbing the raw value. Would that be reasonable?
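As a sketch of what "checking whether the conversion is finished" could look like, here is a host-side stand-in for an ADC. Everything in it is hypothetical (struct, function names, tick counts). On an AVR the equivalent check would be whether the ADSC bit in ADCSRA is still set, since that bit reads as one while a conversion is in progress. Whether Marlin actually needs such a check is exactly the open question above.

```cpp
#include <cassert>
#include <cstdint>

// Host-side stand-in for an ADC (hypothetical, for illustration only).
struct FakeAdc {
  int      ticks_left = 0;  // remaining time until the running conversion ends
  uint16_t pending    = 0;  // value that will appear when the conversion ends
  uint16_t data       = 0;  // result register: keeps the old result until then

  void start(uint16_t value_being_measured, int conversion_ticks) {
    pending    = value_being_measured;
    ticks_left = conversion_ticks;
  }
  void tick() {
    if (ticks_left > 0 && --ticks_left == 0) data = pending;
  }
  bool busy() const { return ticks_left > 0; }
};

// Stores the result and returns true only if the conversion has finished.
bool read_if_ready(const FakeAdc& adc, uint16_t& out) {
  if (adc.busy()) return false;
  out = adc.data;
  return true;
}

// Usage: reading too early is refused instead of returning a stale value.
void demo_adc_ready_check() {
  FakeAdc adc;
  uint16_t value = 0;
  adc.start(975, 3);
  adc.tick();
  assert(!read_if_ready(adc, value));  // conversion still running
  adc.tick();
  adc.tick();
  assert(read_if_ready(adc, value));   // conversion finished
  assert(value == 975);
}
```

A caller that gets `false` could skip this sample and retry on the next pass rather than adding a stale value to the accumulated sum.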

Sebastianv650 commented 7 years ago

I just claimed I've been printing without errors for weeks. Guess what happened to my latest print: a MINTEMP error.

And I have to say there is in fact a way to re-enter the temperature ISR. I wasn't seeing it before. While Marlin is inside the temperature ISR, the stepper ISR is enabled. If a stepper event happens now, Marlin will proceed with the stepper ISR. At the end of the stepper ISR, the temperature ISR gets enabled again. While Marlin processes the rest of the temperature ISR, it is now vulnerable to a second ISR call. The opposite direction, re-entry of the stepper ISR, shouldn't be possible, as we are not re-enabling that ISR inside another ISR.

I applied the changes @Blue-Marlin wrote above to my local Marlin copy and am printing the part a second time now. I'm sorry: if that's the cause, all the temperature errors, and maybe also the freezing issues, are my fault!

Blue-Marlin commented 7 years ago

@Sebastianv650 If we have the information whether we are in the temperature interrupt (bool in_temp_isr), we can restore the correct state in the stepper interrupt. https://github.com/AnHardt/Marlin/pull/74
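My reading of this idea, as a host-side model: the stepper ISR re-enables the temperature interrupt on exit only if the temperature ISR is not currently running. This is a sketch under assumptions and not the content of the linked PR, which I am not reproducing here. All names are hypothetical except `in_temp_isr`, and `temp_isr_enabled` stands in for the OCIE0B bit in TIMSK0.

```cpp
#include <cassert>

// Host-side model of the interrupt-enable state (hypothetical names).
struct IsrState {
  bool temp_isr_enabled = true;   // stands in for OCIE0B in TIMSK0
  bool in_temp_isr      = false;  // flag from the patch earlier in this thread
};

// Temperature ISR masks itself on entry and unmasks itself on exit.
void temp_isr_enter(IsrState& s) { s.in_temp_isr = true;  s.temp_isr_enabled = false; }
void temp_isr_exit(IsrState& s)  { s.in_temp_isr = false; s.temp_isr_enabled = true; }

// Behaviour described above: always re-enable the temperature ISR on exit.
void stepper_isr_exit_unconditional(IsrState& s) { s.temp_isr_enabled = true; }

// Variant using the flag: leave the temperature ISR masked while it is running.
void stepper_isr_exit_guarded(IsrState& s) {
  if (!s.in_temp_isr) s.temp_isr_enabled = true;
}

// True if a second temperature interrupt could fire while the first is still running.
bool reentry_possible(const IsrState& s) { return s.in_temp_isr && s.temp_isr_enabled; }

// Usage: a stepper event arrives while the temperature ISR is running.
void demo_conditional_restore() {
  IsrState a;
  temp_isr_enter(a);
  stepper_isr_exit_unconditional(a);
  assert(reentry_possible(a));      // temperature ISR can now interrupt itself

  IsrState b;
  temp_isr_enter(b);
  stepper_isr_exit_guarded(b);
  assert(!reentry_possible(b));     // still masked until the temperature ISR exits
  temp_isr_exit(b);
  assert(b.temp_isr_enabled);       // normal operation resumes
}
```

The same condition would have to be applied at every place that re-enables the temperature interrupt, otherwise one unconditional restore is enough to reopen the window.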

Sebastianv650 commented 7 years ago

I see you already have an even better version. We only have to keep an eye on all the ISR re-enable sections. There is one inside the ISR handler with advance enabled, but also 2 or 3 inside the stepper ISR if advance is disabled. And I would like to add another cli() at the end of the temperature ISR before re-enabling the temperature ISR again. Do you want to update your PR or should I implement your improvements into the one I just created?

Edit: Just realised it's @AnHardt's PR. I will update my PR just to have the changes in one place, but feel free to close it and use AnHardt's if you want.

Blue-Marlin commented 7 years ago

Got some calls keeping me busy at least the next 12 hours. So please do it yourself - you got the idea.

thinkyhead commented 7 years ago

This may be fixed by #5829
