MarlinFirmware / Marlin

Marlin is an optimized firmware for RepRap 3D printers based on the Arduino platform. Many commercial 3D printers come with Marlin installed. Check with your vendor if you need source code for your specific machine.
https://marlinfw.org
GNU General Public License v3.0
16.28k stars 19.24k forks source link

[bugfix-1.1.x] axis homing halts #11224

Closed Murloc992 closed 6 years ago

Murloc992 commented 6 years ago

Description

I am trying to setup marlin bugfix-1.1.x on a Trigorilla 1.3 (not 32bit) board.

My X-Axis doesn't want to home no matter what I change. Endstop triggers are very fine. It feels like X axis somehow gets limited on the "MIN" side.

Other axis seems to be fine.

Before I disabled endstop validation printer would just halt itself on X homing.

Steps to Reproduce

  1. Manually move X axis to the maximum possible
  2. Send a G28 or just G28 X0
  3. Observe as X axis does a "ghost" endstop movement in a random spot of the axis, endstop is OPEN
  4. Send multiple G28 (or X0) commands till it homes properly
  5. Endstop is now TRIGGERED

Expected behavior: It should home on X axis without ghost endstops.

Actual behavior: What I mentioned in the steps part.

Additional Information

This includes configuration.h configuration_adv.h and pins_trigorilla_13.h: Marlin.zip Video: https://www.youtube.com/watch?v=jAUTN1cb_n8

GMagician commented 6 years ago

Maybe you have unfiltered input and some electrical noise will send fake results. You may try to solve enabling ENDSTOP_NOISE_FILTER in configuration.h. If it solve you are in such situation and your hardware is weak to electrical noise. Solution is proposed directly in cnf file just above option. If you can't solve in hardware you must live with this option and all negative effects it has.

Don't know which endstop you have, if simple switch, you can try to connect com to "signal", NC to +5V and NO to GND, enable ENDSTOPPULLUP_XM??, but this depends on your knowledges. All these HW operations are intended to be done only if you know what you are doing, you may destroy your hardware.

EDIT: with last suggestion you also have to set to false als X_M??_ENDSTOP_INVERTING

Murloc992 commented 6 years ago

My endstops are basic switches. I will try this option. The thing is, other endstops are the same for Y and dual Z endstops and they work perfectly fine.

This is how they look like: 2018-07-08 21 27 40

Pins are soldered, from the point of reference they are C NO NC. Edit1:They are infact soldered as you are saying me to do. I will see what pullups do. Edit2:They are enabled in my configuration..

There's also a LED connected to COM with a resistor.

Murloc992 commented 6 years ago

The noise filter has helped I think. Yet I don't understand what has changed in bugfix-1.1.x to have caused my X endstop to not work anymore?

When I flash older firmware I have (https://github.com/derhopp/Marlin-with-Anycubic-i3-Mega-TFT/tree/bugfix-1.1.x) it works without a budge with identical configuration file.

When I flash newer firmware back it does "ghost" endstop only on X...

GMagician commented 6 years ago

New firmware have changed how endstop detection works. Now is less immune to noise but was done this way to have a more precise and repetitive home position. Looking at image I think I see a led and a resistor (used to limit led current). No capacitor to filter noise.

Maybe X is noisy because of where cables run along your printer.

Murloc992 commented 6 years ago

So basically only way to solve this would be to add 100nF capacitor to the endstop?

GMagician commented 6 years ago

For sure capacitor will filter noise. I think that your board doesn't have any (trigorilla I think is a ramps derived board).

Missing capacitor is a hardware design fault. If option has solved issue that capacitor will do the same (in a better way)

GMagician commented 6 years ago

Note that you have inverting endstops to true so I think your board will close to 5V when micro is pressed. My solution was to close to GND

Murloc992 commented 6 years ago

My endstop wiring is like this now, looking at the PCB: S goes to COM G goes to NO V goes through the LED and COM

GMagician commented 6 years ago

Don't know how led is connected but maybe it's not good to change connection

Murloc992 commented 6 years ago

LED is connected between V and S. Would a 10uF capacitor do for a test or is it too much?

GMagician commented 6 years ago

ok, so it's really as I told you.. when micro is pressed it close to GND and led will turn on.

I'm not smart in electronic but I think the bigger the capacitor is and stronger the filter will be. Too strong is not good, too few wil not filter enough. There is a formula to find out filtered frequency.

Murloc992 commented 6 years ago

I looked into the wiring. X is the only one connected differently from the others.. It doesn't have the LED hooked up to anything. I bet it's one of the culprits. I might just try to guide a "hack" wire and adjust the connector to make it work like the others.

Murloc992 commented 6 years ago

I just added a 10uF capacitor for a test and it works. So if somebody could close this it would be nice. Huge thanks @GMagician !

GMagician commented 6 years ago

You may close it by pressing butto below

Murloc992 commented 6 years ago

Woops :)

Murloc992 commented 6 years ago

No, adding 100nF capacitors does not fix the issue. It still halts.

I know the soldering job looks a bit shoddy, but multimeter says everything is as it should be.

2018-07-10 18 21 35

Murloc992 commented 6 years ago

In fact it's only dual Z endstops that fail. They do hit once, then slowly twice and printer halts itself.

GMagician commented 6 years ago

Ok, just to be sure, now what happens? home should be X, Y and then Z. Before was X failing now is Z that fails?

Murloc992 commented 6 years ago

X homes, hits endstop twice, this is fine. Y homes, hits endstop twice too, also fine. Z homes, hits endstops, rises a bit, hits endstops again slowly. This is normal behavior with old firmware but this time it just fails when it triggers second time.

I tried to "delevel" the Z axis to see what happens, but it levels and homes perfectly, just that it halts.

M119 shows correct number of endstops(X, Y, Z, Z2, Probe) and all function properly.

I am at a loss.

GMagician commented 6 years ago

What do you means with

hits endstops again slowly. This is normal behavior with old firmware but this time it just fails when it triggers second time.

does/doesn't it stops? or just it doesn't rise again?

Do you still have filter option enabled?

Edit: also you may try to compile with DEBUG_LEVELING_FEATURE option and before homing call M111 S255 this way we well have a debug info to analize

Murloc992 commented 6 years ago

Filter is disabled. It homes Z normally, it hits one time normal speed, one time slow speed and stops. After it stops it halts.

GMagician commented 6 years ago

Ok let's try with debug option and M111

Murloc992 commented 6 years ago

M111 S38 SENDING:M111 S38 echo:DEBUG:INFO,ERRORS echo:busy: processing <<< homing Z happens here Error:Printer halted. kill() called! [ERROR] Error:Printer halted. kill() called!

This is all I get.

GMagician commented 6 years ago

s255 is more complete

Murloc992 commented 6 years ago

It printed me probably whole memory map of the board and then homed Z and halted. No extra info.

All I know is that it fails on endstops.cpp line 233 Endstops::validate_homing_move method.

yagona commented 6 years ago

I also have a modified ANYCUBIC i3 Mega. (I replaced, e.g., all stepper drivers with TMC2208.) As long as I used derhopp's fork of Marlin 1.1.8 or bugfix-1.1.x everything was fine. Recently I merged derhopps's bugfix-1.1.x with the latest Marlin bugfix-1.1.x causing exactly the same problem with z homing as described by Murloc992. I also get the kill message as stated above. However, somehow I could get M111 S38 working with the following result:

LESEN: TFT Serial Command: A21 Z LESEN: echo:enqueueing "G28 Z" enqueueing "G28 Z" LESEN: >>> G28 LESEN: Machine Type: Cartesian LESEN: Probe: NONE LESEN: current_position=(-5.00, 0.00, 5.00) : setup_for_endstop_or_probe_move LESEN: > endstops.enable(true) LESEN: >>> homeaxis(Z) LESEN: Home 1 Fast: LESEN: >>> do_homing_move(Z, -307.50, [4.00]) LESEN: current_position=(-5.00, 0.00, 0.00) : sync_plan_position ... LESEN: <<< do_homing_move(Z) LESEN: Move Away: LESEN: >>> do_homing_move(Z, 2.00, [4.00]) LESEN: current_position=(-5.00, 0.00, 0.00) : sync_plan_position LESEN: echo:busy: processing busy: processing LESEN: <<< do_homing_move(Z) LESEN: Home 2 Slow: LESEN: >>> do_homing_move(Z, -4.00, 1.00) LESEN: current_position=(-5.00, 0.00, 0.00) : sync_plan_position ... LESEN: <<< do_homing_move(Z) LESEN: >>> do_homing_move(Z, 0.00, [4.00]) LESEN: current_position=(-5.00, 0.00, 0.00) : sync_plan_position LESEN: Error:Printer halted. kill() called! LESEN: TFT Serial Debug: Kill command... J11

GMagician commented 6 years ago

@yagona in your configuration do you have Z_DUAL_ENDSTOPS_ADJUSTMENT equal to 0? @yagona and @Murloc992 try to set a different value for such parameter.

@thinkyhead and @ejtagle I think that in some situation like this, after home is done, hit_on_purpose will clear hit_state variable and on dual stepper systems, with no offset between 2 steppers, do_homing_move will wrongly test at end if micro are hits and they are not (flag cleared before and not enough time elapsed to set them again in temperature isr). This will kill the system.

GMagician commented 6 years ago

@yagona and @Murloc992 you can try also adding endstops.update call just before endstops.validate_homing_move call in do_homing_move function. This to temporary patch issue

Murloc992 commented 6 years ago

@GMagician Z_DUAL_ENDSTOPS_ADJUSTMENT is 0. I have tried to add endstops.update() call before line 3020 in marlin_main but nothing really changed.

What should I try with the adjustment? Positive or negative values?

yagona commented 6 years ago

@GMagician Yes, I had Z_DUAL_ENDSTOPS_ADJUSTMENT at 0. Now, I changed to 0.025 by "M666 Z0.025". This seems to solve the issue. Output is now:

LESEN: TFT Serial Command: A21 Z LESEN: echo:enqueueing "G28 Z" enqueueing "G28 Z" LESEN: >>> G28 LESEN: Machine Type: Cartesian LESEN: Probe: NONE LESEN: current_position=(0.00, 0.00, 30.00) : setup_for_endstop_or_probe_move LESEN: > endstops.enable(true) LESEN: >>> homeaxis(Z) LESEN: Home 1 Fast: LESEN: >>> do_homing_move(Z, -307.50, [4.00]) LESEN: current_position=(0.00, 0.00, 0.00) : sync_plan_position ... LESEN: <<< do_homing_move(Z) LESEN: Move Away: LESEN: >>> do_homing_move(Z, 2.00, [4.00]) LESEN: current_position=(0.00, 0.00, 0.00) : sync_plan_position ... LESEN: <<< do_homing_move(Z) LESEN: Home 2 Slow: LESEN: >>> do_homing_move(Z, -4.00, 1.00) LESEN: current_position=(0.00, 0.00, 0.00) : sync_plan_position ... LESEN: <<< do_homing_move(Z) LESEN: >>> do_homing_move(Z, 0.03, [4.00]) LESEN: current_position=(0.00, 0.00, 0.00) : sync_plan_position LESEN: <<< do_homing_move(Z) LESEN: >>> set_axis_is_at_home(Z) LESEN: For Z axis: LESEN: home_offset = 0.00 LESEN: position_shift = 0.00 LESEN: soft_endstop_min = 0.00 LESEN: soft_endstop_max = 205.00 LESEN: > home_offset[Z] = 0.00 LESEN: current_position=(0.00, 0.00, 0.00) : LESEN: <<< set_axis_is_at_home(Z) LESEN: current_position=(0.00, 0.00, 0.00) : sync_plan_position LESEN: current_position=(0.00, 0.00, 0.00) : > AFTER set_axis_is_at_home LESEN: <<< homeaxis(Z) LESEN: current_position=(0.00, 0.00, 0.00) : sync_plan_position LESEN: current_position=(0.00, 0.00, 0.00) : clean_up_after_endstop_or_probe_move LESEN: X:0.00 Y:0.00 Z:0.00 E:0.00 Count X:0 Y:0 Z:0 LESEN: <<< G28

Murloc992 commented 6 years ago

@GMagician @yagona interestingly only M666 worked. This solves the issue I guess? Kind of.

GMagician commented 6 years ago

@Murloc992 value should be offset to align two steppers. But I'm wondering if micro is pressed when such command is executed

GMagician commented 6 years ago

@Murloc992 this is only a temporary patch. it's unusual but M666 Z0 could be a possible value

Murloc992 commented 6 years ago

@GMagician both switches get pressed. I will look at this and see if any solution is proposed. :)

GMagician commented 6 years ago

theoretically changing Z_DUAL_ENDSTOPS_ADJUSTMENT will move an axis and this will let endstops.update to be called. calling it before endstops.validate_homing_move should temporary fix as well, but it seems not. Maybe something is escaping from my check

yagona commented 6 years ago

@GMagician I did some further testing according to your suggestions. The results are:

GMagician commented 6 years ago

ok, thanks, for the moment you have a solution (I hope you really need z adjustment to align Z1 with Z2). Then I hope someone with more knowledge will try to solve this

yagona commented 6 years ago

@GMagician Thanks a lot for your help! Your suggestions were very useful to identify the probable cause of this issue.

I'd like to add that "M666 Z0.001" also suppresses the issue. This should be well below the resolution of the microswitches.

Murloc992 commented 6 years ago

M666 Z0.001 saves itself as Z0.00. It works fine. Maybe it's related to this "floating maths" fix?

GMagician commented 6 years ago

Don't know but for sure also 0 should work. There is something somewhere that doesn't detect endstop status correctly and this will raise error. Changing value will put a movement in planner and there endstop are updated. Don't know more, maybe I'll try to investigate deeper but this goes beyond my limits

GMagician commented 6 years ago

@Murloc992 found something. When stepper are not moving Endstops.update doesn't update anything so basically doesn't workl

ejtagle commented 6 years ago

As stated by @GMagician , M666 schedules a new movement to the queue. I don´t have a Delta, so i can´t test this, but, does M666 Z0 change the homing sequence at all ?

Endstops are updated 1000 times a second in that configuration (because there is no interrupt on endstop feature enabled) - So adding an extra movement should not make a difference

Your description of the issue is a little bit vague: Do you mean the printer halts and needs a reset to resume operation ? (a kill operation ?)

GMagician commented 6 years ago

@ejtagle kill means it completely stop working with message "printer halted". When movement is queued an endstops.update si called in planner and this will set hit_state (micro are still pressed). After do_move hit_state is tested and if 0 kill

GMagician commented 6 years ago

@ejtagle this is sequence that lead to issues:

As told before by changing Z value with M666 if will put something into planner and since planner call endstops.update here 'hit_state' is updated and set != 0. Once endstops.validate_homing_move is called it doesn't fail. I think that this issues is also present for all ?_DUAL when offset is 0

ejtagle commented 6 years ago

The only way this could happen is if between the call to endstops.hit_on_purpose() and the call of endstops.validate_homing_move() less than 1 millisecond (and thus, not temperature ISR that updates the endstops) happen. Adding a safedelay() call of about 5 milliseconds insde validate_homing_move() should fix this issue then. But not when using ENDSTOPS_INTERRUPTS ... As the endstop state is only updated on changes, and no changes are expected if no movements are done. I maybe the logic is somehow flawed, as originally validate_homing_move() should only be called if there was a move!

GMagician commented 6 years ago

The only way this could happen is if between the call to endstops.hit_on_purpose() and the call of endstops.validate_homing_move()

I think this may be highly probable since no movement are involved, code may take less than 1ms to reach failing call

I maybe the logic is somehow flawed, as originally validate_homing_move() should only be called if there was a move!

const bool is_home_dir = (axis_home_dir > 0) == (distance > 0)

when distance is 0 may be also true and this will call failing function... it should be enough to not set this variable and everything will be ok... In such situation maybe also bl-bouch could be lowerd and this is not good

ejtagle commented 6 years ago

The planner can optatively not schedule a move... :

  // Bail if this is a zero-length block
  if (block->step_event_count < MIN_STEPS_PER_SEGMENT) return false;

Also, the call to stepper.quick_stop() can cause some problems... as that call just sets a flag to abort on the next step ISR. So, a small delay (1mS) could be needed.

GMagician commented 6 years ago

Also, the call to stepper.quick_stop() can cause some problems... as that call just sets a flag to abort on the next step ISR. So, a small delay (1mS) could be needed.

or not to call endstops.validate_homing_move at all (as you said). Note that as stated above, calling do_homing_move with distance = 0 will also deploy probe and this is not good. In my opinion to solve this should be enough:

const bool is_home_dir = (axis_home_dir > 0) == (distance > 0) && distance

this will solve interrupt case, distance = 0 case and all other

Edit: merging another pull request of mine bl-touch is already fixed even without fixing this

ejtagle commented 6 years ago

@GMagician : I remember when @thinkyhead added the validate_homing_move() call. It was to prevent the issue that when an endstop was not triggered, the homing procedure continued and that was a dangerous behaviour for deltas. If it was for me, i´d add an extra parameter fo the function, but, also changing the condition is fine. Please, remember that distance is a float, so comparing with 0 will usually lead to errors.

I´d add the check at the start of the function that ensures if distance is below a certain limit, either positive or negative, just do nothing:

if (UNEAR_ZERO(ABS(distance))) return;
GMagician commented 6 years ago

I remember when @thinkyhead added the validate_homing_move() call. It was to prevent the issue that when an endstop was not triggered,

thats ok, but this code raise a "printer kill" and is not also good.

distance is a float, so comparing with 0 will usually lead to errors.

That's true when value is derived from computation, 0 is a well defined value for FP IEEE so if it comes from a #define then match will always succeed

Also test you made may lead to error since it should test something abs "below" MIN_STEPS_PER_SEGMENT converted to axis unit