MarlinFirmware / Marlin

Marlin is an optimized firmware for RepRap 3D printers based on the Arduino platform. Many commercial 3D printers come with Marlin installed. Check with your vendor if you need source code for your specific machine.
https://marlinfw.org
GNU General Public License v3.0
15.97k stars 19.09k forks source link

[bugfix-2.0.x] G28 Or G28 X crashes X axis if the end-stop is in a triggerd state when homing #12366

Closed GhostlyCrowd closed 5 years ago

GhostlyCrowd commented 5 years ago

Description

Steps to Reproduce

G28 Or G28 X crashes X axis if the end-stop is in a triggerd state when homing

  1. Home the X axis, with G28, for example my print end script does this to move the tool away from the print.

  2. Send a G28 - I do this before every print.

Expected behavior: The Xaxis should detect the X endstop is in a triggerd state already and bump the end-stop, then Y then Z should probe.

Actual behavior: The X endstop does not bump it just tries to move the axis continuously into the already triggered endstop untill you send an emergency stop.

Additional Information

This is the third endstop I've tried. The endstop always works as long as it does not start homing while its in a triggered state. This is 100% reproduce-able every time.

HOWEVER If you shut the printer off while the X endstop is triggered, turn it back on and Home. It works as it should. This only happens after its homed at least once already.

Below is a video Showing it.

Please ignor the slight skipped steps on the X, Ive reduced the current dramatically to the bare minimum to move the axis so i dont chew up my belts showing this issue.

The video shows me homing the printer with G28. Then Homing the X axis with G28 X which parks the axis on the endstop leaving it in a triggered state. I then send G28 to show that instead of detecting its triggering and bumping the endstop it drives it into the end stop crashing the axis until I send M112.

https://youtu.be/4oZtjUbzFqA

thinkyhead commented 5 years ago

Are you able to narrow down where in the code the error lies?

GhostlyCrowd commented 5 years ago

I wish i could @thinkyhead I am by no means a coder, I am very much a mooch of Marlin. I can tell you it happened After September 17th as that was my previous Pull. I try to move forward with the bugfix every couple of weeks so i can stay current-ish if i have to report a bug.

thinkyhead commented 5 years ago

Please ZIP your configurations and attach them to a reply, as requested in the issue template.

GhostlyCrowd commented 5 years ago

Marlin_config.zip Here you are.

thinkyhead commented 5 years ago

Meanwhile, give this version (October 17) a test and see if the issue still exists: https://github.com/MarlinFirmware/Marlin/archive/c36773bffba9f5300764238a06a7cb9e04c315da.zip

GMagician commented 5 years ago

@thinkyhead

void Endstops::enable(const bool onoff) {
  enabled = onoff;

  #if ENABLED(ENDSTOP_INTERRUPTS_FEATURE)
    update();
  #endif
}

I think this must be done also when no interrupt enabled because my last modification removed endstops status after home and they are not updated refreshed anymore when re-enabled (I use interrupt and it works fine)

GMagician commented 5 years ago

@GhostlyCrowd try to remove #if #endif lines in endstops.cpp fuction enable

Edit: but keep update line

GMagician commented 5 years ago

@thinkyhead even if this code:

      // At this point, we must ensure the movement about to execute isn't
      // trying to force the head against a limit switch. If using interrupt-
      // driven change detection, and already against a limit then no call to
      // the endstop_triggered method will be done and the movement will be
      // done against the endstop. So, check the limits here: If the movement
      // is against the limits, the block will be marked as to be killed, and
      // on the next call to this ISR, will be discarded.
      endstops.update();

should act the same...

Edit: this may work delayed because of filter threashold...

GhostlyCrowd commented 5 years ago

@thinkyhead

that build Does indeed act properly with my end stops

Yes i had to disable the endstop inttrupts because i have TMC2130's and for some reason with the latest TMC library it has conflicts with endstop inturrupts. There was an issue posted about it but only suggested disabling endstop inturrupts was the solution.

Edit, I am trying the code modification to endstops.cpp now

thinkyhead commented 5 years ago

@GMagician might have the right idea. Try this one: https://github.com/thinkyhead/Marlin/archive/f96be3c3dd36a13d9de3aae0f831bc604a1c2b73.zip

If that works, I'll submit a patch shortly.

GhostlyCrowd commented 5 years ago

@thinkyhead I tried what you suggested, it did not work.

I downloaded and compiled what @GMagician has given you to give to me, it also did not work.

Still same issue.

The October 17th pull you sent me DID work properly.

thinkyhead commented 5 years ago

Ok, since the version from Oct 17 worked, the change must have been after that. Please try this one, from October 28: https://github.com/MarlinFirmware/Marlin/archive/5ead0269676b80292a360e5b132204ebecdc40a5.zip

GhostlyCrowd commented 5 years ago

@thinkyhead October 28th Is broken, So something between October 17th and October 28th

GMagician commented 5 years ago

@GhostlyCrowd and @thinkyhead I think my solution doesn't work because of ENDSTOP_NOISE_THRESHOLD. Update must be called I think ENDSTOP_NOISE_THRESHOLD+1 times before it really updates endstops status

GMagician commented 5 years ago

@GhostlyCrowd you may try this to confirm my suspicion. Inside Enable function write:

update();
endstop_poll_count=1;
update();

this is no a solution but if it works then that's the way...

GhostlyCrowd commented 5 years ago

@GMagician I cannopt get it to compil adding

the code you suggested to endstop.cpp

Marlin\src\module\endstops.cpp: In static member function 'static void Endstops::enable(bool)':
Marlin\src\module\endstops.cpp:273:8: error: 'Update' was not declared in this scope
Update();
^
GMagician commented 5 years ago

@GhostlyCrowd sorry I mis typed is all lowercase

GhostlyCrowd commented 5 years ago

@GMagician That did indeed fix the problem

Is this a dirty work around or is it the fix?

GMagician commented 5 years ago

That just prove my doubt and works only for configurations with ENDSTOP_NOISE_THRESHOLD enabled, this can't ever be called 'dirty work around' it's worst than that ;-)

GhostlyCrowd commented 5 years ago

@GMagician Ahhh i understand, so whats the underlying issue you suspect?

GMagician commented 5 years ago

The problem is that to let machine works after a home, endstops status is cleared. In code when a new movement is inserted in planner there is a call to update endstops status but works only for "interrupt" mode, when you have ENDSTOP_NOISE_THRESHOLD, update must be called ENDSTOP_NOISE_THRESHOLD + 1 times to really update endstops but this call is done in temperature polling routine (1KHz) but it's too slow and will refresh endstop status too late.

My 'dirty' code speeds things up but I think that some different solution must be found.

Since this "delay" is done to filter noise is not either correct to update status like I did...maybe the enable call should wait all debounce time before move over (but in planner can't be done) but it's also true that if update is always called when endstops are enabled maybe is not needed anymore in planner.

GhostlyCrowd commented 5 years ago

makes sense why i didnt notice this until i had to disable the endstop interrupts because of the TMC library conflicting with it for some odd reason.

GMagician commented 5 years ago

@thinkyhead is it "legal" to replace update with something like forceupdate that wait until endstop_poll_count is 0? eg:

FORCE_INLINE void forceupdate() {
  #if ENDSTOP_NOISE_THRESHOLD
    do { update(); } while (endstop_poll_count);
  #else
    update();
  #endif
}

and remove update call inside planner? I think that this last call is not needed because of "always" up to date endstops status when they they are enabled (even with interrupts)

GhostlyCrowd commented 5 years ago

Ok so @GMagician your work around seemed to work for homing while the end stop is triggered but it has an adverse effect in the fat that now when my print finishes and it tries to park the X axis with G28 X its not seeing the endstop trigger and its crashing after the print now.

Edit: I'm doing another fast print now to verify this.

GMagician commented 5 years ago

@GhostlyCrowd that should not affect G28.

GhostlyCrowd commented 5 years ago

@GMagician You're correct I forgot to increase my stepper current after doing all these tests and it was the sound of it losing steps when bumping. Just did another print to verify.

GMagician commented 5 years ago

@GhostlyCrowd try replacing update call in enable with code proposed for forceupdate. I'm not sure such blocking code can be done without a safedelay? call

GhostlyCrowd commented 5 years ago

@GMagician I can do this tomorrow morning for you to test. I have left my workshop for the day.

thinkyhead commented 5 years ago

is it "legal" to replace update with something like forceupdate that wait until endstop_poll_count is 0?

I think this is starting to look like a hack to work around something fundamentally broken by the earlier update. @ejtagle worked hard to get the endstops checking correct, and the recent change didn't take into account the theory of its mechanism. I would recommend you revert back to before the change was made and reconsider what you were trying to accomplish with the recent change.

ejtagle commented 5 years ago

@thinkyhead ... you are right. We always (sometimes me too!) fail to get the whole picture here. The problem stated exists, but the fix must be done at a higher level - The stepper and the endstops class is not the problem.

How does this thing work ? 1) If ENDSTOPS_INTERRUPTs is disabled: -You just get polled endstops every 1khz - If there is NOISE FILTER enabled, then you can get a variable delay between switch press and actually taking it into account, otherwise actions are taken inmediately 2) If ENDSTOPS_INTERRUPTs is enabled, and no filtering, then the switchs are read and action is taken inmediately 3) If ENDSTOPS_INTERRUPTs is enabled, and filtering,, then each interrupts triggers periodic polling (1khz) until values settle down. Then polling ends.

What makes things more complex is that polling or updating state is only done when the endstops are enabled.... That is what is usually causing issues, as endstops, unless ALWAYS_ENABLED, will not update on changes

thinkyhead commented 5 years ago

Right, so… Instead of a simple enabled state, should we make it have a state 2 … where the endstops are being primed, and a state 1 … where the endstops are now being read for actual use?

ejtagle commented 5 years ago

We could try to keep just the state always updated, but not the actions. That should be easy

thinkyhead commented 5 years ago

That was how I previously tried to implement it. So if it's not that way now, it's a regression.

GMagician commented 5 years ago

@ejtagle is it legal to havew interrupt and filter? If I wants interrupt is because I want fast response if I have noise no interrupt is logic to me...

GMagician commented 5 years ago

@thinkyhead and @ejtagle we may also keep endstop always working when ENDSTOP_NOISE_THRESHOLD is enabled (this is the only situation where status is delayed) but not copy live_state to validated_live_state until endstops are enabled

ejtagle commented 5 years ago

@GMagician : Yes, it is legal to have interrupts with noise filtering. You can think it as a timer. The interrupt is used to start the timer that polls the endstops. And the timer autoshutdowns when the endstops are stable

@thinkyhead : Most of the issues with the endstops are caused, by, let´s call it, edge triggered actions. So, if updates are disabled (abort_enabled() returning false), then update() does not execute- There is a risk of losing edges (transitions from non triggered-triggered-non triggered) while disabled. (usually this is not a problem at all)

The other thing that sometimes has caused problems is that there is a delay between the endstop being triggered and Marlin considering it as triggered. This small delay can cause race conditions if movements are too fast, or too short.

This specially holds true when enabling endstops, as there must be some time before endstops stabilize to their correct values!

What is the possible fix for the G28 command ? - Well, i think the best approach is to slightly modify the flow:

When an G28 command is executed, the first step should be to enable endstops, then perform an small wait (100mS) to allow the noise filter to stabilize, and THEN start the homing process. Alternatively, do a 10mS delay, then wait until endstop_poll_count equals 0.

That is probably the best approach. And 100mS of extra delay when homing is innoticeable! :+1:

thinkyhead commented 5 years ago

So, if updates are disabled (abort_enabled() returning false), then update() does not execute

This is not so, at least not before the recent change. Temperature::isr calls poll() and (when ENDSTOP_INTERRUPTS_FEATURE is disabled) poll() always calls update(). And if ENDSTOP_NOISE_THRESHOLD is set then update() continues forward… Once the validated_live_state is tested for, then update() will return without triggering anything unless abort_enabled() is true.

thinkyhead commented 5 years ago

It seems to me that the testing of abort_enabled() at the very start of update() should not be there. The collection of the endstop state should be allowed to proceed no matter what. And then below, where it currently only does logic for ENDSTOP_NOISE_THRESHOLD it should also have a block to do logic when ENDSTOP_NOISE_THRESHOLD is not set. Specifically, just updating the live_state.

ejtagle commented 5 years ago

@thinkyhead : I think you are right. State should always be kept. The actions is what we disable

ejtagle commented 5 years ago

That should remove the processing delays

thinkyhead commented 5 years ago

The downside is that it means the Temperature ISR will always have the added overhead of checking all the endstop pins.

ejtagle commented 5 years ago

If you use the interrupt on change feature, update should ONLY be called when there is a change. If you are not using that feature, you get continuous polling, so you are right...

I see your point. The other choice would be to keep it as it is right now, but add the delays i suggested everytime endstops are enabled

This delay could be added to endstops::enable() and endstops::enable_locally() and document them as "stabilization delays"

thinkyhead commented 5 years ago

The forceupdate approach also makes sense… Keep calling update() until the polling count (if existing) goes down to 0. Then we know the endstop state is confirmed and recorded.

thinkyhead commented 5 years ago

if you are not using that feature, you get continuous polling,

Currently you would only get continuous polling when using the threshold.

ejtagle commented 5 years ago

Basically yes. There is something strange here... the update() call should be cancelling any movement against endstops...

thinkyhead commented 5 years ago

Perhaps the forceupdate only needs to be called when turning endstops/probe off. …and when the threshold is in use.

ejtagle commented 5 years ago

Maybe i know what is going on here... update() is only called when a change of endstops happen (if ENDSTOPS_INTERRUPTS) enabled. And, if a new movement is queued, update() will be called ONCE from the stepper ISR (because the endstop was already pressed and nothing is resetting the endstop_poll_count) - This is not enough due to the FILTER_THRESHOLD, that requires more calls in order to recognize the endstops as not triggered. ... But again, the movement should never be against the endstops... unless.. Let´s assume endstops were disabled... in that case, the update() call from the stepper is not enough to cancel the movement...

thinkyhead commented 5 years ago

True! So, have we confirmed the problem only occurs with ENDSTOP_INTERRUPTS?

thinkyhead commented 5 years ago

Oh, right. I see that the update call from stepper is only done when first fetching the block.

thinkyhead commented 5 years ago

So, do we make an exception for the call from stepper? i.e., Set the polling count to 1 before the call so that only one sample is needed before it's considered confirmed?