LinuxCNC / linuxcnc

LinuxCNC controls CNC machines. It can drive milling machines, lathes, 3d printers, laser cutters, plasma cutters, robot arms, hexapods, and more.
http://linuxcnc.org/
GNU General Public License v2.0
1.73k stars 1.13k forks source link

Broken homing error handling! If one joint fails to home the other joints will continue anyways. #2983

Open JimmyRigIt opened 1 month ago

JimmyRigIt commented 1 month ago

Here are the steps I follow to reproduce the issue:

My machines, z homes first. Then x any y together. Noticed this issue when my z stepper wire broke.

  1. Unplug the stepper wire going to the joint which homes first. In my case z axis
  2. Home the machine.
  3. Linuxcnc will throw an error saying it failed to home z but.....
  4. X and y axis will start homing after the error shows up causing the head to crash.

This is what I expected to happen:

Z joint tries to home, fails, pops up a warning saying z joint failed to home. Machine then stops.

Their should be an option in ini file to disable this function for those without homing switches.

This is what happened instead:

Machine crashes head cause the z axis was not moved up before x and y due to broken stepper motor wire

It worked properly before this:

Don't think this worked (If the behavior changed after making a particular change in hardware or software, describe the change you think is responsible. E.g., "after upgrading from LinuxCNC 2.7.3 to 2.7.4")

Information about my hardware and software:

Seems to be an issue across linuxcnc. Tried it on a few machines and was able to replicate. Other users are reporting the same issue. https://forum.linuxcnc.org/plasmac/48236-qtplasmac-2-10-arc-ok-thcad-thc-and-thcad-ohmic-troubleshooting-help-needed?start=40

JimmyRigIt commented 1 month ago

Might have found the issue.

in the homing.c file the "home_abort" only aborts the current joint not all of them.

here is the code throwing the error on the screen when a joint fails to home, think it calls home abort for the joint

/ check for reached end of move / if (! (&joints[jno])->free_tp.active) { / reached end of move without hitting switch / (&joints[jno])->free_tp.enable = 0; rtapi_print_msg(RTAPI_MSGERR,("j%d end of move in home state %d"),jno, H[jno].home_state); H[jno].home_state = HOME_ABORT; return 1; // abort reqd }

However the home abort case only does the current joint not all of them.

case HOME_ABORT: H[joint_num].homing = 0; H[joint_num].homed = 0; H[joint_num].joint_in_sequence = 0; joint->free_tp.enable = 0; H[joint_num].home_state = HOME_IDLE; H[joint_num].index_enable = 0; immediate_state = 1; break;

simple loop might fix the issue??? hope im understanding this correctly. e.g.

case HOME_ABORT: for (int i = 0; i < all_joints; i++) { H.homing = 0; H.homed = 0; H.joint_in_sequence = 0; H.home_state = HOME_IDLE; H.index_enable = 0; joints.free_tp.enable = 0; } immediate_state = 1; break;

petterreinholdtsen commented 1 month ago

[JimmyRigIt]

Might have found the issue.

in the homing.c file the "home_abort" only aborts the current joint not all of them.

Thank you for looking into this.

What do you believe would be the correct action, in this and the generic case?

-- Happy hacking Petter Reinholdtsen

JimmyRigIt commented 1 month ago

the code i wrote above wasn't doing it, Phillc54 seems to have come up with a solution, i'll test on my machines over the next few days and report back. We are unsure of all the implications this could have (if any). Ideally this works with no side effects and that's all the corrective action needed.

[phillc54] Something like this seems to work but I have not comprehensively tested for any side effects. / check for reached end of move / if (! (&joints[jno])->free_tp.active) { / reached end of move without hitting switch / (&joints[jno])->free_tp.enable = 0; rtapi_print_msg(RTAPI_MSGERR,("j%d end of move in home state %d, homing aborted"),jno, H[jno].home_state); for (int j = 0; j < all_joints; j++) { if (!H[j].homed) { H[j].home_state = HOME_ABORT; } } return 1; // abort reqd }

JimmyRigIt commented 1 month ago

That code is working on my machine! No ill effects so far

andypugh commented 1 month ago

Thank you for looking into this. What do you believe would be the correct action, in this and the generic case?

I think that the correct action should be to abort all homing. (and this is pretty much what the message says is going to happen).

The user still has the option of homing joint-by-jpint if they really want to, but why would they want to?