LinuxCNC / linuxcnc

LinuxCNC controls CNC machines. It can drive milling machines, lathes, 3d printers, laser cutters, plasma cutters, robot arms, hexapods, and more.
http://linuxcnc.org/
GNU General Public License v2.0

interp error in 2.9.3, started somewhere between 2.9.3-83-g1d5d02d836 and 2.9.3-2099-g9f5a1c8f50 #3174

Open Lcvette opened 1 week ago

Lcvette commented 1 week ago

Here are the steps I follow to reproduce the issue:

  1. start axis sim from terminal
  2. look for USRMOT: ERROR command 30 timeout and emcMotionInit: emcTrajInit failed
  3. load the tool table with tools for a program and then load the program; an on-screen error message appears: "parse_file interp_error"
  4. Terminal error shows as:

    py3dev@py3dev:~/linuxcnc/configs/sim.axis$ linuxcnc axis.ini
    LINUXCNC - 2.9.3-2099-g9f5a1c8f50
    Machine configuration directory is '/home/py3dev/linuxcnc/configs/sim.axis'
    Machine configuration file is 'axis.ini'
    Starting LinuxCNC...
    linuxcncsvr (8811) emcsvr: machine 'LinuxCNC-HAL-SIM-AXIS' version '1.1'
    linuxcnc TPMOD=tpmod HOMEMOD=homemod EMCMOT=motmod
    Note: Using POSIX realtime
    milltask (8824) task: machine 'LinuxCNC-HAL-SIM-AXIS' version '1.1'
    halui (8826) halui: machine 'LinuxCNC-HAL-SIM-AXIS' version '1.1'
    Found file(lib): /usr/share/linuxcnc/hallib/core_sim.hal
    Found file(lib): /usr/share/linuxcnc/hallib/sim_spindle_encoder.hal
    Found file(lib): /usr/share/linuxcnc/hallib/axis_manualtoolchange.hal
    Found file(lib): /usr/share/linuxcnc/hallib/simulated_home.hal
    Found file(lib): /usr/share/linuxcnc/hallib/check_xyz_constraints.hal
    Found file(REL): ./cooling.hal
    USRMOT: ERROR: command 30 timeout
    emcMotionInit: emcTrajInit failed
    note: MAXV max: 5.000 units/sec 300.000 units/min
    note: LJOG max: 5.000 units/sec 300.000 units/min
    note: LJOG default: 0.250 units/sec 15.000 units/min
    note: jog_order='XYZ'
    note: jog_invert=set()
    !!!emc/rs274ngc/gcodemodule.cc: parse_file() f=/home/py3dev/linuxcnc/nc_files/aa_error_test.ngc
    !!!interp_error=1 result=2 last_sequence_number=16

It worked properly here: 2.9.3-83-g1d5d02d836

It shows the error here: 2.9.3-2099-g9f5a1c8f50 (and possibly in between, but I didn't test any of those)

systeminfo

rodw-au commented 1 week ago

My take on this is you should not be using the Secret Buildbot Debs at http://buildbot2.highlab.com/ for version 2.9. The current release 2.9.3 is set by a tag "v2.9.3" in git and the Debs are available on the official LinuxCNC repository. We say that bug fixes will be added to the release version, but unless and until v2.9.4 is released you can't depend on it. You can follow heading 7.1 in the Getting LinuxCNC docs to restore the sources files to point to the correct repo.

In summary, only use the secret Buildbot for the 2.10 master branch and use the released Debs for 2.9.3.

In my view, the developers got into some bad habits when we maintained two versions for such a long time. I don't understand why there has been a continual stream of commits to 2.9. It seems to be way more than bug fixes pending the release of 2.9.4. If that is required, we should have released 2.10 and gotten back to better code integrity.

rene-dev commented 1 week ago

  1. is probably not related to the issue you are having. Are you using a stock sim config? Can you share the tool table and program?

Lcvette commented 1 week ago

  1. is probably not related to the issue you are having. Are you using a stock sim config? Can you share the tool table and program?

yes sim config for axis, files below:

test_files.zip

Lcvette commented 1 week ago

My take on this is you should not be using the Secret Buildbot Debs at http://buildbot2.highlab.com/ for version 2.9. The current release 2.9.3 is set by a tag "v2.9.3" in git and the Debs are available on the official LinuxCNC repository. We say that bug fixes will be added to the release version, but unless and until v2.9.4 is released you can't depend on it. You can follow heading 7.1 in the Getting LinuxCNC docs to restore the sources files to point to the correct repo.

In summary, only use the secret Buildbot for the 2.10 master branch and use the released Debs for 2.9.3.

In my view, the developers got into some bad habits when we maintained two versions for such a long time. I don't understand why there has been a continual stream of commits to 2.9. It seems to be way more than bug fixes pending the release of 2.9.4. If that is required, we should have released 2.10 and gotten back to better code integrity.

sudo apt install linuxcnc-uspace

is this the correct release for lockdown 2.9.3 from apt?

rodw-au commented 1 week ago

sudo apt install linuxcnc-uspace

Yes, provided your /etc/apt sources files are set as per the install ISO defaults. Otherwise you will get a very old version on Debian 12, but a more current version (possibly 2.9.3) on the yet-to-be-released Debian 13 (testing). There is no automatic buildbot for Debian; the source has to be uploaded to them.

rmu75 commented 1 week ago

changing tool numbers to something below 24 makes the error go away... is there some (new) limit how high tool number can be?

Lcvette commented 1 week ago

changing tool numbers to something below 24 makes the error go away... is there some (new) limit how high tool number can be?

that number seems to be a moving target and isn't repeatable. For the first person who noticed it the threshold was below 199; when I tested it was below 105, so something fishy is happening that I couldn't see or find. I spent 2 days thinking it was in our build, trying to find where we broke it, only to find there was no change on our side that caused it. That's when we noticed it went away on different lcnc builds.

andypugh commented 1 week ago

If the problem was introduced between two commits then "git bisect" will find the problematic commit pretty quickly.
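
The idea behind "git bisect" is a binary search over the commit range. A rough sketch of that search in Python, assuming a linear history in which every commit after the first bad one is also bad; the commit names and the `is_bad` check are hypothetical stand-ins for a real build-and-test step:

```python
def first_bad_commit(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    once a commit is bad every later commit is bad too.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad]


# Hypothetical history: c0 (good) ... c9 (bad), regression introduced in c6.
commits = [f"c{i}" for i in range(10)]
tested = []


def is_bad(commit):
    tested.append(commit)
    return int(commit[1:]) >= 6


culprit = first_bad_commit(commits, is_bad)
print(culprit, "found after testing", len(tested), "commits")
```

Because the range halves on every test, a range of about 2000 commits needs roughly 11 build-and-test rounds.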

(I might find time, but I have a lot on. Mainly at the moment I am getting my late mother's house ready for sale. It was built around 1450...)

-- atp

rmu75 commented 1 week ago

debugging this, I get

!!!emc/rs274ngc/gcodemodule.cc: parse_file() f=/home/robert/tmp/test_files/aa_error_test.ngc
!!!interp_error=1 result=2 last_sequence_number=16
task: main loop took 0.014381 seconds
Traceback (most recent call last):
  File "/home/robert/CNC/linuxcnc-master/bin/axis", line 1089, in change_tool
    StatMixin.change_tool(self, pocket)
  File "/home/robert/CNC/linuxcnc-master/lib/python/rs274/interpret.py", line 151, in change_tool
    self.tools[0] = self.tools[idx]
                    ~~~~~~~~~~^^^^^
IndexError: list index out of range

https://github.com/LinuxCNC/linuxcnc/blob/master/lib/python/rs274/interpret.py#L151

It seems the code there assumes tool index = tool number. Also, gcodemodule.cc needs to be fixed so that the comment is consistent with the code around line 917 and the exception is reported.
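
A minimal sketch of that failure mode, assuming (as the traceback suggests) that `self.tools` is a plain list with the spindle in slot 0 followed by one entry per tool table line. The `Tool` tuple and `change_tool_by_index` below are simplified stand-ins, not the real code in interpret.py:

```python
from collections import namedtuple

# Simplified stand-in for the tool table entries; only two fields kept.
Tool = namedtuple("Tool", ["id", "zoffset"])

# Slot 0 is the spindle, followed by one entry per tool table line.
tools = [Tool(-1, 0.0), Tool(1, 1.9968), Tool(68, 2.875), Tool(220, 2.958)]


def change_tool_by_index(tools, idx):
    """Mimics the assignment seen in the traceback: copy slot idx to slot 0."""
    tools[0] = tools[idx]
    return tools[0]


# Passing a list index works.
print(change_tool_by_index(list(tools), 2))

# Passing a tool number as if it were an index fails once it exceeds the list length.
try:
    change_tool_by_index(list(tools), 68)
    error = None
except IndexError as exc:
    error = str(exc)
print("IndexError:", error)
```

Note that a tool number smaller than the list length does not raise at all; it silently selects whatever entry sits at that position. That may explain why the failing threshold appeared to vary with the tool table.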

rmu75 commented 1 week ago

I'm not able to find commit 9f5a1c8f50 but will try to search from v2.9.3 release tag.

rmu75 commented 1 week ago

can't reproduce on 2.9

Sigma1912 commented 1 week ago

My tooltable (6 lines):

T1  P1  Z1.9968  D0.125  ;1/8 end mill
T4  P4  Z3.125  D0.25  ;1/4 endmill
T6  P6  Z2.25  D0.25  ;1/4 chamfer mill
T7  P7  Z3.555  D3.937  ;100mm facemill
T68  P68  Z2.875  D0.257  ;.257 drill
T220  P220  Z2.958  D0.375  ;3/8 endmill

in the GCODE I have 'T6 M6' (which works) and later 'T7 M6' (which fails)

Seems that 'idx' in that section of 'interpret.py' refers to the line in the tooltable.
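
To illustrate with the table above, here is a sketch that reads the T numbers and maps each tool number to its tool table line. The parsing is deliberately naive and only handles the layout shown here; `parse_tool_numbers` is a hypothetical helper, not part of LinuxCNC:

```python
TOOL_TABLE = """\
T1  P1  Z1.9968  D0.125  ;1/8 end mill
T4  P4  Z3.125  D0.25  ;1/4 endmill
T6  P6  Z2.25  D0.25  ;1/4 chamfer mill
T7  P7  Z3.555  D3.937  ;100mm facemill
T68  P68  Z2.875  D0.257  ;.257 drill
T220  P220  Z2.958  D0.375  ;3/8 endmill
"""


def parse_tool_numbers(text):
    """Return the T numbers in file order (naive parser for this layout only)."""
    numbers = []
    for line in text.splitlines():
        fields = line.split(";", 1)[0].split()  # drop the comment, split words
        for field in fields:
            if field.upper().startswith("T"):
                numbers.append(int(field[1:]))
                break
    return numbers


numbers = parse_tool_numbers(TOOL_TABLE)
# Line 1 is the first table entry; index 0 is assumed to be the spindle slot.
line_of_tool = {nr: line for line, nr in enumerate(numbers, start=1)}

for nr in (6, 7):
    print(f"T{nr}: tool table line {line_of_tool[nr]}, list length {len(numbers) + 1}")
```

With seven list entries (indices 0 to 6), using the tool number 6 as an index is still in range but lands on the T220 line, while 7 is out of range. That would match T6 appearing to work and T7 failing.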

Debugging gives :

DEBUG: idx:  6
i= 0
self.tools[i]:  (-1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0)
DEBUG: self.tools[i]: tool_result(id=220, xoffset=0.0, yoffset=0.0, zoffset=2.958, aoffset=0.0, boffset=0.0, coffset=0.0, uoffset=0.0, voffset=0.0, woffset=0.0, diameter=0.375, frontangle=0.0, backangle=0.0, orientation=0)
i= 1
self.tools[i]:  tool_result(id=1, xoffset=0.0, yoffset=0.0, zoffset=1.9968, aoffset=0.0, boffset=0.0, coffset=0.0, uoffset=0.0, voffset=0.0, woffset=0.0, diameter=0.125, frontangle=0.0, backangle=0.0, orientation=0)
DEBUG: self.tools[i]: tool_result(id=220, xoffset=0.0, yoffset=0.0, zoffset=2.958, aoffset=0.0, boffset=0.0, coffset=0.0, uoffset=0.0, voffset=0.0, woffset=0.0, diameter=0.375, frontangle=0.0, backangle=0.0, orientation=0)
i= 2
self.tools[i]:  tool_result(id=4, xoffset=0.0, yoffset=0.0, zoffset=3.125, aoffset=0.0, boffset=0.0, coffset=0.0, uoffset=0.0, voffset=0.0, woffset=0.0, diameter=0.25, frontangle=0.0, backangle=0.0, orientation=0)
DEBUG: self.tools[i]: tool_result(id=220, xoffset=0.0, yoffset=0.0, zoffset=2.958, aoffset=0.0, boffset=0.0, coffset=0.0, uoffset=0.0, voffset=0.0, woffset=0.0, diameter=0.375, frontangle=0.0, backangle=0.0, orientation=0)
i= 3
self.tools[i]:  tool_result(id=6, xoffset=0.0, yoffset=0.0, zoffset=2.25, aoffset=0.0, boffset=0.0, coffset=0.0, uoffset=0.0, voffset=0.0, woffset=0.0, diameter=0.25, frontangle=0.0, backangle=0.0, orientation=0)
DEBUG: self.tools[i]: tool_result(id=220, xoffset=0.0, yoffset=0.0, zoffset=2.958, aoffset=0.0, boffset=0.0, coffset=0.0, uoffset=0.0, voffset=0.0, woffset=0.0, diameter=0.375, frontangle=0.0, backangle=0.0, orientation=0)
i= 4
self.tools[i]:  tool_result(id=7, xoffset=0.0, yoffset=0.0, zoffset=3.555, aoffset=0.0, boffset=0.0, coffset=0.0, uoffset=0.0, voffset=0.0, woffset=0.0, diameter=3.937, frontangle=0.0, backangle=0.0, orientation=0)
DEBUG: self.tools[i]: tool_result(id=220, xoffset=0.0, yoffset=0.0, zoffset=2.958, aoffset=0.0, boffset=0.0, coffset=0.0, uoffset=0.0, voffset=0.0, woffset=0.0, diameter=0.375, frontangle=0.0, backangle=0.0, orientation=0)
i= 5
self.tools[i]:  tool_result(id=68, xoffset=0.0, yoffset=0.0, zoffset=2.875, aoffset=0.0, boffset=0.0, coffset=0.0, uoffset=0.0, voffset=0.0, woffset=0.0, diameter=0.257, frontangle=0.0, backangle=0.0, orientation=0)
DEBUG: self.tools[i]: tool_result(id=220, xoffset=0.0, yoffset=0.0, zoffset=2.958, aoffset=0.0, boffset=0.0, coffset=0.0, uoffset=0.0, voffset=0.0, woffset=0.0, diameter=0.375, frontangle=0.0, backangle=0.0, orientation=0)
i= 6
self.tools[i]:  tool_result(id=220, xoffset=0.0, yoffset=0.0, zoffset=2.958, aoffset=0.0, boffset=0.0, coffset=0.0, uoffset=0.0, voffset=0.0, woffset=0.0, diameter=0.375, frontangle=0.0, backangle=0.0, orientation=0)
DEBUG: self.tools[i]: tool_result(id=220, xoffset=0.0, yoffset=0.0, zoffset=2.958, aoffset=0.0, boffset=0.0, coffset=0.0, uoffset=0.0, voffset=0.0, woffset=0.0, diameter=0.375, frontangle=0.0, backangle=0.0, orientation=0)
DEBUG: idx:  7
i= 0
self.tools[i]:  tool_result(id=220, xoffset=0.0, yoffset=0.0, zoffset=2.958, aoffset=0.0, boffset=0.0, coffset=0.0, uoffset=0.0, voffset=0.0, woffset=0.0, diameter=0.375, frontangle=0.0, backangle=0.0, orientation=0)
!!!emc/rs274ngc/gcodemodule.cc: parse_file() f=/home/user/git/linuxcnc-master-official/configs/sim/axis/nc_files/aa_error_test.ngc
!!!interp_error=1 result=2 last_sequence_number=117

rmu75 commented 1 week ago

Yes, with some of rene's tool table refactoring something got lost. It's a really confusing mix of tool number, pocket number and tool table index.

something like

diff --git a/lib/python/rs274/interpret.py b/lib/python/rs274/interpret.py
index 5eba95d246..8815f7f869 100644
--- a/lib/python/rs274/interpret.py
+++ b/lib/python/rs274/interpret.py
@@ -140,8 +140,9 @@ class StatMixin:
         self.tools = list(s.tool_table)
         self.random = r

-    def change_tool(self, idx):
+    def change_tool(self, tool_nr):
         global tool_in_spindle
+        idx = self.get_index(tool_nr)
         if self.random:
             self.tools[0], self.tools[idx] = self.tools[idx], self.tools[0]
             tool_in_spindle = idx
@@ -158,6 +159,14 @@ class StatMixin:
             return tuple(self.tools[idx])
         return empty_spindle_data

+    def get_index(self, tool_nr):
+        index = 1
+        for tool in self.tools[1:]:
+            if tool.id == tool_nr:
+                return index
+            index = index + 1
+        return 0
+
     def get_external_angular_units(self):
         return self.s.angular_units or 1.0

should do but I'm not sure that covers all cases.

I will prepare a PR and try to include some test cases.
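
A rough idea of what such test cases could look like, using a standalone copy of the `get_index` logic from the diff above so that it runs without LinuxCNC. The `Tool` tuple is a simplified stand-in for the real tool table entries, and the tool list mirrors the six-line table posted earlier in this thread:

```python
from collections import namedtuple

# Simplified stand-in for the tool table entries (the real ones have more fields).
Tool = namedtuple("Tool", ["id", "zoffset"])


def get_index(tools, tool_nr):
    """Standalone copy of the lookup in the diff: list position of tool_nr, 0 if absent."""
    for index, tool in enumerate(tools[1:], start=1):
        if tool.id == tool_nr:
            return index
    return 0


# Slot 0 is the spindle (empty here), then the six lines of the posted table.
tools = [Tool(-1, 0.0)] + [
    Tool(1, 1.9968), Tool(4, 3.125), Tool(6, 2.25),
    Tool(7, 3.555), Tool(68, 2.875), Tool(220, 2.958),
]

# Known tools should map to their table line; 99 is not in the table.
results = {nr: get_index(tools, nr) for nr in (1, 7, 68, 220, 99)}
print(results)
```

The last case is the one to watch: an unknown tool number maps to 0, the spindle slot, so `change_tool` would proceed without complaint. Whether that is the desired behaviour is one of the cases a real test would need to settle.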

rmu75 commented 1 week ago

in the GCODE I have 'T6 M6' (which works) and later 'T7 M6' (which fails)

does that change to tool 6 or to tool 220? or worse, change to tool 6 and apply offsets of tool 220? does it work on 2.9?

Sigma1912 commented 1 week ago

Actual tool change seems not affected (displayed tool number and tool offsets are correct). I believe this bug may only affect the gremlin preview / Gcode Stats.

I'm testing on master currently.

Sigma1912 commented 1 week ago

Just checked out 2.9.3 and it works for me there. It seems that in 2.9.3 interpret.py 'change_tool(self, idx)' gets the correct line number. So something must have removed the code that looked up which line in the tool table held the requested tool number.

rmu75 commented 1 week ago

Just checked out 2.9.3 and it works for me there.

it should also work on tip of 2.9 branch. which raises the question what is going on in 2.9.3-2099-g9f5a1c8f50 whatever that is.

Sigma1912 commented 1 week ago

I know, I'm puzzled as to why this bug would be reported on Version 2.9.3.

6XoCtujg2C0gne commented 1 week ago

I reported this or something very similar in my forum post “Inconsistent values from Versaprobe https://forum.linuxcnc.org/qtvcp/54169-inconsistent-values-from-versaprobe ” in mid October.

In my opinion, without a delimited tool table file it's very confusing to manually add a record/row.


rmu75 commented 1 week ago

I reported this or something very similar in my forum post “Inconsistent values from Versaprobe https://forum.linuxcnc.org/qtvcp/54169-inconsistent-values-from-versaprobe ” in mid October.

I don't see how that is remotely similar. Can you please open a new issue if some problem still exists, and please post complete version info? Thanks.

6XoCtujg2C0gne commented 1 week ago

I think that my point is, the tool table format changed. Perhaps my oversight but, I did note that the previous code wrote out each and every “column” in the tool table whereas now, not all columns are written to the table.

I didn’t see where there was a change in code but, like I said it might be my oversight.

Apologies for polluting the thread. 😊


Sigma1912 commented 6 days ago

Just found the same problem here (although it does not produce an error):
https://github.com/LinuxCNC/linuxcnc/blob/master/lib/python/rs274/glcanon.py#L301

in 2.9: 'self.tool_list' contains the tool table line [0, 5, 4, 6, 2, 1, 3, 0]

in master: 'self.tool_list' contains the tool number [0, 68, 48, 220, 4, 1, 22, 0]
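
The two lists are consistent with each other if the tool table used for that test held T1, T4, T22, T48, T68 and T220 in that order. That table is inferred from the numbers above, not taken from the actual file. Under that assumption, a short sketch of the mapping between the two representations:

```python
# Assumed tool table order, inferred from the two lists; not the actual file.
table_tool_numbers = [1, 4, 22, 48, 68, 220]

# Assumed: 0 stands for "no tool" in both lists.
# Line i (1-based) holds table_tool_numbers[i - 1].
line_to_number = {0: 0}
line_to_number.update(
    {line: nr for line, nr in enumerate(table_tool_numbers, start=1)}
)

tool_list_2_9 = [0, 5, 4, 6, 2, 1, 3, 0]           # tool table lines
tool_list_master = [0, 68, 48, 220, 4, 1, 22, 0]   # tool numbers

# Translate the 2.9 list of lines into tool numbers and compare with master.
converted = [line_to_number[line] for line in tool_list_2_9]
print(converted == tool_list_master)
```

Any consumer of 'self.tool_list' that still treats the values as list indices would need the reverse of this mapping.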

[edit] Looks like this is used for 'gcode_properties'

Sigma1912 commented 6 days ago

Actually this seems like an improvement in this particular case as it now displays actual tool numbers (left) rather than tool table index (right):

gcode_properties