LinuxCNC / linuxcnc

LinuxCNC controls CNC machines. It can drive milling machines, lathes, 3d printers, laser cutters, plasma cutters, robot arms, hexapods, and more.
http://linuxcnc.org/
GNU General Public License v2.0

Freezing on some programs #1965

Open arabel1a opened 2 years ago

arabel1a commented 2 years ago

Here are the steps I follow to reproduce the issue:

  1. Install and build linuxcnc-dev without libmodbus, using --with-realtime=uspace
  2. Use the standard 9axis sim configuration, but copy all ini parameters from axis A to axis B (so B becomes an ordinary axis, not a locking indexer)
  3. Increase the soft limits to a large number (+-100000 should be enough)
  4. Add the M-functions M166, M167, M152, M151, M106 (it does not matter what they do; the bug appears even if they just echo "something" and exit 0)
  5. Open the file "bad.ngc"
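Step 4 can be scripted. The sketch below is only an illustration: it assumes that user-defined M-codes are plain executable files named after the code, and it writes stubs that echo "something" and exit 0, as described above. The helper name `write_stub_mcodes` is hypothetical, and the target directory is an assumption, since the report does not say where the configuration looks for M-code files.

```python
import os
import stat
import tempfile

# M-code names taken from step 4 of the report.
STUB_NAMES = ["M166", "M167", "M152", "M151", "M106"]

# Each stub only echoes a string and exits successfully.
STUB_BODY = '#!/bin/sh\necho "something"\nexit 0\n'


def write_stub_mcodes(directory, names=STUB_NAMES):
    """Create one executable stub script per M-code name in `directory`."""
    os.makedirs(directory, exist_ok=True)
    paths = []
    for name in names:
        path = os.path.join(directory, name)
        with open(path, "w") as handle:
            handle.write(STUB_BODY)
        # Add the executable bits on top of the existing permissions.
        mode = os.stat(path).st_mode
        os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Assumption: a temporary directory stands in for the real M-code folder.
    target = tempfile.mkdtemp(prefix="mcodes-")
    for created in write_stub_mcodes(target):
        print(created)
```

To use the stubs with a real configuration, point `write_stub_mcodes` at whichever directory that configuration searches for user M-codes.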

This is what I expected to happen:

Get an error or a correct program preview

This is what happened instead:

After opening the attached program, linuxcnc froze, using 100% of the CPU and all of the memory (RAM and swap). When the cursor is over the axis window, it turns into a "loading" circle. After I clicked the close button and confirmed closing, the axis window disappeared, but linuxcnc did not exit. No errors appeared in the terminal. Memory and CPU remained loaded, and the linuxcnc process did not respond to SIGTERM. It looks like I wrote an incorrect program, but this is still not correct behavior IMHO.

After rebooting, I tried to open this program again. The error remains, but this time the terminal shows:

`Exception in Tkinter callback
Traceback (most recent call last):
  File "/usr/lib/python3.10/tkinter/__init__.py", line 1921, in __call__
    return self.func(*args)
  File "/home/arabella/progs/linuxcnc-dev/bin/axis", line 156, in General_Halt
    if not root_window.tk.call("nfdialog", ".error", ("Confirm Close"), text, "warning", 1, ("Yes"), ("No")):
_tkinter.TclError: can't invoke "grab" command: application has been destroyed
can't read "data(pages)": no such variable
    while executing
"foreach page $data(pages) { Widget::destroy $path.f$page }"
    (procedure "NoteBook::_destroy" line 5)
    invoked from within
"NoteBook::_destroy .#BWidget.#Class#NoteBook"
    (command bound to event)
invalid command name "139654618640960error_task"
    while executing
"139654618640960error_task"
    ("after" script)
invalid command name "139655025579136update"
    while executing
"139655025579136update"
    ("after" script)
^C!!!emc/rs274ngc/gcodemodule.cc: parse_file() f=/home/arabella/Documents/codes/LIST2NGC/examples/bad.ngc
!!!interp_error=1 result=0 last_sequence_number=115
task: 64804 cycles, min=0.000006, max=0.009901, avg=0.001101, 0 latency excursions (> 10x expected cycle time of 0.001000s)
Exception in Tkinter callback
Traceback (most recent call last):
  File "/home/arabella/progs/linuxcnc-dev/bin/axis", line 1286, in open_file_guts
    result, seq = o.load_preview(f, canon, initcodes, interpname)
  File "/home/arabella/progs/linuxcnc-dev/lib/python/rs274/glcanon.py", line 1870, in load_preview
    result, seq = gcode.parse(f, canon, args)
RuntimeError: parse_file interp_error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/arabella/progs/linuxcnc-dev/bin/axis", line 1303, in open_file_guts
    notifications.add("error", str(e))
  File "/home/arabella/progs/linuxcnc-dev/bin/axis", line 342, in add
    self.place(relx=1, rely=1, y=-20, anchor="se")
  File "/usr/lib/python3.10/tkinter/__init__.py", line 2477, in place_configure
    self.tk.call(
_tkinter.TclError: can't invoke "place" command: application has been destroyed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.10/tkinter/__init__.py", line 1921, in __call__
    return self.func(*args)
  File "/home/arabella/progs/linuxcnc-dev/bin/axis", line 2233, in open_file
    commands.open_file_name(f)
  File "/home/arabella/progs/linuxcnc-dev/bin/axis", line 2252, in open_file_name
    open_file_guts(f)
  File "/home/arabella/progs/linuxcnc-dev/bin/axis", line 1314, in open_file_guts
    root_window.tk.call("destroy", ".info.progress")
_tkinter.TclError: can't invoke "destroy" command: application has been destroyed
^CShutting down and cleaning up LinuxCNC...
^CTraceback (most recent call last):
  File "/home/arabella/progs/linuxcnc-dev/bin/axis-remote", line 28, in <module>
    import tkinter
  File "/usr/lib/python3.10/tkinter/__init__.py", line 44, in <module>
    TkVersion = float(_tkinter.TK_VERSION)
KeyboardInterrupt
Shutting down and cleaning up LinuxCNC...
^CShutting down and cleaning up LinuxCNC...
^CShutting down and cleaning up LinuxCNC...
Note: Using POSIX non-realtime
^CShutting down and cleaning up LinuxCNC...
^CShutting down and cleaning up LinuxCNC...
^CShutting down and cleaning up LinuxCNC...
Note: Using POSIX non-realtime
^CShutting down and cleaning up LinuxCNC...
Note: Using POSIX non-realtime
^CShutting down and cleaning up LinuxCNC...
Note: Using POSIX non-realtime`

It worked properly before this:

Yes, it opens simpler programs normally.

Information about my hardware and software:

Non-realtime kernel, no additional hardware and no real machine -- just LinuxCNC on an ordinary PC.

bad.ngc.zip

arabel1a commented 2 years ago

Well, this program has an infinite loop so freezing is ok. But why so much memory usage?

petterreinholdtsen commented 2 years ago

Could this be related to issue #1146? I tested using "rs274 -g bad.ngc", but did not get the endless stream of instructions:

executing
1 N..... USE_LENGTH_UNITS(CANON_UNITS_MM)
2 N..... SET_G5X_OFFSET(1, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000)
3 N..... SET_G92_OFFSET(0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000)
4 N..... SET_XY_ROTATION(0.0000)
5 N..... SET_FEED_REFERENCE(CANON_XYZ)
6 N..... ON_RESET()
7 N1000 MESSAGE(" MAINPROGRAMMENUMBER,P1")
8 N2000 MESSAGE(" DIMENSIONSOFSHEET:0.80X407X1250")
9 N3000 MESSAGE(" MATERIALID:ST37-08")
10 N10000 MESSAGE(" RUND10--10.00")
11 N11000 MESSAGE(" SQUARE10.0--10.00")
12 N12000 MESSAGE(" RECHTECK37X5--37.00--5.00")
13 N13000 MESSAGE(" RECHTECK76.2X5--76.20--5.00")
14 N15000 USE_LENGTH_UNITS(CANON_UNITS_MM)
15 N16000 SET_FEED_RATE(108167.0000)
16 N16000 STRAIGHT_FEED(0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000)
17 N19000 MESSAGE(" ZEROPOINT7150")
18 N20000 MESSAGE(" LOADSHEET")
Unknown m code used: M151
N24001 M151
19 N..... ON_RESET()
20 N..... ON_RESET()

I did not test with all the other setup you described.

Perhaps you can make a test case similar to tests/interp/g71-endless-loop/ to trigger the error?

-- Happy hacking Petter Reinholdtsen

arabel1a commented 2 years ago

> not get the endless stream of instructions

This is probably because you do not have M151 defined, so the interpreter stops at the first M151:
---> Unknown m code used: M151
---> N24001 M151
---> 19 N..... ON_RESET()

When I execute "rs274 -g bad.ngc", it produces the same output as yours. How can I tell rs274 which folder contains the M-function files?

arabel1a commented 2 years ago

Well, take a look at this file (similar to the previous one, but without M and T codes). When I execute "rs274 -g bad.ngc", it generates output lines in an infinite loop, as it should, and memory usage remains constant. When I open it in the axis GUI, it freezes and its memory usage grows (about 50 MB/sec), so after several minutes it fills all the RAM and swap, with no way to close it and free the memory, so I have to reboot.

bad.ngc.zip

petterreinholdtsen commented 2 years ago

[arabel1a]

> Well, I commented out all lines with M and T functions and ran "rs274 -g bad.ngc". It falls into an infinite loop (as it should) but does not increase memory usage. So it looks like this issue is caused by the axis preview, not by the rs274 backend.

If there is an infinite loop, it will drive up the memory usage of axis, which reads the instructions from the RS274 parser...

-- Happy hacking Petter Reinholdtsen
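The difference between the two observations (constant memory with "rs274 -g", growing memory in axis) can be pictured with a small sketch. This is purely illustrative and is not the actual rs274 or axis code: `endless_moves`, `stream` and `collect` are hypothetical names, and the assumption is simply that one consumer discards each instruction after handling it while the other keeps every instruction it receives.

```python
import itertools


def endless_moves():
    """Stand-in for an interpreter stuck in a loop: yields moves forever."""
    n = 0
    while True:
        yield ("STRAIGHT_FEED", float(n % 10), 0.0, 0.0)
        n += 1


def stream(moves):
    """Handle each move and drop it; nothing is retained between moves."""
    count = 0
    for _move in moves:
        count += 1
    return count


def collect(moves):
    """Keep every move for later drawing; the list grows with each move."""
    kept = []
    for move in moves:
        kept.append(move)
    return kept


if __name__ == "__main__":
    sample = 100_000  # bound the endless stream so the demonstration finishes
    print(stream(itertools.islice(endless_moves(), sample)))
    print(len(collect(itertools.islice(endless_moves(), sample))))
```

Without the `islice` bound, `stream` would run forever at flat memory use, while `collect` would keep allocating until the machine runs out of memory.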

arabel1a commented 2 years ago

> [arabel1a] Well, I commented out all lines with M and T functions and ran "rs274 -g bad.ngc". It falls into an infinite loop (as it should) but does not increase memory usage. So it looks like this issue is caused by the axis preview, not by the rs274 backend.
>
> If there is an infinite loop, it will drive up the memory usage of axis, which reads the instructions from the RS274 parser...
>
> -- Happy hacking Petter Reinholdtsen

Sounds logical, but it probably needs some kind of memory usage limit or endless loop detector :)
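As a rough idea of what such a limit could look like, here is a minimal sketch. It is not taken from the axis code; `BoundedPreview`, `PreviewLimitExceeded` and the default cap are all hypothetical. It assumes the preview stores one entry per move and simply refuses to store more than a fixed number of them.

```python
class PreviewLimitExceeded(RuntimeError):
    """Raised when a program produces more preview segments than allowed."""


class BoundedPreview:
    """Collects preview segments, but gives up after `max_segments`."""

    def __init__(self, max_segments=1_000_000):
        self.max_segments = max_segments
        self.segments = []

    def add(self, segment):
        # Refuse to grow past the cap instead of consuming all memory.
        if len(self.segments) >= self.max_segments:
            raise PreviewLimitExceeded(
                "preview aborted after %d segments; "
                "the program may contain an endless loop" % self.max_segments
            )
        self.segments.append(segment)


if __name__ == "__main__":
    preview = BoundedPreview(max_segments=3)
    try:
        for i in range(10):
            preview.add(("STRAIGHT_FEED", float(i), 0.0, 0.0))
    except PreviewLimitExceeded as err:
        print(err)
```

A cap like this is only a heuristic: it cannot tell a very long legitimate program from an endless loop, so the limit would need to be configurable.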

petterreinholdtsen commented 2 years ago

[arabel1a]

> Sounds logical, but it probably needs some kind of memory usage limit or endless loop detector :)

I do not know the code well enough to tell what is needed to solve it. But the first step is, as I suggested, to write a test that can reproduce the issue, like I did for G71. It allows those who understand the code to easily trigger the problem and hopefully find a fix for it.

-- Happy hacking Petter Reinholdtsen
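A test for a non-terminating program needs some way to stop waiting. The sketch below shows one possible approach using a timeout. It is not the layout of the existing tests/interp/g71-endless-loop/ test, which is not shown in this thread; `terminates_within` is a hypothetical helper, and the demonstration uses Python one-liners as stand-ins for the interpreter command.

```python
import subprocess
import sys


def terminates_within(cmd, seconds):
    """Run `cmd`; return True if it exits within `seconds`, else False.

    subprocess.run() kills the child process when the timeout expires,
    so a looping command does not keep running after the check.
    """
    try:
        subprocess.run(
            cmd,
            timeout=seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return False
    return True


if __name__ == "__main__":
    # Stand-ins for the real command, e.g. ["rs274", "-g", "bad.ngc"].
    quick = [sys.executable, "-c", "pass"]
    looping = [sys.executable, "-c", "while True: pass"]
    print(terminates_within(quick, 30))
    print(terminates_within(looping, 1))
```

A timeout only detects that the command did not finish in time. Checking memory growth, which is the actual complaint here, would need an additional measurement.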