scottmudge / OctoPrint-MeatPack

Easy, fast, effective, and automatic g-code compression!
https://github.com/scottmudge/OctoPrint-MeatPack
BSD 4-Clause "Original" or "Old" License
122 stars 6 forks source link

MeatPack with Causes 175% cpu usage and crashes octoprint/serial during PID tuning #20

Closed GhostlyCrowd closed 3 years ago

GhostlyCrowd commented 3 years ago

This makes the PID "fail" as you never get the terminal output values so you can put them into eeprom.

Some times octo recovers and unfreezes, but you miss the pid info as the terminal doesn't output during the high cpu usage.

With meatpack disabled PID works fine CPU never goes above 8% usage @800mhz total. Meatpack on as you can see below its CPU flogged up to 175% cpu on a single thread @1.2GHz cores.

image

When Octoprint does manage to recover the Webui is responsive but the terminal stops accepting commands/gcode so the Octoprint service has to be restarted

Not sure if its the plugin or octo but its sure giving me grief.

As a side note Printing with Meatpack on is flawless, its only when PID tuning.

here is the example of my PID tune command.

M303 C25 E-1 S60 U

scottmudge commented 3 years ago

I cannot reproduce w/ the Prusa firmware, when benchmarked the CPU overhead is almost immeasurably small as a percentage.

I need more information about your system, logs, etc. What are you running OctoPrint on? What firmware are you using? What printer? Etc., etc.

I just sent the same command with MP enabled, same exact version of OctoPrint, no issues. It seems like an issue with the Marlin implementation of MP (which I do not maintain) not correctly parsing the command, so you may want to report it to them.

I'm going to guess it's due to the E-1 portion of the command, which might be parsed as an exponent. Which causes problems when whitespace removal is enabled in the MP settings.

Try disabling whitespace removal and see if it changes anything.

GhostlyCrowd commented 3 years ago

I cannot reproduce w/ the Prusa firmware, when benchmarked the CPU overhead is almost immeasurable as a percentage.

I need more information about your system, logs, etc. What are you running OctoPrint on? What firmware are you using? What printer? Etc., etc.

I just sent the same command with MP enabled, same exact version of OctoPrint, no issues. It seems like an issue with the Marlin implementation of MP (which I do not maintain) not correctly parsing the command, so you may want to report it to them.

I'm going to guess it's due to the E-1 portion of the command, which might be parsed as an exponent. Which causes problems when whitespace removal is enabled in the MP settings.

Try disabling whitespace removal and see if it changes anything.

I will try what you suggested first octoprint-logs.zip

here are the logs in the mean time at 2021-05-28 18:03:52,740 - Recv: T:26.02 /0.00 B:59.26 /0.00 @:0 B@:127 in the serial log (18:03) is when the cpu hits insanity and the terminal output freezes and octoprint will eventually unfreeze but it has to be rebooted to get serial to take commands again)

I'm also chatting in doscord with soem of the Octo guys, they dont really adress 3rd party plugin issues but they were curious enough to have a look at the logs as well.

THis is happening on Marlin buxfix as of May 23rd, and was also happening on a bug fix build from 2020, it also only seems to happen while PID tuning. Printing is fine.

I have no other 3rd party plugins enabled but meatpack to trouble shoot this and recreate the issue.

Below is sys info from the Pi.

browser.user_agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4506.0 Safari/537.36 Edg/92.0.901.0 connectivity.connection_check: 1.1.1.1:53 connectivity.connection_ok: true connectivity.enabled: true connectivity.online: true connectivity.resolution_check: octoprint.org connectivity.resolution_ok: true env.hardware.cores: 4 env.hardware.freq: 1200 env.hardware.ram: 1021054976 env.os.bits: 32 env.os.id: linux env.os.platform: linux env.plugins.pi_support.model: Raspberry Pi 3 Model B Plus Rev 1.3 env.plugins.pi_support.octopi_version: 0.18.0 env.plugins.pi_support.throttle_state: 0x0 env.python.pip: 20.3.3 env.python.version: 3.7.3 env.python.virtualenv: true octoprint.safe_mode: false octoprint.version: 1.6.1 printer.firmware: Marlin bugfix-2.0.x (May 23 2021 14:50:47) systeminfo.generator: systemapi

Curious as to why E-1 is an issue all of the sudden now, ive used the same macro several dozens of times successfully with meatpack before. I didnt see this behaviour untill the pi was updated. just curious coincidence?

GhostlyCrowd commented 3 years ago

I cannot reproduce w/ the Prusa firmware, when benchmarked the CPU overhead is almost immeasurably small as a percentage.

I need more information about your system, logs, etc. What are you running OctoPrint on? What firmware are you using? What printer? Etc., etc.

I just sent the same command with MP enabled, same exact version of OctoPrint, no issues. It seems like an issue with the Marlin implementation of MP (which I do not maintain) not correctly parsing the command, so you may want to report it to them.

I'm going to guess it's due to the E-1 portion of the command, which might be parsed as an exponent. Which causes problems when whitespace removal is enabled in the MP settings.

Try disabling whitespace removal and see if it changes anything.

Turning off whitespace removal did not change the behavior. It has locked up PID tuning and 2 threads are 100% cpu.

scottmudge commented 3 years ago

Hmmm, well the serial log indicates the Marlin FW parses the command correctly, and the log continues to output received data (from FW), so the serial interface of OctoPrint (which is the majority of the code the MP plugin hooks into) seems fine.

It's strange something host-side gets stuck. Without access to a local debugger to step through each thread, it's going to be very difficult to diagnose this. And unfortunately I cannot reproduce the issue.

Are you doing something unique when sending the command? Or just sending it through the terminal? Do other commands work? Can you slowly change/erase characters to see when it stop hanging? I'm wondering if it's something in the text, or if some part of OctoPrint is knowledgeable of the PID tuning and is waiting for something particular to be received.

Do you have any non-standard plugins installed besides MP?

GhostlyCrowd commented 3 years ago

Do you have any non-standard plugins installed besides MP?

Yes several they are all disabled thought I have no third party plugins enabled and am able to reproduce this problem every time

Are you doing something unique when sending the command? Or just sending it through the terminal?

I normally send the command with the macro plugin, but since its disabled i have been trying it to terminal by hand, same results.

Can you slowly change/erase characters to see when it stop hanging?

I will try now

I'm wondering if it's something in the text, or if some part of OctoPrint is knowledgeable of the PID tuning and is waiting for something particular to be received.

This i don't know i did convince a Octoprint dev to have a look at my logs even thought they don't typically look at issues based on third party plugins.

GhostlyCrowd commented 3 years ago

Rolled back to Octo 1.5.3 since my friend insisted it was only after updating. and so far 1.5.3 also does it.

Meatpack on octoprint uses 2 threads 100% and the PID tune fails.

Meatpack off, fine i dont understand

scottmudge commented 3 years ago

Very strange, looking at the plugin source code I don't even see a spot where it might hang. 95% of the plugin code is already executed before it even sends the PID command. And it's received fine, indicating the plugin is performing correctly. I'm not sure what's going on either.

If it's not too much trouble, it might be worth while trying a different Raspberry Pi or using a new SD card with a clean system image, to see if it's a broader configuration problem somewhere.

Based on the serial log, the printer itself is behaving normally. So it's definitely host-side.

GhostlyCrowd commented 3 years ago

Very strange, looking at the plugin source code I don't even see a spot where it might hang. 95% of the plugin code is already executed before it even sends the PID command. And it's received fine, indicating the plugin is performing correctly. I'm not sure what's going on either.

If it's not too much trouble, it might be worth while trying a different Raspberry Pi or using a new SD card with a clean system image, to see if it's a broader configuration problem somewhere.

Based on the serial log, the printer itself is behaving normally. So it's definitely host-side.

THats my next test, I couldnt grasp why meatpack would break pid tuning either.

GhostlyCrowd commented 3 years ago

I've sent a new Pi off to him, in the mean time removing meatpack is the fix for him. I got him to do a dd of his sd card, i want him to compress it and send it to me so i can try and break things on one of my setups.

This is what one of the Octo fellas said to me

`I suspect something is swallowing some serial comm somewhere - I'm seeing temperature reporting as if the PID tune finishes, but no ok or the usual 'Finished' messages.'

Does meatpack also interact with serial responses?

cp2004 commented 3 years ago

This is what one of the Octo fellas said to me

'I suspect something is swallowing some serial comm somewhere - I'm seeing temperature reporting as if the PID tune finishes, but no ok or the usual 'Finished' messages.'

said 'Octo fella' here: this probably has nothing to do with the issue here, since the high CPU usage is starting as the printer is heating, rather than at the end of the PID tune. You can tell the point that is can no longer keep up with the commands here:

2021-05-28 18:03:48,740 - Recv:  T:27.03 /0.00 B:56.88 /0.00 @:0 B@:127
2021-05-28 18:03:50,741 - Recv:  T:27.03 /0.00 B:58.18 /0.00 @:0 B@:127
2021-05-28 18:03:52,740 - Recv:  T:26.02 /0.00 B:59.26 /0.00 @:0 B@:127  <-- HERE
2021-05-28 18:08:25,380 - Recv:  T:25.00 /0.00 B:59.33 /0.00 @:0 B@:127
2021-05-28 18:08:25,394 - Recv:  T:27.03 /0.00 B:59.36 /0.00 @:0 B@:127
2021-05-28 18:08:25,399 - Recv:  T:26.02 /0.00 B:59.67 /0.00 @:0 B@:127
2021-05-28 18:08:25,403 - Recv:  T:26.02 /0.00 B:60.14 /0.00 @:0 B@:0
2021-05-28 18:08:25,407 - Recv:  T:26.02 /0.00 B:59.96 /0.00 @:0 B@:0
2021-05-28 18:08:25,411 - Recv:  T:27.03 /0.00 B:59.67 /0.00 @:0 B@:127

See there's the pause of 2 mins or so, where it can't read the serial port. It's then followed by processing as many commands as possible, probably up to the outgoing buffer of the printer, then it is back to business as usual.

I have no idea what's going on, and how processing temperature responses while there is a PID tune going on becomes so difficult. MeatPack is not that complicated that it has chances for this kind of bug... Let's hope you can figure out something here 👍

GhostlyCrowd commented 3 years ago

This is what one of the Octo fellas said to me

'I suspect something is swallowing some serial comm somewhere - I'm seeing temperature reporting as if the PID tune finishes, but no ok or the usual 'Finished' messages.'

said 'Octo fella' here: this probably has nothing to do with the issue here, since the high CPU usage is starting as the printer is heating, rather than at the end of the PID tune. You can tell the point that is can no longer keep up with the commands here:

2021-05-28 18:03:48,740 - Recv:  T:27.03 /0.00 B:56.88 /0.00 @:0 B@:127
2021-05-28 18:03:50,741 - Recv:  T:27.03 /0.00 B:58.18 /0.00 @:0 B@:127
2021-05-28 18:03:52,740 - Recv:  T:26.02 /0.00 B:59.26 /0.00 @:0 B@:127  <-- HERE
2021-05-28 18:08:25,380 - Recv:  T:25.00 /0.00 B:59.33 /0.00 @:0 B@:127
2021-05-28 18:08:25,394 - Recv:  T:27.03 /0.00 B:59.36 /0.00 @:0 B@:127
2021-05-28 18:08:25,399 - Recv:  T:26.02 /0.00 B:59.67 /0.00 @:0 B@:127
2021-05-28 18:08:25,403 - Recv:  T:26.02 /0.00 B:60.14 /0.00 @:0 B@:0
2021-05-28 18:08:25,407 - Recv:  T:26.02 /0.00 B:59.96 /0.00 @:0 B@:0
2021-05-28 18:08:25,411 - Recv:  T:27.03 /0.00 B:59.67 /0.00 @:0 B@:127

See there's the pause of 2 mins or so, where it can't read the serial port. It's then followed by processing as many commands as possible, probably up to the outgoing buffer of the printer, then it is back to business as usual.

I have no idea what's going on, and how processing temperature responses while there is a PID tune going on becomes so difficult. MeatPack is not that complicated that it has chances for this kind of bug... Let's hope you can figure out something here 👍

Just as a note, the same issues happen even when the temperature responses are "muted" in the octoprint terminal tab, turning them on was the easiest way for me to watch it freeze.

cp2004 commented 3 years ago

Just as a note, the same issues happen even when the temperature responses are "muted" in the octoprint terminal tab, turning them on was the easiest way for me to watch it freeze.

Supressing temperature responses is purely a frontend filter, it doesn't have any impact on the backend.

GhostlyCrowd commented 3 years ago

NEw Pi4B nnew sd imaged the sd over. same issue so its definitely something in software i have no idea what though. Its an Octoprint thread thats running off i cant tell what its doing beyond that

scottmudge commented 3 years ago

Which version of Python are you using, and which distro of linux have you imaged?

GhostlyCrowd commented 3 years ago

I cloned the SD, so my last idea is the OS was corrupt. I was using the 0.18.0 base image updated to 1.6.1 from 1.5.2> Its Buster Rasbian base Pythong 3.7.3

Im doing the same fresh install now. It has to be some sort of os corruption or something. if it does it again after all new hardware and all software fresh then i have no idea :/

GhostlyCrowd commented 3 years ago

And i just finished getting it all set and its Still doing it, im at the end of my rope

I resorted an octoprint backup, it essentially just downloads all the plugins and a few configs for said plugins are restored. Id be willing to share this with you. thats the only fragment of my old install was i didnt want to reinstall everything one by one.

But with that said remember that it does this with everything disabled except meatpack.

GhostlyCrowd commented 3 years ago

Sorry for the rapid posting, I figured it out I still don't know if its a meatpack issue or a octoprint core bug but the octoprint devs on discord said "even if it is octoprint it only occurs with the plugin. so it's the plugin devs responsibility to debug it and either fix their code or provide relevant info to core."

I deleted the 5.4GB of stl files the user had in their ~/octoprint/uploads/ folder (default upload folder for octo) and now everything is fine.

Does meatpack some how interface with the stl files?

Either way I can also reproduce the issue restoring the backup with STL files every time. I am also willing to share that for debugging let me know. The Octoprint devs have no interest in looking into if this is their problem as it only occurs with meat pack enabled.

scottmudge commented 3 years ago

Meatpack is completely unaware of everything in OctoPrint except the serial connection. There are no API hooks or interfaces to anything that involves the file system, STL files, etc.

My only guess is that it was some kind of strange memory problem, perhaps an overflow of some sort that only manifested a problem with a particular memory layout created by the constellation of plugins w/ MeatPack active. Not typical of python to have those problems, but certainly not unheard of.

Glad you figured it out though. Can I close this issue?

GhostlyCrowd commented 3 years ago

I consider this resolved, although i feel like this might be an underlying issue with octoprint, but given the push back i got from the devs who responded to me on discord i dont think i can move this any further to a resolution.

Ultimately the amount, size, complexity and so on of stored files shouldn influence this plugin at all, and given your plugin does nothing of the sort with interacting with the files directly for any reasons during PID tuning one would deduce its a corner case fringe bug on octoprint core.

But im just an end user so ultimately i do not know.

Thanks for everything though!

skl111 commented 3 years ago

I am experiencing something similar, using and SKR Mini E3 V2.0 Ender 5 Pro, it crashes and the board reboots and the PIDTune never finishes. Tried with with different meatpack plugin options enabled, and then only left the music option enabled, crash reboot Disabled the plugin PID Tune works fine.

Send: M303 E0 C10 S240 Recv: PID Autotune start doesn't even get to temp before it crashes.

Adding more info

env.hardware.cores: 4 env.hardware.freq: 1500 env.hardware.ram: 3959304192 env.os.bits: 32 env.os.id: linux env.os.platform: linux env.plugins.pi_support.model: Raspberry Pi 4 Model B Rev 1.2 env.plugins.pi_support.octopi_version: 0.18.0 env.plugins.pi_support.throttle_state: 0x0 env.python.pip: 21.2.4 env.python.version: 3.7.3 env.python.virtualenv: true octoprint.safe_mode: false octoprint.version: 1.6.1 printer.firmware: Marlin bugfix-2.0.x (Sep 20 2021 16:36:41) systeminfo.generator: systemapi

Octopi 0.18.0 OctoPrint 1.6.1

Let me know if I should open a new issue, if you don't want to reopen this one.

scottmudge commented 3 years ago

The OctoPrint plugin would be unable to cause this behavior. It is probably a defect in firmware, or perhaps less likely in the OctoPrint installation, as GhostlyCrowd discovered.

Regardless, while disabling the plugin appears to affect the behavior, there is nothing inside the plugin itself (this repository) that would affect PID tuning.

I do not maintain the Marlin port of the MeatPack feature, so you may want to post an issue there if the OctoPrint logs don't report any obvious signs of errors. And given the firmware resets itself, my first guess would be a parsing error on the firmware side.

skl111 commented 3 years ago

@scottmudge Alright, I will look into doing that, as I do think it is a marlin bug, and I was unsure if you had anything to do with the meatpack feature in it. Other than PIDTune, I have had no problems, just disable it when I run PID tune and re-enable afterwards, knock on wood!

skl111 commented 2 years ago

FYI the OctoPi Itself goes offline via wireless network sometimes when this issue occurs and never comes back online at the same time the printer reboots until I power cycle the PI