thelastWallE / OctoprintKlipperPlugin

A plugin for a better integration of Klipper into OctoPrint.
GNU Affero General Public License v3.0

Eventual complete system crash, seemingly triggered by starting a print, due to badly handled config file additions #44

Closed codefaux closed 3 years ago

codefaux commented 3 years ago

OctoPrint version v1.5.3
OctoPi 0.17.0
RPi3 B+ r1.3
OctoKlipper (version 0.3.7....or is it 0.3.7.1....or is it 0.3.7.2....it's the last one, this boot)

OctoPrint, OctoKlipper, and everything associated with them were running flawlessly. I WAS using OctoKlipper 0.3.5 (EDIT: Seemingly may have been using 0.3.3? 0.3.2? I'm not sure anymore, but there was no syntax-related intrusion) and everything was grand. I managed my config file across multiple files myself, nobody stuck their nose in it, and my printer has run literally dozens of kilograms in that configuration over the last few months.

Then I updated OctoKlipper to 0.3.7 and it seemed okay...then 0.3.7.1 came out immediately and I went "oh no" and didn't update. Then 0.3.7.2 came out almost immediately again and I went "OH NO" and updated because I figured this time it HAD to be right, right?

No.

Now ONLY AFTER I START A PRINT (maybe in other cases, but in the typical "turn on, boot, load gcode, print" cycle it only happens AFTER I START A PRINT) OctoKlipper decides it's angry that my config file isn't right, and throws dialog boxes. I don't know how many of them; the RPi's CPU BUCKLES, my LCD stops responding, eventually OctoPrint stops printing but nothing even throws an error in the WebUI -- because it's frozen.

The error I see fading in before everything completely locks up is regarding some advanced TMC driver settings already existing. They've always existed, I don't know why OctoKlipper suddenly A) IS looking, B) NEEDS to be looking, C) cares so much it's destroying my prints, and D) is SPECIFICALLY looking AFTER I start a PRINT.

I'm reverting. I'll give you any other information I can, but the logs overfilled my filesystem and the spinloop made debugging it without a hard power cycle impossible; ssh wouldn't respond. I'm not letting it happen a third time (I NOTICED the second time) but I'll try to answer any further questions you might have.

My question right now, as a user with a stable system who neither needs nor wants help with their config file: What motivation do I have to update OctoKlipper at this time? Is there any improvement beyond "intrusive and questionably better config file handling" or is that the only one right now? I appreciate the motivations behind the effort, but I'm not into beta testing with my printer, so unless there's a need I'm reverting to OctoKlipper 0.3.5 and staying there.

Specifically, reverting because I literally cannot print in this condition.

thelastWallE commented 3 years ago

I can only reproduce this if the configs have bad code in them and Klipper reports the resulting errors over the terminal. OctoKlipper only checks whether the main config is okay when you save it. You need to revert to 0.3.3 to disable the parsing check of the config file. I would like to see any log file you can give; klippy.log would be good to see. Best regards, and sorry for any trouble you had.
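For readers wondering what a parsing check of this kind might look like, here is a minimal sketch. It is not OctoKlipper's actual code: it assumes the check can be approximated with Python's standard `configparser` in strict mode, and the function name `check_config` and the sample section are hypothetical. In strict mode, a setting repeated within one section raises `DuplicateOptionError`, whose message says the option "already exists".

```python
import configparser


def check_config(text):
    """Return None if the text parses cleanly, else a short error description.

    Hypothetical helper for illustration only; not taken from OctoKlipper.
    """
    # strict=True rejects duplicate sections and duplicate options
    # within a single source; interpolation is disabled so '%' is literal.
    parser = configparser.ConfigParser(strict=True, interpolation=None)
    try:
        parser.read_string(text)
    except configparser.Error as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


# Made-up sample config in Klipper's "key: value" style.
good = "[tmc2209 stepper_x]\nuart_pin: PB1\nrun_current: 0.8\n"
# Same config with one setting repeated in the same section.
bad = good + "run_current: 0.9\n"

print(check_config(good))
print(check_config(bad))
```

A check like this only looks at one file's text, so it says nothing about how settings spread across several included files are handled.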

codefaux commented 3 years ago

We'll start by saying I apologize for my tone. Yesterday was an 'everything keeps breaking' day and I really just wanted to print some stuff for a project and pull a win.

Yeah I don't understand. Today I can't reproduce it, and yesterday it happened four times and I could use the Plugin Manager to switch between versions 0.3.7.2 and 0.3.5 (I was using 0.3.5 as a 'revert' version despite apparently wanting 0.3.3) to trigger the bug. Every single time, after starting the print, I would see several dialogs pop up about duplicate settings, all would close except one, and the host CPU load would shoot up to 100%, holding there while the print got more and more jittery until it stopped. I rebooted between each version change, I even did a full power cycle.

I intended to collect logs for you now. This time, I can't get it to bug out though. I get -one- message about unsaved config (while not saving config, just looking around in the settings windows) and -one- about a duplicate entry.

I have one mistake in my config; I accidentally left a PWM_AUTOGRAD setting uncommented in an axis I had disabled while awaiting parts. Kudos on that. My bad. Obviously that hadn't been and isn't likely to be causing this, but I fixed it nonetheless.

The mystifying part is, I -verified- before raising the issue. I try not to be that guy. Today I was; I apologize for that. Thank you for your work on the plugin, and your tolerance.

As a suggestion going forward -- consider it a feature request -- if you add new functionality in a future update, consider adding a config branch to disable it, and make it opt-in, instead of "this is your reality now."

Part of my overdriven irritation (over the last few weeks) has been several software suites updating, adding features that break things, and making them the default. As a developer, when adding a new thing, let the user decide if they want it, when at all possible. Even with the best of intentions, edge cases (my home turf) are overlooked and new features with the best intent often step on my tail.

thelastWallE commented 3 years ago

I plan to add a checkbox for the parsing check. The versions in the update plugin of OctoPrint are the release channels.
The Release Candidate channel is for testing, so it already had these new features in it.
I will add text to OctoKlipper's popups to distinguish them from other plugins' and built-in popups. I don't get how one would pop up while just browsing the settings dialog. It would help to see the klippy.log. If there are // lines in the terminal from Klipper, they would trigger the popups. The uncommented line would have triggered Klipper.
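To illustrate the trigger described above, here is a rough sketch of picking `//` lines out of terminal output. It is an assumption-laden illustration, not the plugin's implementation: the function name `popup_messages` is hypothetical, the sample lines are made up, and collapsing repeated messages is an extra idea added here rather than something the plugin is known to do.

```python
def popup_messages(terminal_lines):
    """Collect the text of '//' lines, keeping each distinct message once.

    Hypothetical helper for illustration only; not taken from OctoKlipper.
    """
    seen = set()
    messages = []
    for line in terminal_lines:
        stripped = line.strip()
        # Only lines beginning with '//' are treated as popup triggers.
        if not stripped.startswith("//"):
            continue
        text = stripped[2:].strip()
        # Skip empty messages and ones that were already collected.
        if text and text not in seen:
            seen.add(text)
            messages.append(text)
    return messages


# Made-up terminal output with one repeated '//' message.
sample = [
    "ok",
    "// Option 'x' already exists",
    "// Option 'x' already exists",
    "ok T:20",
]
print(popup_messages(sample))
```

Deduplicating this way would show one dialog per distinct message instead of one per line, which is one possible way to keep a burst of identical messages from piling up.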

codefaux commented 3 years ago

I appreciate the considerations going forward.

I don't have logs, and as I said I cannot replicate the issue.

Dismiss this report as unable to reproduce, as the single issue that stuck around has an obvious cause and did not contribute to whatever happened.

Thank you.