Z-Bolt / OctoScreen

LCD touch interface for our OctoPrint
GNU Affero General Public License v3.0
722 stars 118 forks source link

Octoscreen resets every few minutes #75

Closed nicwilson58 closed 3 years ago

nicwilson58 commented 4 years ago

I have this installed and working on a Genuine Rasberry Pi 3B+ with a 5" TFT touchscreen. Everything works yet every few minutes, sometimes longer, the octoprint program resets, I get a very brief view of a "VNCConnect" screen then the Octoprint screen resumes again working. The same thing will then occur a couple of minutes later, sometimes longer.

NOTE: its only octoprint program resetting not the raspberry PI, Octoprint remains running and camera feed remains running. What can I do to provide you with more information?

Cheers NicW

redwiz666 commented 4 years ago

Installed Octoscreen tonight and I have the same problem. Do you have any luck figuring out whats causing it?

nicwilson58 commented 4 years ago

Hi,not yet I installed the whole thing again from scratch just to see but its the same,  oddly enough sometimes it does not reset for ages,  then will do it several times in a row.   I am running an original adafruit 2.5A power supply so its not power so not sure why it does it.

-------- Original Message -------- Subject: Re: [Z-Bolt/OctoScreen] Octoscreen resets every few minutes (#75) From: redwiz666 notifications@github.com Date: Tue, January 14, 2020 1:02 am To: Z-Bolt/OctoScreen OctoScreen@noreply.github.com Cc: nicwilson58 nic@nicwilson.com, Author author@noreply.github.com

Installed Octoscreen tonight and I have the same problem. Do you have any luck figuring out whats causing it? —You are receiving this because you authored the thread.Reply to this email directly, view it on GitHub, or unsubscribe.

redwiz666 commented 4 years ago

Looks like i found the issue. xset is not installed on the distro. Install x11-xserver-utils and it will fix it.

nicwilson58 commented 4 years ago

Mine already had that so it’s not my issue

Sent from my iPhone

On 14 Jan 2020, at 12:15 pm, redwiz666 notifications@github.com wrote:

 Looks like i found the issue. xset is not installed on the distro. Install x11-xserver-utils and it will fix it.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.

nicwilson58 commented 4 years ago

is anyone monitoring this project at all? I may need to uninstall this and go back to TouchUI if there is no support for it. I have rebuilt the Octoprint and Octoscreen from scratch three times now and it all works including Octoscreen but it just randomly resets the Octoscreen, I see a quick flash of the VNC Connect window then Octoscreen loads again and runs for a random amount of time between a minute and an hour before resetting again, OctoPrint itself its very stable.

EvilGremlin commented 4 years ago

Dude, at least show us some logs.

nicwilson58 commented 4 years ago

Dude, at least explain how

From: EvilGremlin notifications@github.com Sent: Friday, 17 January 2020 3:44 PM To: Z-Bolt/OctoScreen OctoScreen@noreply.github.com Cc: nicwilson58 nic@nicwilson.com; Author author@noreply.github.com Subject: Re: [Z-Bolt/OctoScreen] Octoscreen resets every few minutes (#75)

Dude, at least show us some logs.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/Z-Bolt/OctoScreen/issues/75?email_source=notifications&email_token=AN7NSLOCYID7TN2NF27FAJDQ6FARPA5CNFSM4KFVQ5P2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEJGQTGA#issuecomment-575474072 , or unsubscribe https://github.com/notifications/unsubscribe-auth/AN7NSLMIH3IF7HXIRO4HM5LQ6FARPANCNFSM4KFVQ5PQ . https://github.com/notifications/beacon/AN7NSLMGGBSML2GWYNPDAODQ6FARPA5CNFSM4KFVQ5P2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEJGQTGA.gif

nicwilson58 commented 4 years ago

The first post asked what could be done to give you more information it was ignored.

From: EvilGremlin notifications@github.com Sent: Friday, 17 January 2020 3:44 PM To: Z-Bolt/OctoScreen OctoScreen@noreply.github.com Cc: nicwilson58 nic@nicwilson.com; Author author@noreply.github.com Subject: Re: [Z-Bolt/OctoScreen] Octoscreen resets every few minutes (#75)

Dude, at least show us some logs.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/Z-Bolt/OctoScreen/issues/75?email_source=notifications&email_token=AN7NSLOCYID7TN2NF27FAJDQ6FARPA5CNFSM4KFVQ5P2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEJGQTGA#issuecomment-575474072 , or unsubscribe https://github.com/notifications/unsubscribe-auth/AN7NSLMIH3IF7HXIRO4HM5LQ6FARPANCNFSM4KFVQ5PQ . https://github.com/notifications/beacon/AN7NSLMGGBSML2GWYNPDAODQ6FARPA5CNFSM4KFVQ5P2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEJGQTGA.gif

EvilGremlin commented 4 years ago

for starts: systemctl status octoscreen.service also specify log location in /etc/octoscreen/config

nicwilson58 commented 4 years ago

Not much in the log, but here is a log entry from when a reset occurs.

time="2020-01-18T15:29:22+10:00" level=info msg=10938 file=print_status.go func="ui.(printStatusPanel).updateJob" line=277 time="2020-01-18T15:29:24+10:00" level=info msg=10940 file=print_status.go func="ui.(printStatusPanel).updateJob" line=277 time="2020-01-18T15:29:26+10:00" level=info msg=10942 file=print_status.go func="ui.(printStatusPanel).updateJob" line=277 time="2020-01-18T15:29:28+10:00" level=info msg=10944 file=print_status.go func="ui.(printStatusPanel).updateJob" line=277 time="2020-01-18T15:29:30+10:00" level=info msg=10946 file=print_status.go func="ui.(printStatusPanel).updateJob" line=277 time="2020-01-18T15:29:40+10:00" level=info msg="OctoPrint's config file found: \"/home/pi/.octoprint/config.yaml\"" file=main.go func=main.readConfig$ time="2020-01-18T15:29:40+10:00" level=info msg="Using \"http://localhost:5000\" as server address" file=main.go func=main.init.0 line=52 time="2020-01-18T15:29:41+10:00" level=debug msg="New background task started" file=common.go func="ui.(BackgroundTask).Start" line=136 time="2020-01-18T15:29:42+10:00" level=error msg="unexpected status code: 404" file=ui.go func="ui.(UI).loadSettings" line=194 time="2020-01-18T15:29:42+10:00" level=info msg="Printing a job" file=ui.go func="ui.(UI).verifyConnection" line=169 time="2020-01-18T15:29:42+10:00" level=debug msg="New background task started" file=common.go func="ui.(BackgroundTask).Start" line=136 time="2020-01-18T15:29:42+10:00" level=info msg=10958 file=print_status.go func="ui.(printStatusPanel).updateJob" line=277 time="2020-01-18T15:29:44+10:00" level=info msg=10960 file=print_status.go func="ui.(printStatusPanel).updateJob" line=277 time="2020-01-18T15:29:46+10:00" level=info msg=10962 file=print_status.go func="ui.(printStatusPanel).updateJob" line=277 time="2020-01-18T15:29:48+10:00" level=info msg=10964 file=print_status.go func="ui.(printStatusPanel).updateJob" line=277 time="2020-01-18T15:29:50+10:00" level=info msg=10966 file=print_status.go func="ui.(printStatusPanel).updateJob" line=277 time="2020-01-18T15:29:52+10:00" level=info msg=10968 file=print_status.go func="ui.(*printStatusPanel).updateJob" line=277

gdachs commented 4 years ago

@nicwilson58 what do you mean with "not much in the log"? For me it explains everything. time="2020-01-18T15:29:42+10:00" level=error msg="unexpected status code: 404" file=ui.go func="ui.(*UI).loadSettings" line=194 Obviously OctoPrint is not running, or can't be reached.

gdachs commented 4 years ago

Now that it is clear that is not an OctoScreen issue you can even close this here.

morgendagen commented 4 years ago

I am having the same issue, and do not believe it is due to the 404 (naturally, I may be mistaken). Below is the combined output from syslog and octoprint.log. Before and after these log lines are more log lines from updateJob.

==> /var/log/syslog <==
Jan 20 22:59:43 octopi xinit[1240]: time="2020-01-20T22:59:43+01:00" level=info msg=4477 file=print_status.go func="ui.(*printStatusPanel).updateJob" line=277
Jan 20 22:59:45 octopi xinit[1240]: time="2020-01-20T22:59:45+01:00" level=info msg=4479 file=print_status.go func="ui.(*printStatusPanel).updateJob" line=277
Jan 20 22:59:55 octopi systemd[1]: octoscreen.service: Watchdog timeout (limit 10s)!
Jan 20 22:59:55 octopi systemd[1]: octoscreen.service: Killing process 1240 (xinit) with signal SIGABRT.
Jan 20 22:59:55 octopi xinit[1240]: (EE)
Jan 20 22:59:55 octopi xinit[1240]: (EE) Backtrace:
Jan 20 22:59:55 octopi xinit[1240]: (EE)
Jan 20 22:59:55 octopi xinit[1240]: (EE) Received signal 6 sent by process 1, uid 0
Jan 20 22:59:55 octopi xinit[1240]: (EE)
Jan 20 22:59:55 octopi xinit[1240]: Fatal server error:
Jan 20 22:59:55 octopi xinit[1240]: (EE) Caught signal 6 (Aborted). Server aborting
Jan 20 22:59:55 octopi xinit[1240]: (EE)
Jan 20 22:59:55 octopi xinit[1240]: (EE)
Jan 20 22:59:55 octopi xinit[1240]: Please consult the The X.Org Foundation support
Jan 20 22:59:55 octopi xinit[1240]: #011 at http://wiki.x.org
Jan 20 22:59:55 octopi xinit[1240]:  for help.
Jan 20 22:59:55 octopi xinit[1240]: (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
Jan 20 22:59:55 octopi xinit[1240]: (EE)
Jan 20 22:59:55 octopi systemd[1]: octoscreen.service: Killing process 1241 (Xorg) with signal SIGABRT.
Jan 20 22:59:55 octopi systemd[1]: octoscreen.service: Killing process 1252 (OctoScreen) with signal SIGABRT.
Jan 20 22:59:55 octopi systemd[1]: octoscreen.service: Main process exited, code=killed, status=6/ABRT
Jan 20 22:59:55 octopi xinit[1240]: (EE) Server terminated with error (1). Closing log file.
Jan 20 22:59:55 octopi systemd[1]: octoscreen.service: Failed with result 'watchdog'.
Jan 20 22:59:55 octopi systemd[1]: octoscreen.service: Service RestartSec=100ms expired, scheduling restart.
Jan 20 22:59:55 octopi systemd[1]: octoscreen.service: Scheduled restart job, restart counter is at 4.
Jan 20 22:59:55 octopi systemd[1]: Stopped OctoScreen.
Jan 20 22:59:55 octopi systemd[1]: Starting OctoScreen...
Jan 20 22:59:55 octopi xinit[1529]: X.Org X Server 1.20.4
Jan 20 22:59:55 octopi xinit[1529]: X Protocol Version 11, Revision 0
Jan 20 22:59:55 octopi xinit[1529]: Build Operating System: Linux 4.15.0-48-generic armv8l Raspbian
Jan 20 22:59:55 octopi xinit[1529]: Current Operating System: Linux octopi 4.19.75-v7+ #1270 SMP Tue Sep 24 18:45:11 BST 2019 armv7l
Jan 20 22:59:55 octopi xinit[1529]: Kernel command line: coherent_pool=1M 8250.nr_uarts=1 bcm2708_fb.fbwidth=800 bcm2708_fb.fbheight=600 bcm2708_fb.fbswap=1 vc_mem.mem_base=0x3ec00000 vc_mem.mem_size=0x40000000  dwc_otg.lpm_enable=0 console=tty1 console=ttyS0,115200 root=/dev/mmcblk0p2 rootfstype=ext4 elevator=deadline rootwait fbcon=map:10 fbcon=font:ProFont6x11 logo.nologo
Jan 20 22:59:55 octopi xinit[1529]: Build Date: 05 June 2019  12:49:54PM
Jan 20 22:59:55 octopi xinit[1529]: xorg-server 2:1.20.4-1+rpt1 (https://www.debian.org/support)
Jan 20 22:59:55 octopi xinit[1529]: Current version of pixman: 0.36.0
Jan 20 22:59:55 octopi xinit[1529]: #011Before reporting problems, check http://wiki.x.org
Jan 20 22:59:55 octopi xinit[1529]: #011to make sure that you have the latest version.
Jan 20 22:59:55 octopi xinit[1529]: Markers: (--) probed, (**) from config file, (==) default setting,
Jan 20 22:59:55 octopi xinit[1529]: #011(++) from command line, (!!) notice, (II) informational,
Jan 20 22:59:55 octopi xinit[1529]: #011(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
Jan 20 22:59:55 octopi xinit[1529]: (==) Log file: "/var/log/Xorg.0.log", Time: Mon Jan 20 22:59:55 2020
Jan 20 22:59:55 octopi xinit[1529]: (==) Using config directory: "/etc/X11/xorg.conf.d"
Jan 20 22:59:55 octopi xinit[1529]: (==) Using system config directory "/usr/share/X11/xorg.conf.d"
Jan 20 22:59:55 octopi xinit[1529]: modprobe: FATAL: Module g2d_23 not found in directory /lib/modules/4.19.75-v7+
Jan 20 22:59:56 octopi xinit[1529]: time="2020-01-20T22:59:56+01:00" level=info msg="OctoPrint's config file found: \"/home/pi/.octoprint/config.yaml\"" file=main.go func=main.readConfig line=102
Jan 20 22:59:56 octopi xinit[1529]: time="2020-01-20T22:59:56+01:00" level=info msg="Using \"http://localhost:5000\" as server address" file=main.go func=main.init.0 line=52
Jan 20 22:59:56 octopi xinit[1529]: time="2020-01-20T22:59:56+01:00" level=info msg="Found API key at \"/home/pi/.octoprint/config.yaml\" file" file=main.go func=main.init.0 line=58
Jan 20 22:59:57 octopi xinit[1529]: time="2020-01-20T22:59:57+01:00" level=debug msg="New background task started" file=common.go func="ui.(*BackgroundTask).Start" line=136

==> octoprint.log <==
2020-01-20 22:59:57,333 - tornado.access - WARNING - 404 POST /api/plugin/zbolt_octoscreen (127.0.0.1) 14.02ms

==> /var/log/syslog <==
Jan 20 22:59:57 octopi xinit[1529]: time="2020-01-20T22:59:57+01:00" level=error msg="unexpected status code: 404" file=ui.go func="ui.(*UI).loadSettings" line=194

==> octoprint.log <==
2020-01-20 22:59:57,351 - tornado.access - WARNING - 404 POST /api/plugin/zbolt_octoscreen (127.0.0.1) 12.57ms

==> /var/log/syslog <==
Jan 20 22:59:57 octopi xinit[1529]: time="2020-01-20T22:59:57+01:00" level=info msg="Printing a job" file=ui.go func="ui.(*UI).verifyConnection" line=169
Jan 20 22:59:57 octopi xinit[1529]: time="2020-01-20T22:59:57+01:00" level=debug msg="New background task started" file=common.go func="ui.(*BackgroundTask).Start" line=136
Jan 20 22:59:57 octopi xinit[1529]: time="2020-01-20T22:59:57+01:00" level=info msg=4490 file=print_status.go func="ui.(*printStatusPanel).updateJob" line=277
Jan 20 22:59:59 octopi xinit[1529]: time="2020-01-20T22:59:59+01:00" level=info msg=4493 file=print_status.go func="ui.(*printStatusPanel).updateJob" line=277
Jan 20 23:00:01 octopi xinit[1529]: time="2020-01-20T23:00:01+01:00" level=info msg=4495 file=print_status.go func="ui.(*printStatusPanel).updateJob" line=277
Jan 20 23:00:03 octopi xinit[1529]: time="2020-01-20T23:00:03+01:00" level=info msg=4497 file=print_status.go func="ui.(*printStatusPanel).updateJob" line=277
Jan 20 23:00:05 octopi xinit[1529]: time="2020-01-20T23:00:05+01:00" level=info msg=4498 file=print_status.go func="ui.(*printStatusPanel).updateJob" line=277
Jan 20 23:00:07 octopi systemd[1]: Started OctoScreen.
Jan 20 23:00:07 octopi xinit[1529]: time="2020-01-20T23:00:07+01:00" level=info msg=4501 file=print_status.go func="ui.(*printStatusPanel).updateJob" line=277
Jan 20 23:00:09 octopi xinit[1529]: time="2020-01-20T23:00:09+01:00" level=info msg=4503 file=print_status.go func="ui.(*printStatusPanel).updateJob" line=277
gdachs commented 4 years ago

@morgendagen beside the number 404 this has nothing in common. @nicwilson58 has this error when OctoScreen tries to access Octoprint. In your log tornado in OctoPrint tries to access OctoScreen.

morgendagen commented 4 years ago

@gdachs OK, I'll open a new issue, although the symptoms are exactly the same (OctoScreen restarting every few minutes).

hunagyp commented 4 years ago

just a note: I had such 404 + octoscreen continuous restart issue when I first setup octoprint (+klipper) and octoscreen. Although not during prints jobs. It happened for me right at the startup of Octoscreen. Octoscreen started, timeout was reached then automatically restarted - in an endless loop... For me the solution was a settings change in Octoprint. Serial connection -> Behavior tab -> "What to do on firmware error..." and select "Cancel any ongoing..." option. It was also mentioned klipper's install doc https://github.com/KevinOConnor/klipper/blob/master/docs/Installation.md

In the Settings tab, navigate to the "Behavior" sub-tab and select the "Cancel any ongoing prints but stay connected to the printer" option. Click "Save".

Maybe it helps for others? Maybe with Marlin too? I do not know. Anyone can try.

nicwilson58 commented 4 years ago

I believe you were too quick to say close this as its clear its not an Octoscreen issue, as there certainly is an issue with Octoscreen. I have finished a 2-day print with Octoprint which went flawlessly, I would not usually use octoprint for such a long print, but did just to see if it had issues which it did not at all, Octoscreen however reset itself periodically through the two days. I then removed Octoscreen and installed TouchUI and did another 1 day print, both Octoprint and TouchUI were flawless. So I then took a new SD card installed Octoprint and Octoscreen from scratch using the packages and ran another test, once again Octoprint just works without issue, and Octoscreen installed from your package resets itself periodically.

nicwilson58 commented 4 years ago

I fixed one of the errors:

time="2020-01-18T15:29:40+10:00" level=info msg="OctoPrint's config file found: "/home/pi/.octoprint/config.yaml"" file=main.go func=main.readConfig$

The file permissions were set correctly, and the file was there and accessible, so I tried setting its permissions for everyone and this error is now gone and the log reports the file is found.

the other error:

time="2020-01-18T15:29:42+10:00" level=error msg="unexpected status code: 404" file=ui.go func="ui.(*UI).loadSettings" line=194

still escapes me.

manbehindthemadness commented 4 years ago

If we look here: file=ui.go func="ui.(*UI).loadSettings" line=194

func (m *UI) loadSettings() { n, err := (&octoprint.GetSettingsRequest{}).Do(m.Printer) if err != nil { Logger.Error(err) return } m.Settings = n

We see a simple function that is trying to load the contents of the octoprint settings file, and if an error occurs to raise it to the log. The status code 404 is universal for "not found".

Now if we look at the octoprint log we will see: tornado.access - WARNING - 404 POST /api/plugin/zbolt_octoscreen (::1) 51.97ms

We can test and confirm this problem using: curl -D - -H 'X-Api-Key: YOUR_KEY' http://localhost:5000/api/plugin/zbolt_octoscreen Output: HTTP/1.1 404 NOT FOUND X-Robots-Tag: noindex, nofollow, noimageindex Content-Length: 9 Content-Type: text/html; charset=utf-8 X-Clacks-Overhead: GNU Terry Pratchett Cache-Control: max-age=0

So it's clear that Octoscreen is attempting to reach out to an octoprint plugin that's not installed. A quick Google search reveals: https://github.com/Z-Bolt/OctoPrint-Z-Bolt-OctoScreen so we clone and install: git clone https://github.com/Z-Bolt/OctoPrint-Z-Bolt-OctoScreen.git source ~/OctoPrint/venv/bin/activate # Wherever your OP virtualenv is located cd OctoPrint-Z-Bolt-OctoScreen python setup.py install sudo systemctl restart octoprint.service

Enjoy.

manbehindthemadness commented 4 years ago

To further this issue, after the initial problem was solved it seems that after startup the plugin causes repeated connection timeouts for several minutes even after octoprint has come online.

time="2020-01-30T22:39:06Z" level=error msg="Post http://localhost:5000/api/plugin/zbolt_octoscreen: net/http: request canceled (Client.Timeout exceeded while awaiting headers)" file=ui.go func="ui.(UI).loadSettings" line=194 time="2020-01-30T22:39:09Z" level=debug msg="Unexpected error: Get http://localhost:5000/api/connection: net/http: request canceled (Client.Timeout exceeded while awaiting headers)" file=ui.go func="ui.(UI).verifyConnection" line=153

I feel the only way to gracefully solve this is to have the connection logic gracefully wait and retry instead of cratering the process and reloading after each failure.

::: Update ::: This was due to a local software issue and I have resolved it; However, I believe the statement regarding graceful reconnects instead of re-loading is still valid.

nicwilson58 commented 4 years ago

thank you for this, I have installed the plugin as you mentioned.

boardthatpowder commented 4 years ago

Fixed it for me too. Thanks!!

manbehindthemadness commented 4 years ago

Sadly I notice that with the Octoprint-Octoscreen plugin running the CPU consumption of Octoprint more than doubles causing serial timeouts during prints. This might only apply to beaglebones as they are single cored. Ill post if I can find a workaround.

nicwilson58 commented 4 years ago

Sadly I notice that with the Octoprint-Octoscreen plugin running the CPU consumption of Octoprint more than doubles causing serial timeouts during prints. This might only apply to beaglebones as they are single cored. Ill post if I can find a workaround.

Just for information, I am running it on a Pi 3 B+ and so far its working fine.

manbehindthemadness commented 4 years ago

https://github.com/Z-Bolt/OctoPrint-Z-Bolt-OctoScreen/issues/3

manbehindthemadness commented 4 years ago

The performance degradation has crippled my machine entirely. This must be a BBB specific issue; however, I cannot figure out where all the resources are going.

nicwilson58 commented 4 years ago

For your info the resources use of the 3B+ while being used are: Processor 8-19% Memory 34% core temp 61c speed 1.4Ghz base clock 0.6GHz Processes 122 Threads 4

synman commented 4 years ago

I have serious doubts about the octoscreen plugin resolving this issue... all the code I've reviewed indicates you're seeing a placebo effect and it's just a matter of time until your instance hangs again.

FWIW, I've yet to completely resolve this issue for my own installation. At this very moment I have a ton of debug statements in the background threads waiting for a failure.

Feb  1 13:09:45 octopi4 xinit[5542]: time="2020-02-01T13:09:45-05:00" level=info msg="Updating watchdog" file=ui.go func="ui.(*UI).verifyConnection" line=124
Feb  1 13:09:45 octopi4 xinit[5542]: time="2020-02-01T13:09:45-05:00" level=info msg="Watchdog updated" file=ui.go func="ui.(*UI).verifyConnection" line=126
Feb  1 13:09:45 octopi4 xinit[5542]: time="2020-02-01T13:09:45-05:00" level=info msg="getting printer status" file=ui.go func="ui.(*UI).verifyConnection" line=131
Feb  1 13:09:45 octopi4 xinit[5542]: time="2020-02-01T13:09:45-05:00" level=info msg="got printer status" file=ui.go func="ui.(*UI).verifyConnection" line=133
Feb  1 13:09:45 octopi4 xinit[5542]: time="2020-02-01T13:09:45-05:00" level=info msg="newState == oldState" file=ui.go func="ui.(*UI).verifyConnection" line=166
Feb  1 13:09:45 octopi4 xinit[5542]: time="2020-02-01T13:09:45-05:00" level=info msg="connection verified" file=ui.go func="ui.(*UI).update" line=233
Feb  1 13:09:45 octopi4 xinit[5542]: time="2020-02-01T13:09:45-05:00" level=info msg="updating temperature" file=print_status.go func="ui.(*printStatusPanel).updateTemp
Feb  1 13:09:45 octopi4 xinit[5542]: time="2020-02-01T13:09:45-05:00" level=info msg="temperature updated" file=print_status.go func="ui.(*printStatusPanel).updateTempe
Feb  1 13:09:45 octopi4 xinit[5542]: time="2020-02-01T13:09:45-05:00" level=info msg="updating job state" file=print_status.go func="ui.(*printStatusPanel).updateJob" l
Feb  1 13:09:45 octopi4 xinit[5542]: time="2020-02-01T13:09:45-05:00" level=info msg="job state updated" file=print_status.go func="ui.(*printStatusPa
manbehindthemadness commented 4 years ago

Yeah I noticed that on my end as well, I am curious what version of OctoPrint everyone is using, I am on 1.3.12 Master I did some experimentation and it seems that this happens when I enable any API enabled plugin. If I just stick to the defaults plus OctoKlippy it's solid as a rock.

synman commented 4 years ago

I'm pulling master, building from source, and dpkg installing the resulting deb package....

No more OctoPrint OOMs / restarts while inactive issues for me with the hacks I mention (basically just commenting out some code) in the other issue thread.

Pi 4 B here ... no overclocking ... fresh buster (octopi) image as my base.

It's so freaking annoying. I'm printing widgets / spinners waiting for this thing to hang again and it's just trucking along.

manbehindthemadness commented 4 years ago

What does the OctoScreen plugin actually do? The interface seems to run thin and clean and functions just fine (except the issue in the thread topic) without it.

synman commented 4 years ago

It looks like the basis for a settings interface with some theming work intertwined (that currently breaks any of your own theming or theming plugins like themify).

It provides an API back to OctoScreen for retrieving the values of its settings and notifications interfaces.

The settings are:

So basically just being able to override some OctoScreen behaviors via the OctoPrint Settings UI, something specific to a tool changer (possibly part of the z-bolt reference printer), and rearranging / restoring the default OctoScreen UI elements flow / structure if you've rewired them via custom (octoprint-tft) styling maybe?

All very much still a work in progress it seems.

And the notifications interface appears to be tied specifically to tool change events... which for those of us that only have one extruder is pretty pointless. I suspect this has much to do with supporting custom Z-Bolt printers / etc (maybe with some GCODE traps -- although I didn't find any in the plugin source... just the interface for sending the notifications to octoscreen:

        self._message = {
            "time": time(),
            "text": message['text'],
            "title": message['title']
        }

I've yet to find a single line. of code would indicate how this plugin could just magically fix hangs we're seeing.

synman commented 4 years ago

It's snippets like this that tell me this is still a work in progress (I like his humor):

        self._message = {
            "time": time(),
            "text": "Wake up, Neo...\nThe Matrix has you...\nFollow the white rabbit.\n\n\nKnock, Knock, Neo."
        }

and

    def on_api_get(self, request):
        return flask.jsonify(printer_name="test2")

suggests he's still tinkering :)

manbehindthemadness commented 4 years ago

I see what you are saying, I wonder just as an experiment if we could just comment out the code where OctoScreen reaches out to the plugin and see if that changes behavior.

boardthatpowder commented 4 years ago

Earlier I reported that the octoscreen plugin had fixed my issue. Resets are still occurring, just far and few between now. Previously where it was as frequent as every few minutes, it’s now a couple of times per hr.

manbehindthemadness commented 4 years ago

I noticed that as well, but in my case it's trying to hit the OctoprintAPI often enough that on occasion system load is causing it to time out and crash.

synman commented 4 years ago

I've got all the plugin calls commented out.... got hang about an hour ago (about 10 minutes prior to a widget print finishing):

Feb  1 13:53:43 octopi4 xinit[5542]: time="2020-02-01T13:53:43-05:00" level=info msg="verifying connection" file=ui.go func="ui.(*UI).update" line=231
Feb  1 13:53:43 octopi4 xinit[5542]: time="2020-02-01T13:53:43-05:00" level=info msg="Updating watchdog" file=ui.go func="ui.(*UI).verifyConnection" line=124
Feb  1 13:53:43 octopi4 xinit[5542]: time="2020-02-01T13:53:43-05:00" level=info msg="Watchdog updated" file=ui.go func="ui.(*UI).verifyConnection" line=126
Feb  1 13:53:43 octopi4 xinit[5542]: time="2020-02-01T13:53:43-05:00" level=info msg="getting printer status" file=ui.go func="ui.(*UI).verifyConnection" line=131
Feb  1 13:53:43 octopi4 xinit[5542]: time="2020-02-01T13:53:43-05:00" level=info msg="got printer status" file=ui.go func="ui.(*UI).verifyConnection" line=133
Feb  1 13:53:43 octopi4 xinit[5542]: time="2020-02-01T13:53:43-05:00" level=info msg="newState == oldState" file=ui.go func="ui.(*UI).verifyConnection" line=166
Feb  1 13:53:43 octopi4 xinit[5542]: time="2020-02-01T13:53:43-05:00" level=info msg="connection verified" file=ui.go func="ui.(*UI).update" line=233
Feb  1 13:53:43 octopi4 xinit[5542]: time="2020-02-01T13:53:43-05:00" level=info msg="updating temperature" file=print_status.go func="ui.(*printStatusPanel).updateTemp
Feb  1 13:53:43 octopi4 xinit[5542]: time="2020-02-01T13:53:43-05:00" level=info msg="temperature updated" file=print_status.go func="ui.(*printStatusPanel).updateTempe
Feb  1 13:53:43 octopi4 xinit[5542]: time="2020-02-01T13:53:43-05:00" level=info msg="updating job state" file=print_status.go func="ui.(*printStatusPanel).updateJob" l
Feb  1 13:53:43 octopi4 xinit[5542]: time="2020-02-01T13:53:43-05:00" level=info msg="job state updated" file=print_status.go func="ui.(*printStatusPanel).updateJob" li
Feb  1 13:54:43 octopi4 systemd[1]: octoscreen.service: Watchdog timeout (limit 1min)!

I have every octoprint API call surrounded with debug output. The hang is not happening inside any of the background updates. It's happening within Go itself.

The ui.go background thread failed to fire. This guy:

    ui.b = NewBackgroundTask(time.Second*2, ui.update)

update()calls verifyConnection()and verifyConnection()updates watchdog. You can see above without any rhyme or reason the background task just stopped firing.

synman commented 4 years ago

I wasn't much of a Go programmer before this little exercise, but I may just put it on my resume by the time I'm done here.

Currently researching

type BackgroundTask struct {
    close chan bool

    d       time.Duration
    task    func()
    running bool
    sync.Mutex
}

func NewBackgroundTask(d time.Duration, task func()) *BackgroundTask {
    return &BackgroundTask{
        task: task,
        d:    d,

        close: make(chan bool, 1),
    }
}

func (t *BackgroundTask) Start() {
    t.Lock()
    defer t.Unlock()

    Logger.Info("New background task started")
    go t.loop()

    t.running = true
}

func (t *BackgroundTask) Close() {
    Logger.Info("Background task closing")
    t.Lock()
    defer t.Unlock()
    if !t.running {
        return
    }

    t.close <- true
    t.running = false
}

func (t *BackgroundTask) loop() {
    t.execute()

    ticker := time.NewTicker(t.d)
    defer ticker.Stop()
    for {
        select {
        case <-ticker.C:
            t.execute()
        case <-t.close:
            Logger.Info("Background task closed")
            return
        }
    }
}

func (t *BackgroundTask) execute() {
    _, err := glib.IdleAdd(t.task)
    if err != nil {
        log.Fatal("IdleAdd() failed:", err)
    }
}

I wonder if we have a glib issue...

manbehindthemadness commented 4 years ago

Yeah I feel you, python man myself.

synman commented 4 years ago

I'm going to guess that the main thread is hung. Notice how task execute and IdleAdd keep running every two seconds but the actual tasks (verifyConnection & updateJob) don't run?

glib IdleAdd inserts a tasks into a processes main thread. Guess I need to go hunt that down now.

Feb  1 17:41:19 octopi4 xinit[15799]: time="2020-02-01T17:41:19-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:19 octopi4 xinit[15799]: time="2020-02-01T17:41:19-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:19 octopi4 xinit[15799]: time="2020-02-01T17:41:19-05:00" level=info msg="verifying connection" file=ui.go func="ui.(*UI).update" line=231
Feb  1 17:41:19 octopi4 xinit[15799]: time="2020-02-01T17:41:19-05:00" level=info msg="Updating watchdog" file=ui.go func="ui.(*UI).verifyConnection" line=124
Feb  1 17:41:19 octopi4 xinit[15799]: time="2020-02-01T17:41:19-05:00" level=info msg="Watchdog updated" file=ui.go func="ui.(*UI).verifyConnection" line=126
Feb  1 17:41:19 octopi4 xinit[15799]: time="2020-02-01T17:41:19-05:00" level=info msg="getting printer status" file=ui.go func="ui.(*UI).verifyConnection" line=131
Feb  1 17:41:19 octopi4 xinit[15799]: time="2020-02-01T17:41:19-05:00" level=info msg="got printer status" file=ui.go func="ui.(*UI).verifyConnection" line=133
Feb  1 17:41:19 octopi4 xinit[15799]: time="2020-02-01T17:41:19-05:00" level=info msg="newState == oldState" file=ui.go func="ui.(*UI).verifyConnection" line=166
Feb  1 17:41:19 octopi4 xinit[15799]: time="2020-02-01T17:41:19-05:00" level=info msg="connection verified" file=ui.go func="ui.(*UI).update" line=233
Feb  1 17:41:19 octopi4 xinit[15799]: time="2020-02-01T17:41:19-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:19 octopi4 xinit[15799]: time="2020-02-01T17:41:19-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:19 octopi4 xinit[15799]: time="2020-02-01T17:41:19-05:00" level=info msg="about to update temperature" file=print_status.go func="ui.(*printStatusPanel).update" line=178
Feb  1 17:41:19 octopi4 xinit[15799]: time="2020-02-01T17:41:19-05:00" level=info msg="updating temperature" file=print_status.go func="ui.(*printStatusPanel).updateTemperature" line=186
Feb  1 17:41:19 octopi4 xinit[15799]: time="2020-02-01T17:41:19-05:00" level=info msg="temperature updated" file=print_status.go func="ui.(*printStatusPanel).updateTemperature" line=188
Feb  1 17:41:19 octopi4 xinit[15799]: time="2020-02-01T17:41:19-05:00" level=info msg="about to update job" file=print_status.go func="ui.(*printStatusPanel).update" line=180
Feb  1 17:41:19 octopi4 xinit[15799]: time="2020-02-01T17:41:19-05:00" level=info msg="updating job state" file=print_status.go func="ui.(*printStatusPanel).updateJob" line=257
Feb  1 17:41:19 octopi4 xinit[15799]: time="2020-02-01T17:41:19-05:00" level=info msg="job state updated" file=print_status.go func="ui.(*printStatusPanel).updateJob" line=259
Feb  1 17:41:19 octopi4 xinit[15799]: time="2020-02-01T17:41:19-05:00" level=info msg="print status update complete" file=print_status.go func="ui.(*printStatusPanel).update" line=182
Feb  1 17:41:21 octopi4 xinit[15799]: time="2020-02-01T17:41:21-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:21 octopi4 xinit[15799]: time="2020-02-01T17:41:21-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:21 octopi4 xinit[15799]: time="2020-02-01T17:41:21-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:21 octopi4 xinit[15799]: time="2020-02-01T17:41:21-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:23 octopi4 xinit[15799]: time="2020-02-01T17:41:23-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:23 octopi4 xinit[15799]: time="2020-02-01T17:41:23-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:23 octopi4 xinit[15799]: time="2020-02-01T17:41:23-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:23 octopi4 xinit[15799]: time="2020-02-01T17:41:23-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:25 octopi4 xinit[15799]: time="2020-02-01T17:41:25-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:25 octopi4 xinit[15799]: time="2020-02-01T17:41:25-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:25 octopi4 xinit[15799]: time="2020-02-01T17:41:25-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:25 octopi4 xinit[15799]: time="2020-02-01T17:41:25-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:27 octopi4 xinit[15799]: time="2020-02-01T17:41:27-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:27 octopi4 xinit[15799]: time="2020-02-01T17:41:27-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:27 octopi4 xinit[15799]: time="2020-02-01T17:41:27-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:27 octopi4 xinit[15799]: time="2020-02-01T17:41:27-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:29 octopi4 xinit[15799]: time="2020-02-01T17:41:29-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:29 octopi4 xinit[15799]: time="2020-02-01T17:41:29-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:29 octopi4 xinit[15799]: time="2020-02-01T17:41:29-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:29 octopi4 xinit[15799]: time="2020-02-01T17:41:29-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:31 octopi4 xinit[15799]: time="2020-02-01T17:41:31-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:31 octopi4 xinit[15799]: time="2020-02-01T17:41:31-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:31 octopi4 xinit[15799]: time="2020-02-01T17:41:31-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:31 octopi4 xinit[15799]: time="2020-02-01T17:41:31-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:33 octopi4 xinit[15799]: time="2020-02-01T17:41:33-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:33 octopi4 xinit[15799]: time="2020-02-01T17:41:33-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:33 octopi4 xinit[15799]: time="2020-02-01T17:41:33-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:33 octopi4 xinit[15799]: time="2020-02-01T17:41:33-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:35 octopi4 xinit[15799]: time="2020-02-01T17:41:35-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:35 octopi4 xinit[15799]: time="2020-02-01T17:41:35-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:35 octopi4 xinit[15799]: time="2020-02-01T17:41:35-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:35 octopi4 xinit[15799]: time="2020-02-01T17:41:35-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:37 octopi4 xinit[15799]: time="2020-02-01T17:41:37-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:37 octopi4 xinit[15799]: time="2020-02-01T17:41:37-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:37 octopi4 xinit[15799]: time="2020-02-01T17:41:37-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:37 octopi4 xinit[15799]: time="2020-02-01T17:41:37-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:39 octopi4 xinit[15799]: time="2020-02-01T17:41:39-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:39 octopi4 xinit[15799]: time="2020-02-01T17:41:39-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:39 octopi4 xinit[15799]: time="2020-02-01T17:41:39-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:39 octopi4 xinit[15799]: time="2020-02-01T17:41:39-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:41 octopi4 xinit[15799]: time="2020-02-01T17:41:41-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:41 octopi4 xinit[15799]: time="2020-02-01T17:41:41-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:41 octopi4 xinit[15799]: time="2020-02-01T17:41:41-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:41 octopi4 xinit[15799]: time="2020-02-01T17:41:41-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:43 octopi4 xinit[15799]: time="2020-02-01T17:41:43-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:43 octopi4 xinit[15799]: time="2020-02-01T17:41:43-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:43 octopi4 xinit[15799]: time="2020-02-01T17:41:43-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:43 octopi4 xinit[15799]: time="2020-02-01T17:41:43-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:45 octopi4 xinit[15799]: time="2020-02-01T17:41:45-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:45 octopi4 xinit[15799]: time="2020-02-01T17:41:45-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:45 octopi4 xinit[15799]: time="2020-02-01T17:41:45-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:45 octopi4 xinit[15799]: time="2020-02-01T17:41:45-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:47 octopi4 xinit[15799]: time="2020-02-01T17:41:47-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:47 octopi4 xinit[15799]: time="2020-02-01T17:41:47-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:47 octopi4 xinit[15799]: time="2020-02-01T17:41:47-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:47 octopi4 xinit[15799]: time="2020-02-01T17:41:47-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:49 octopi4 xinit[15799]: time="2020-02-01T17:41:49-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:49 octopi4 xinit[15799]: time="2020-02-01T17:41:49-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:49 octopi4 xinit[15799]: time="2020-02-01T17:41:49-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:49 octopi4 xinit[15799]: time="2020-02-01T17:41:49-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:51 octopi4 xinit[15799]: time="2020-02-01T17:41:51-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:51 octopi4 xinit[15799]: time="2020-02-01T17:41:51-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:51 octopi4 xinit[15799]: time="2020-02-01T17:41:51-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:51 octopi4 xinit[15799]: time="2020-02-01T17:41:51-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:53 octopi4 xinit[15799]: time="2020-02-01T17:41:53-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:53 octopi4 xinit[15799]: time="2020-02-01T17:41:53-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:53 octopi4 xinit[15799]: time="2020-02-01T17:41:53-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:53 octopi4 xinit[15799]: time="2020-02-01T17:41:53-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:55 octopi4 xinit[15799]: time="2020-02-01T17:41:55-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:55 octopi4 xinit[15799]: time="2020-02-01T17:41:55-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:55 octopi4 xinit[15799]: time="2020-02-01T17:41:55-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:55 octopi4 xinit[15799]: time="2020-02-01T17:41:55-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:57 octopi4 xinit[15799]: time="2020-02-01T17:41:57-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:57 octopi4 xinit[15799]: time="2020-02-01T17:41:57-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:57 octopi4 xinit[15799]: time="2020-02-01T17:41:57-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:57 octopi4 xinit[15799]: time="2020-02-01T17:41:57-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:59 octopi4 xinit[15799]: time="2020-02-01T17:41:59-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:59 octopi4 xinit[15799]: time="2020-02-01T17:41:59-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:41:59 octopi4 xinit[15799]: time="2020-02-01T17:41:59-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:41:59 octopi4 xinit[15799]: time="2020-02-01T17:41:59-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:01 octopi4 xinit[15799]: time="2020-02-01T17:42:01-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:01 octopi4 xinit[15799]: time="2020-02-01T17:42:01-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:01 octopi4 xinit[15799]: time="2020-02-01T17:42:01-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:01 octopi4 xinit[15799]: time="2020-02-01T17:42:01-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:03 octopi4 xinit[15799]: time="2020-02-01T17:42:03-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:03 octopi4 xinit[15799]: time="2020-02-01T17:42:03-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:03 octopi4 xinit[15799]: time="2020-02-01T17:42:03-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:03 octopi4 xinit[15799]: time="2020-02-01T17:42:03-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:05 octopi4 xinit[15799]: time="2020-02-01T17:42:05-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:05 octopi4 xinit[15799]: time="2020-02-01T17:42:05-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:05 octopi4 xinit[15799]: time="2020-02-01T17:42:05-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:05 octopi4 xinit[15799]: time="2020-02-01T17:42:05-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:07 octopi4 xinit[15799]: time="2020-02-01T17:42:07-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:07 octopi4 xinit[15799]: time="2020-02-01T17:42:07-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:07 octopi4 xinit[15799]: time="2020-02-01T17:42:07-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:07 octopi4 xinit[15799]: time="2020-02-01T17:42:07-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:09 octopi4 xinit[15799]: time="2020-02-01T17:42:09-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:09 octopi4 xinit[15799]: time="2020-02-01T17:42:09-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:09 octopi4 xinit[15799]: time="2020-02-01T17:42:09-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:09 octopi4 xinit[15799]: time="2020-02-01T17:42:09-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:11 octopi4 xinit[15799]: time="2020-02-01T17:42:11-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:11 octopi4 xinit[15799]: time="2020-02-01T17:42:11-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:11 octopi4 xinit[15799]: time="2020-02-01T17:42:11-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:11 octopi4 xinit[15799]: time="2020-02-01T17:42:11-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:13 octopi4 xinit[15799]: time="2020-02-01T17:42:13-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:13 octopi4 xinit[15799]: time="2020-02-01T17:42:13-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:13 octopi4 xinit[15799]: time="2020-02-01T17:42:13-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:13 octopi4 xinit[15799]: time="2020-02-01T17:42:13-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:15 octopi4 xinit[15799]: time="2020-02-01T17:42:15-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:15 octopi4 xinit[15799]: time="2020-02-01T17:42:15-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:15 octopi4 xinit[15799]: time="2020-02-01T17:42:15-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:15 octopi4 xinit[15799]: time="2020-02-01T17:42:15-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:17 octopi4 xinit[15799]: time="2020-02-01T17:42:17-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:17 octopi4 xinit[15799]: time="2020-02-01T17:42:17-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:17 octopi4 xinit[15799]: time="2020-02-01T17:42:17-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:17 octopi4 xinit[15799]: time="2020-02-01T17:42:17-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:19 octopi4 xinit[15799]: time="2020-02-01T17:42:19-05:00" level=info msg="background task execute" file=common.go func="ui.(*BackgroundTask).loop" line=163
Feb  1 17:42:19 octopi4 xinit[15799]: time="2020-02-01T17:42:19-05:00" level=info msg="glib IdleAdd" file=common.go func="ui.(*BackgroundTask).execute" line=173
Feb  1 17:42:19 octopi4 systemd[1]: octoscreen.service: Watchdog timeout (limit 1min)!
synman commented 4 years ago

I believe what we're seeing is remnants of a bug from OctoPrint-TFT whose behavior changed with the introduction of watchdog.

https://github.com/mcuadros/OctoPrint-TFT/issues/34

This started feeling very familiar and I went back to some issues I tracked about a year ago. At the time I moved to an alternate fork of octoprint-tft and the "lockup" issue went away.

I always assumed it was because of changes to dpms / screensaver / screen blank but after doing some comparisons tonight across forks I also see some other changes I'm going to play with.

Namely:

If you follow the TFT issue, another work-around was. removing temperature reporting... but I'm not seeing those requests fail here.

If I were to go back to octoprint-tft and recreate this condition, I'm curious if I would find myself looking at glib IdleAdd(). There is a bit more tracing I can do -- all the way to glib itself but. I haven't gotten around to that yet either.

manbehindthemadness commented 4 years ago

Well as this is actually a really slick little UX app, I gotta say, I am impressed at all the reactionary effort to get it ironed out, I am currently setting up an Octoprint dev environment that (hopefully) run some concurrency diagrams against the thread handling of the plugin in pycharm, I will let you know what it turns up.

synman commented 4 years ago

I pushed to the latest version of gotk3, increased the idle/print refresh rates to 5 seconds, the ui status refresh rate to 10 seconds, and surrounded all task execution calls with if blocks to ensure no re-entrance.

Not 5 minutes into a test print I got a shutdown.

nicwilson58 commented 4 years ago

After using it several days I notice octoscreen crashes and resets when the system is busy, such as when a jon starts and data is being sent to octoprint over usb.

manbehindthemadness commented 4 years ago

Mine crashes when it goes into complex fast movements.

manbehindthemadness commented 4 years ago

Looking at this in pycharm it clear that the octoscreen plugin is just transmitting events as a member the main OctoPrint thread. Im wondering if the crash is being cause by a delay when a transmission is being sent from OctoPrint at the same time OctoScreen is requesting an update.

synman commented 4 years ago

I've ruled that out (as a direct cause). The problem occurs internal to OctoScreen and its handoff through gotk3 to gtk3's main UI thread.

Even though gtk3 is reporting the thread is attached, it never actually fires.

We're seeing the same issue that was originally reported on OctoPrint-TFT. In DarkSid3r's fork he was able to resolve the lockup but I haven't pinpointed exactly what he did differently yet.

Something else that isn't sitting well with me is that all of this network work (OctoPrint API calls) are happening in the main GTK UI thread. This breaks some rules regarding how to feed GTK but would require an additional task within each of the OctoScreen panels to fire the API calls separate from the UI updates.

Perhaps that ^^^ is what we need to settle down GTK and keep the main thread from going sideways.

synman commented 4 years ago

Some additional notes:

I moved logging all the way into the actual golang gotk3 classes and am logging the hang / handoff there. You can't get any closer to the problem.

I also upgraded golang to 1.13.7 and gotk3 to 3.24.13 as well as sourced the latest runtime version of gotk3 from its GitHub master branch.

BUSTER_IMAGE := golang:1.13.7-buster
BUSTER_GO_TAGS := gtk_3_24_13

The next thing for me is. more analysis in DarkSid3r's fork (I may drop him a line) and, if I'm bored, separating the API and UI work into separate tasks. My gut tells me the latter will solve the issue for sure.

manbehindthemadness commented 4 years ago

So I have been running some tests, I managed to (i think) stabilize the situation by altering klipper.service and adding a Nice=-20. I also removed the theme css files from the plugin to regain virtual sd functionality. It seems that klipper and OctoPrint were fighting for CPU cycles and causing connection timeouts. I think what @synman is saying is probably correct and my solution is just a general workaround. We shouldn't have to force the priority as it should be handled by the kernel.

So far I am two hours into a test print without any issues at all.

manbehindthemadness commented 4 years ago

So far so good, we are making a commit tomorrow and see how it works around the replicape community.