wouterdebie / locast2tuner

Locast to Emby/Plex/Channels server
https://wouterdebie.github.io/locast2tuner/
MIT License
52 stars 8 forks source link

Add some retry behavior at startup.. #32

Closed kevtown closed 3 years ago

kevtown commented 3 years ago

Service crashes when starting on boot. Apparently attempting to connect before the network is ready. Service starts successfully when started manually. Adding ExecStartPre=/bin/sleep 10 to locast2tuner.service seems to function as a workaround

● locast2tuner.service - locast2tuner Loaded: loaded (/lib/systemd/system/locast2tuner.service; enabled; vendor preset: enabled) Active: failed (Result: signal) since Mon 2021-04-12 10:15:05 EDT; 8min ago Process: 372 ExecStart=/usr/bin/locast2tuner -c /etc/locast2tuner/config.ini --cache_dir /var/lib/locast2tuner (code=killed, signal=ABRT) Main PID: 372 (code=killed, signal=ABRT)

Apr 12 10:15:00 raspberrypi systemd[1]: Started locast2tuner. Apr 12 10:15:00 raspberrypi locast2tuner[372]: config selected = /etc/locast2tuner/config.ini Apr 12 10:15:00 raspberrypi locast2tuner[372]: [INFO] - locast2tuner 0.1.10 on Linux 5.10.17-v7l+ starting.. Apr 12 10:15:00 raspberrypi locast2tuner[372]: [INFO] - UUID: Apr 12 10:15:00 raspberrypi locast2tuner[372]: [INFO] - Logging in with Apr 12 10:15:05 raspberrypi locast2tuner[372]: thread 'main' panicked at 'called Result::unwrap() on an Err value: reqwest::Error { kind: Request, url: Url { scheme: "https", username: "", password: None, host: Some(Domain("api.locastnet.org")), port: None, path: "/api/user/login", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("dns error", Custom { kind: Other, error: "failed to lookup address information: Try again" })) }', src/credentials.rs:73:10 Apr 12 10:15:05 raspberrypi locast2tuner[372]: note: run with RUST_BACKTRACE=1 environment variable to display a backtrace Apr 12 10:15:05 raspberrypi systemd[1]: locast2tuner.service: Main process exited, code=killed, status=6/ABRT Apr 12 10:15:05 raspberrypi systemd[1]: locast2tuner.service: Failed with result 'signal'.

eiddor commented 3 years ago

@kevtown Can you try binding to 0.0.0.0 if you're not already? Not necessarily a workaround, but I'm curious if it fixes it.

kevtown commented 3 years ago

added bind_address = "0.0.0.0" to /etc/locast2tuner/config.ini

removed ExecStartPre=/bin/sleep 10 from locast2tuner.service

no change that I can see.

● locast2tuner.service - locast2tuner Loaded: loaded (/lib/systemd/system/locast2tuner.service; enabled; vendor preset: enabled) Active: failed (Result: signal) since Mon 2021-04-12 12:08:19 EDT; 14min ago Process: 370 ExecStart=/usr/bin/locast2tuner -c /etc/locast2tuner/config.ini --cache_dir /var/lib/locast2tuner (code=killed, signal=ABRT) Main PID: 370 (code=killed, signal=ABRT)

Apr 12 12:08:14 raspberrypi systemd[1]: Started locast2tuner. Apr 12 12:08:14 raspberrypi locast2tuner[370]: config selected = /etc/locast2tuner/config.ini Apr 12 12:08:14 raspberrypi locast2tuner[370]: [INFO] - locast2tuner 0.1.10 on Linux 5.10.17-v7l+ starting.. Apr 12 12:08:14 raspberrypi locast2tuner[370]: [INFO] - UUID: Apr 12 12:08:14 raspberrypi locast2tuner[370]: [INFO] - Logging in with Apr 12 12:08:19 raspberrypi locast2tuner[370]: thread 'main' panicked at 'called Result::unwrap() on an Err value: reqwest::Error { kind: Request, url: Url { scheme: "https", username: "", password: None, host: Some(Domain("api.locastnet.org")), port: None, path: "/api/user/login", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("dns error", Custom { kind: Other, error: "failed to lookup address information: Try again" })) }', src/credentials.rs:73:10 Apr 12 12:08:19 raspberrypi locast2tuner[370]: note: run with RUST_BACKTRACE=1 environment variable to display a backtrace Apr 12 12:08:19 raspberrypi systemd[1]: locast2tuner.service: Main process exited, code=killed, status=6/ABRT Apr 12 12:08:19 raspberrypi systemd[1]: locast2tuner.service: Failed with result 'signal'.

eiddor commented 3 years ago

Oh, I see - Ok, so it's not a binding issue. locast2tuner really really wants to contact Locast right away and it can't even resolve it because the network is down and there's no retry. Would probably be solved by a little #15 :-)

I guess a good test internally would be to sleep/retry a few times if DNS won't resolve.

kevtown commented 3 years ago

Sounds like a great idea. Thanks for your work on this.

wouterdebie commented 3 years ago

Good point.. I could add some retries here and there. But then again, why doesn't your DNS resolve?

eiddor commented 3 years ago

It's not necessarily indicative of a problem. The RPi can be slow to bring up networking, especially if you have spanning-tree enabled on the switch port.

I don't think I've done restarts when I've done testing here, so I will probably see the same thing at some point.

(Great work, btw - this has been a fun project and I am learning a ton of stuff that I never get to work with.).

wouterdebie commented 3 years ago

Oh I see! In that case we could actually rely on systemd. @eiddor or @kevtown, since it seems you can replicate this behavior, could you try adding the following to locast2tuner.service?

After=network-online.target
Wants=network-online.target

(found this at https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/ and there's some great explanation)

eiddor commented 3 years ago

Oh I see! In that case we could actually rely on systemd. @eiddor or @kevtown, since it seems you can replicate this behavior, could you try adding the following to locast2tuner.service?

Will do - currently fighting with my NAS, so everything is down, but I'll test it sometime tomorrow.

Note: We may still need some retry logic for the Docker container, since it's not started by systemd, and Docker may bring up networking before the host network is up. I'll test that out here, too.

eiddor commented 3 years ago

(Upon further consideration, depending on your restart setting, Docker itself may restart the container if it fails.)

wouterdebie commented 3 years ago

Yeah, but @kevtown wasn't talking about Docker. Our docker image doesn't use systemd

eiddor commented 3 years ago

I know - What I'm saying is that having a network delay may also cause the Docker container to fail and we can't fix that one with systemd commands. I'll have to check that out tomorrow.

wouterdebie commented 3 years ago

Ah gotcha!

wouterdebie commented 3 years ago

I guess that depends on how you run Docker. Don't you generally want docker to run when network is ready?

eiddor commented 3 years ago

I should have checked before saying anything :-)

Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket containerd.service

We'll blame the NAS issues.

wouterdebie commented 3 years ago

Hahah! And Docker also uses the After and Wants that I mentioned earlier. Seems like the way to go :)

eiddor commented 3 years ago

Yeah, great find!

wouterdebie commented 3 years ago

@kevtown I thought you added a comment here, but did you remove it? What happened? @eiddor are you still going to test out the change? It seems we would need this in the [Unit] section, right?

After=network-online.target
Wants=network-online.target
eiddor commented 3 years ago

@kevtown I thought you added a comment here, but did you remove it? What happened?

I'm glad it wasn't just me!

Yeah, those lines it would go under [Unit] in /etc/systemd/system/multi-user.target.wants/locast2tuner.service

I just RMA'd my NAS, but I will cobble something together to test with where I have some ability to delay network start. Should be able to get that done tonight.

eiddor commented 3 years ago

So, that workaround works fine on Ubuntu running in a VM, but not on a Raspberry Pi running Raspbian.

Apr 14 04:37:28 raspberrypi systemd[1]: Started locast2tuner.
Apr 14 04:37:28 raspberrypi locast2tuner[372]: config selected = /etc/locast2tuner/config
Apr 14 04:37:28 raspberrypi locast2tuner[372]: [INFO] - locast2tuner 0.1.15 on Linux 5.10.17-v7l+ starting..
Apr 14 04:37:28 raspberrypi locast2tuner[372]: [INFO] - UUID: <redacted>
Apr 14 04:37:28 raspberrypi locast2tuner[372]: [INFO] - Logging in with me@address.com
Apr 14 04:37:33 raspberrypi locast2tuner[372]: thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: reqwest::Error { kind: Request, url: Url { scheme: "https", username: "", password: None, host: Some(Domain("api.locastnet.org")), port: None, path: "/api/user/login", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("dns error", Custom { kind: Other, error: "failed to lookup address information: Try again" })) }', src/credentials.rs:73:10
Apr 14 04:37:33 raspberrypi locast2tuner[372]: note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Apr 14 04:37:33 raspberrypi systemd[1]: locast2tuner.service: Main process exited, code=killed, status=6/ABRT
Apr 14 04:37:33 raspberrypi systemd[1]: locast2tuner.service: Failed with result 'signal'.

I can start it manually with systemctl start locast2tuner after I can get into it.

Every bit of information that I can find about this with RPi includes a script that does an Internet connectivity test before launching the service. Now that we've got users running this as a service on startup, we probably will need some kind of retry or test during startup.

Also note that there's a difference (even if it's small) between the network being reachable and the Internet being reachable.

kevtown commented 3 years ago

@kevtown I thought you added a comment here, but did you remove it? What happened? @eiddor are you still going to test out the change? It seems we would need this in the [Unit] section, right?

After=network-online.target
Wants=network-online.target

I'm sorry. I did try to send you further test output before heading out to work (which keeps me AFK) but I accidentally left private info in my output so I had to delete my comment here it is with the private info stripped.

locast2tuner.service [Unit] Description=locast2tuner After=online.target [Service] User=locast2tuner Group=locast2tuner

ExecStartPre=/bin/sleep 10

ExecStart=/usr/bin/locast2tuner -c /etc/locast2tuner/config.ini --cache_dir /var/lib/locast2tuner

[Install] WantedBy=multi-user.target online.target

service status output on reboot ● locast2tuner.service - locast2tuner Loaded: loaded (/lib/systemd/system/locast2tuner.service; enabled; vendor preset: enabled) Active: failed (Result: signal) since Tue 2021-04-13 05:23:54 EDT; 3min 15s ago Process: 372 ExecStart=/usr/bin/locast2tuner -c /etc/locast2tuner/config.ini --cache_dir /var/lib/locast2tuner (code=killed, signal=ABRT) Main PID: 372 (code=killed, signal=ABRT)

Apr 13 05:23:48 raspberrypi systemd[1]: Started locast2tuner. Apr 13 05:23:48 raspberrypi locast2tuner[372]: config selected = /etc/locast2tuner/config.ini Apr 13 05:23:48 raspberrypi locast2tuner[372]: [INFO] - locast2tuner 0.1.10 on Linux 5.10.17-v7l+ starting.. Apr 13 05:23:48 raspberrypi locast2tuner[372]: [INFO] - UUID: Apr 13 05:23:48 raspberrypi locast2tuner[372]: [INFO] - Logging in with Apr 13 05:23:54 raspberrypi locast2tuner[372]: thread 'main' panicked at 'called Result::unwrap() on an Err value: reqwest::Error { kind: Request, url: Url { scheme: "https", username: "", password: None, host: Some(Domain("api.locastnet.org")), port: None, path: "/api/user/login", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("dns error", Custom { kind: Other, error: "failed to lookup address information: Try again" })) }', src/credentials.rs:73:10 Apr 13 05:23:54 raspberrypi locast2tuner[372]: note: run with RUST_BACKTRACE=1 environment variable to display a backtrace Apr 13 05:23:54 raspberrypi systemd[1]: locast2tuner.service: Main process exited, code=killed, status=6/ABRT Apr 13 05:23:54 raspberrypi systemd[1]: locast2tuner.service: Failed with result 'signal'.

Looks same as before

eiddor commented 3 years ago

@kevtown Try adding both commands under [Unit]

After=network-online.target
Wants=network-online.target

As I mentioned above, I got this to work with Ubuntu, but not on RPi.

kevtown commented 3 years ago

Same result

[Unit] Description=locast2tuner After=network-online.target Wants=network-online.target [Service] User=locast2tuner Group=locast2tuner

ExecStartPre=/bin/sleep 10

ExecStart=/usr/bin/locast2tuner -c /etc/locast2tuner/config.ini --cache_dir /var/lib/locast2tuner

[Install] WantedBy=multi-user.target

● locast2tuner.service - locast2tuner Loaded: loaded (/lib/systemd/system/locast2tuner.service; enabled; vendor preset: enabled) Active: failed (Result: signal) since Wed 2021-04-14 11:34:53 EDT; 52s ago Process: 477 ExecStart=/usr/bin/locast2tuner -c /etc/locast2tuner/config.ini --cache_dir /var/lib/locast2tuner (code=killed, signal=ABRT) Main PID: 477 (code=killed, signal=ABRT)

Apr 14 11:34:48 raspberrypi systemd[1]: Started locast2tuner. Apr 14 11:34:48 raspberrypi locast2tuner[477]: config selected = /etc/locast2tuner/config.ini Apr 14 11:34:48 raspberrypi locast2tuner[477]: [INFO] - locast2tuner 0.1.10 on Linux 5.10.17-v7l+ starting.. Apr 14 11:34:48 raspberrypi locast2tuner[477]: [INFO] - UUID: Apr 14 11:34:48 raspberrypi locast2tuner[477]: [INFO] - Logging in with Apr 14 11:34:53 raspberrypi locast2tuner[477]: thread 'main' panicked at 'called Result::unwrap() on an Err value: reqwest::Error { kind: Request, url: Url { scheme: "https", username: "", password: None, host: Some(Domain("api.locastnet.org")), port: None, path: "/api/user/login", query: None, fragment: None }, source: hyper::Error(Connect, ConnectError("dns error", Custom { kind: Other, error: "failed to lookup address information: Try again" })) }', src/credentials.rs:73:10 Apr 14 11:34:53 raspberrypi locast2tuner[477]: note: run with RUST_BACKTRACE=1 environment variable to display a backtrace Apr 14 11:34:53 raspberrypi systemd[1]: locast2tuner.service: Main process exited, code=killed, status=6/ABRT Apr 14 11:34:53 raspberrypi systemd[1]: locast2tuner.service: Failed with result 'signal'.

eiddor commented 3 years ago

Yeah, that's what I'm seeing on my RPi, as well. There are lots of workarounds for this until we can get some waits or retries added to the code, but for now just try to uncomment:

ExecStartPre=/bin/sleep 10

That will simply delay the service 10 seconds before starting. I just tested it on my RPi.

kevtown commented 3 years ago

Will do, thanks, FYI also experimented with After=systemd-resolved.service Wants=systemd-resolved.service

Same result

eiddor commented 3 years ago

Yeah - The RPi thinks that it is up, but apparently it's still not able to actually connect to the Internet at that point, so we're failing there. Interesting.

wouterdebie commented 3 years ago

@kevtown 0.1.17 adds retry behavior, so this should mitigate the issue on the RPi. I'll soon push 0.1.18 that will include the systemd service file changes that are discussed in this issue as well.

eiddor commented 3 years ago

Tested and the combination of the After/Want and the retries works well on RPi!