openwrt / packages

nut-server: doesn't restart #6997

Closed mantinan closed 6 years ago

mantinan commented 6 years ago

Maintainer: @cshoredaniel

Environment: mips_24kc, TP-Link TL-MR3420, OpenWrt 18.06.1, r7258-5eb055306f

Description:

Sometimes the riello driver won't load. I don't have much data on that problem: the kernel says the cable disconnects during boot, and I don't know why. I believe a similar thing can also happen on other models.

The problem, however, is that when it fails to load you cannot do a restart. I tried /etc/init.d/nut-server restart and it neither restarts nor ends the process; it stays there waiting for something that doesn't happen :-(

The processes left running after a normal start look odd to me. I always see these processes:

    1402 root 976 S /usr/sbin/upsd
    1417 root 1248 S /lib/nut/riello_usb -a riello
    1421 root 1316 S /bin/sh /etc/rc.common /etc/init.d/nut-server running
    1426 root 1200 S flock 1000

On previous versions we didn't have a shell running or a flock around, so memory use was lower, and I could restart nut in case something had gone wrong with it.

Regards.

danielfdickinson commented 6 years ago

This is fixed in master; once master is sufficiently tested hopefully my backport PR will be accepted.

mantinan commented 6 years ago

Ok, I'll try to see if I can backport it and see if it works.

On the other hand, there is the failure of the riello driver to start, which I've seen several times. I also have another router connected to a couple of UPSes that use the apcsmart driver, and it also seems to have failed at least once. Are these kinds of failures also fixed on master?

If they are not fixed I'll have to investigate it.

I'd also like to suggest that the apcsmart driver be built for official OpenWrt; building it by hand is quite a lot of work, and the driver works well. Is there any reason for not building it? If not, where should I ask for this?

Thanks in advance.

danielfdickinson commented 6 years ago

@mantinan For the riello, it might be sort of fixed in master (that is, when it 'shows up' again in the kernel it should hotplug).

Is the apcsmart a serial driver? I don't have any serial UPSes to test, and most routers aren't really prepared to deal with RS232 inputs, so the serial drivers aren't built by default. Perhaps do a build with and without them and show the size difference? If it's not too much in the common packages then I'm all for building by default.

danielfdickinson commented 6 years ago

@mantinan FYI the documentation for what's in and what's being worked on, is here: https://openwrt.org/docs/guide-user/services/ups/software.nut#nut_server

mantinan commented 6 years ago

The apcsmart is a serial driver. I don't have a router with a serial interface either, but that is easily solved with a cheap USB-to-serial cable. I currently have two of these cables attached to a USB hub, which is plugged into a TP-Link router with a single USB port, to monitor two APC UPSes using the apcsmart driver without problems.

I really believe that the serial drivers should be built, as any device with a usb port can easily manage a serial UPS.

I have the numbers for my build of the apcsmart driver; I don't think I changed many options when building it.

    988 ago 27 19:05 nut_2.7.4-7_mips_24kc.ipk
    16875 ago 27 19:05 nut-common_2.7.4-7_mips_24kc.ipk
    31334 ago 27 19:05 nut-driver-apcsmart_2.7.4-7_mips_24kc.ipk

That's for current stable.
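
For the size comparison requested earlier in the thread, the byte counts in a listing like this one can be totalled with a small helper. This is only a sketch: `sum_sizes` is a hypothetical name, and it assumes the size in bytes is the first whitespace-separated field of each line, as in the listing pasted here.

```shell
#!/bin/sh
# Hypothetical helper: sum the first field (size in bytes) of each input line.
sum_sizes() {
    awk '{ total += $1 } END { print total + 0 }'
}

# Example with the three sizes reported in this comment.
sum_sizes <<'EOF'
988 nut_2.7.4-7_mips_24kc.ipk
16875 nut-common_2.7.4-7_mips_24kc.ipk
31334 nut-driver-apcsmart_2.7.4-7_mips_24kc.ipk
EOF
# prints 49197
```

Running the same helper over the packages from a build with and without the serial drivers would give the two totals to compare.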

danielfdickinson commented 6 years ago

I think I'll enable serial by default after double checking no nasty size surprises on e.g. nut-server (not expecting any). I'll need some help making sure things work with the master initscripts and such though as I don't have any such hardware.

mantinan commented 6 years ago

Ok, it is working great on our setup right now, and I have no problem testing whatever you need for that build.

dibdot commented 6 years ago

Problem seems to be fixed. Closing this for now.

danielfdickinson commented 6 years ago

Are you using master (with working restart) or 18.06 (which doesn't have it yet)? I'd like to be sure master hasn't broken serial.

mantinan commented 6 years ago

I'm using 18.06 right now, but I have compiled master and I plan to test it next week. I'll report back then.

mantinan commented 5 years ago

I have installed nut from master on top of the 18.06 installation but I've found the same errors you were discussing with malakingpusa and I cannot find how you solved them.

When I boot I only see:

    1494 root 2868 R /usr/sbin/upsd -D -u nut

If I then do a /etc/init.d/nut-server stop and a /etc/init.d/nut-server start I get:

    mkdir: can't create directory '': No such file or directory
    chown: : No such file or directory

and I am back in the previous situation, with only upsd running.

What can I do to fix this? Do I need any other component from master?

Regards.

mantinan commented 5 years ago

I solved the error by setting option statepath /var/run/nut in the config upsd section, but only upsd runs; there is no sign of the driver, which in this case was apcsmart.
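
The mkdir and chown errors quoted in the previous comment are consistent with an empty path reaching those commands, which would explain why setting statepath makes them go away. A minimal sketch of a defensive default follows. `resolve_statepath` is a hypothetical helper, not code from the nut-server init script, and the /var/run/nut default is simply the value used in this workaround.

```shell
#!/bin/sh
# Hypothetical helper: fall back to a default when the configured path is empty.
resolve_statepath() {
    # $1 is the configured statepath, possibly empty or unset.
    printf '%s\n' "${1:-/var/run/nut}"
}

# With nothing configured, the default is used instead of an empty string.
statepath=$(resolve_statepath "")
echo "$statepath"
```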

I'm wondering why I had to set statepath in the first place :-?

And then, is the problem with the driver specific to my driver (because it is serial), or will it happen with any other driver?

I'm still at the same place as my previous message.

danielfdickinson commented 5 years ago

Apologies everybody, I've gotten drawn into local politics and it's taking up a lot of time. The election is on October 22, so if I haven't gotten to this by then (although hopefully I can make time before then) I should have time after that.

mantinan commented 5 years ago

I was testing with a stable installation and master nut packages but as I was not sure if I had to pull any other package from master, I'm now testing with a newly installed machine running master.

When running master (OpenWrt SNAPSHOT, r8165-3fa7e62) I get exactly the same results as explained before, this time I'm using driver riello_usb.

I finally got this to run with a couple of tricks: the one I mentioned of setting statepath, and the other was fixing the configs. As Ivanich explains on PR #7096, the riello_usb driver doesn't recognize a series of variables, so I was getting...

    Fatal error: 'offdelay' is not a valid variable name for this driver.
    Fatal error: 'ondelay' is not a valid variable name for this driver.
    Fatal error: 'pollfreq' is not a valid variable name for this driver.
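
One way to avoid these fatal errors would be to skip options that a given driver rejects before they reach its configuration. The sketch below hard-codes only what is reported in this thread (offdelay, ondelay and pollfreq being rejected by riello_usb, and by apcsmart according to a later comment). `driver_accepts` is a hypothetical helper, and the list is not taken from the NUT documentation.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the option may be passed to the driver.
# The rejected options below are only those reported in this thread.
driver_accepts() {
    # $1 = driver name, $2 = option name
    case "$1:$2" in
        riello_usb:offdelay|riello_usb:ondelay|riello_usb:pollfreq) return 1 ;;
        apcsmart:offdelay|apcsmart:ondelay|apcsmart:pollfreq) return 1 ;;
    esac
    return 0
}

# Show which of a few options would be kept for riello_usb.
for opt in offdelay ondelay pollfreq port; do
    if driver_accepts riello_usb "$opt"; then
        echo "keep $opt"
    else
        echo "skip $opt"
    fi
done
```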

After commenting out all of those in the init.d script, riello_usb was still not running. This time it was because of the permissions on the /dev/bus/usb devices; allowing nut to read and write them finally fixed the problem, and it now runs OK.

I'll try to get this running into stable as well using only the nut packages from master and report back.

danielfdickinson commented 5 years ago

Thank you for the feedback. I've gotten drawn into local (municipal) politics and have been quite busy with 'life' as well, so although I will make an effort to get to this before the October 22 election I may not have a lot of time before then.

mantinan commented 5 years ago

Hi again!

I tried the new packages on stable with the same results as on master: I needed option statepath /var/run/nut to avoid the mkdir problems, and I hit the fatal errors with the 3 variables (which are not valid for riello_usb or apcsmart either) and the permissions problems.

For riello_usb the permissions are needed on /dev/bus/usb/..., while apcsmart (which is a serial driver) needs permissions on the serial port; in my case this was a USB-to-serial adapter, thus /dev/ttyUSBX.
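
Whether a given device node is usable can be checked directly from the shell. A minimal sketch follows, with `can_rw` and `report_access` as hypothetical helpers. It tests access for whichever user runs it, so it would have to be run as the nut user to say anything about the driver.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the current user can read and write $1.
can_rw() {
    [ -r "$1" ] && [ -w "$1" ]
}

# Hypothetical helper: print one status line for each path given as an argument.
report_access() {
    for node in "$@"; do
        if can_rw "$node"; then
            echo "ok: $node"
        else
            echo "no access: $node"
        fi
    done
}
```

For the serial case described here, the call would be something like `report_access /dev/ttyUSB0`, with the device name adjusted to the adapter in use.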

After fixing those problems everything seems to work ok.

Regards.

danielfdickinson commented 5 years ago

@mantinan FYI you can create /etc/hotplug.d/tty and add a hotplug script under it to set permissions on the serial port.

mantinan commented 5 years ago

Hi!

I have now tested it with an APC UPS running serial apcsmart with this /etc/hotplug.d/tty/99-USB script for the permissions problem and everything runs ok:

    #!/bin/sh

    [ -n "$DEVNAME" -a -z "${DEVNAME%ttyUSB*}" -a "$ACTION" = "add" ] && chown nut "/dev/$DEVNAME"
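
The `${DEVNAME%ttyUSB*}` expansion strips the shortest suffix matching `ttyUSB*`, so for ordinary device names the `-z` test is true only when the name begins with ttyUSB. The same check can be written with `case`, which some may find easier to read. A sketch, with `is_ttyusb` as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the device name begins with "ttyUSB".
is_ttyusb() {
    case "$1" in
        ttyUSB*) return 0 ;;
        *) return 1 ;;
    esac
}

# Try a USB serial name, an on-board serial name, and an empty name.
for name in ttyUSB0 ttyS0 ""; do
    if is_ttyusb "$name"; then
        echo "match: $name"
    else
        echo "no match: $name"
    fi
done
```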

The only problem that remains is the user problem: upsd running as root. I already explained it on the wrong bug (sorry about that), so I'm pasting it again here. This was with my config for riello usb:

I have downloaded master from cshoredaniel:pr-nut-more-fixes and compiled it.

I'm finding that even though I have set it to run as nut, upsd runs as root. It used to run as user nut on the -8 version, so -9 has a regression here.

On -8, ps would show:

    1172 nut 2900 S /usr/sbin/upsd -D -u nut

On -9 I get:

    1258 root 2896 S /usr/sbin/upsd -D

This is my nut_server config:

    config driver 'riello'
        option driver riello_usb
        option port /dev/ttyUSB0

    config listen_address
        option address 172.22.54.49
        option port 3493

    config upsd 'upsd'
        option maxage 15
        option statepath /var/run/nut
        option maxconn 1024
        option runas nut

Regards.

danielfdickinson commented 5 years ago

On 2018-10-19 7:56 a.m., mantinan wrote:

> Hi!
>
> I have now tested it with an APC UPS running serial apcsmart with this /etc/hotplug.d/tty/99-USB script for the permissions problem and everything runs ok:
>
> #!/bin/sh
>
> [ -n "$DEVNAME" -a -z "${DEVNAME%ttyUSB*}" -a "$ACTION" = "add" ] && chown nut "/dev/$DEVNAME"

Unfortunately, forcing the 'nut' user onto ttyUSB is probably a non-starter, as NUT is not the only consumer of serial USB. I am thinking, though, that it might be worthwhile to have a 'nut_server' option that such a script reads to determine whether to run or not (so it only comes into play for NUT users).

> The only problem that remains is the user problem (upsd running as root as I explained already on the wrong bug, sorry about that, and that I'm pasting again here), that was with my config for riello usb:
>
> I have downloaded master from cshoredaniel:pr-nut-more-fixes and compiled it.
>
> I'm finding that even though I have set it to run as nut I get upsd running as root, this used to be user nut on -8 version, so -9 has a regression here.

Ah, I'll have to document that better: runas is now per-driver, not per NUT server (so if you set runas under config driver 'riello' it should work).

mantinan commented 5 years ago

Hi again!

I'd like to point something out: I was saying that upsd runs as root, even though my config says that upsd should run as nut. This is what I think is wrong.

On the other hand, as I hadn't specified a user for the riello driver, it is started as nut by default, which looks OK.

Regards.

danielfdickinson commented 5 years ago

Thanks, that should be fixed by the recently merged PR.

mantinan commented 5 years ago

Hi!

I have upgraded to the latest snapshots available and upsd still runs as root. This happens both if you don't define a config upsd section and if you define it specifying runas nut.

So this doesn't seem to be fixed yet.

If you want me to test anything, just tell me.

Regards.

danielfdickinson commented 5 years ago

On 2018-11-27 7:15 a.m., mantinan wrote:

> Hi!
>
> I have upgraded to the latest snapshots available and upsd still runs as root. This happens both if you don't define a config upsd section and if you define it specifying a runas nut.
>
> So this doesn't seem to be fixed yet.
>
> If you want me to test anything, just tell me.

Thank you for your report. I forgot to push my changes for this. I'm rebasing and testing again and will push again.

Regards,

Daniel

danielfdickinson commented 5 years ago

@mantinan Ok, so the problem wasn't what I thought, but I now have a solution and will be pushing (hopefully in the AM).

danielfdickinson commented 5 years ago

@mantinan FYI I've posted PR: https://github.com/openwrt/packages/pull/7638 which ought to do most of what's now in master except the USB hotplug bits (as that's a new feature IMO). If you can, please test and comment!

mantinan commented 5 years ago

I have commented on PR #7638 that, at least for the riello, this is not working as it should. Sorry about the bold fonts; those were root prompts O:-)

danielfdickinson commented 5 years ago

@mantinan quick question: is master not working as well (it does for me and another user, but that's not necessarily enough of an indicator) or did I miss something in the backport process?

danielfdickinson commented 5 years ago

@mantinan For reference, I've essentially reverted the runas stuff for 18.06 because it was a new feature (for OpenWrt) that never properly worked in the 18.06 series and is non-essential. It should, however, be working in master, including with serial USB (if the appropriate option is enabled in /etc/config/nut_server).

mantinan commented 5 years ago

Hi! I have the 18.06 stable devices working as nut without problems with a -8 version that I had compiled previously, so that's ok with me.

As for master... I've got a device installed with the current snapshot, but I've had a lot of trouble:

1- I could not install a thing, as the standard wget was trying to go out through IPv6, which I don't have, instead of the proxy I had specified in my environment. So I had to install wget and its dependencies by hand, and then I could use opkg to install nut.

2- I cannot test the riello driver, which is for the UPS I have here, as the USB of the Riello doesn't work properly with the TP-Link Archer C7.

3- Testing the apcsmart driver, I had problems getting it to set the proper permissions on ttyUSB0, so the riello driver wouldn't start. This is basically because I don't know how to configure enable_usb_serial. This is my current config; if you tell me what's wrong I'll try again:

    config driver 'apc1rack'
        option driver apcsmart
        option port /dev/ttyUSB0
        option enable_usb_serial 1

    config upsd 'upsd'
        option maxage 15
        option statepath /var/run/nut
        option maxconn 1024
        option runas nut
        option enable_usb_serial 1

Adding a commented enable_usb_serial example to the default config file would be great.

Regards.

danielfdickinson commented 5 years ago

FYI I'm working on this; I discovered a silly mistake right off the top (a missing sourcing of /lib/functions.sh, which means config_load was not actually available, hence no effect from enable_usb_serial).
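
A script can guard against this kind of mistake by checking that a function exists before relying on it. The sketch below is illustrative only and is not the fix that was pushed. `have_func` and `ensure_config_load` are hypothetical helpers, while /lib/functions.sh and config_load are the names mentioned in this comment.

```shell
#!/bin/sh
# Hypothetical helper: succeed if $1 names a command or function the shell can run.
have_func() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: source the OpenWrt helper library only when config_load
# is missing, and report whether config_load is available afterwards.
ensure_config_load() {
    have_func config_load && return 0
    [ -r /lib/functions.sh ] || return 1
    . /lib/functions.sh
    have_func config_load
}
```

A hotplug script could then begin with `ensure_config_load || exit 0`, so that it does nothing rather than silently ignoring the configuration.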

danielfdickinson commented 5 years ago

@mantinan Finally got https://github.com/openwrt/packages/pull/8017 pushed! I hope you can check it out. I think it should be the final iteration needed.

mantinan commented 5 years ago

I've upgraded today to the -15 packages, as well as the underlying libusb and related packages, and I still don't get ttyUSB0 with the right permissions.

Like I said in the previous post, I don't know if that config file is OK or where to put the enable_usb_serial option; that's why, as you can see in that post, I have it twice.

I've just upgraded the packages related to nut and usb without changing the config file or anything else. If you think I should upgrade anything else or change something in that config file, please let me know. Right now it just launches upsd as nut, but the apc driver fails to start.

Regards.

danielfdickinson commented 5 years ago

@mantinan I put a commented usage example in the config file, but that only affects new installs. If you look at:

https://github.com/openwrt/packages/pull/8017/files#diff-b18be452e43606e5f44e51348792e9f5

option enable_usb_serial should be defined in the config driver section. Also, in the driver section 'port' must match the device (e.g. option port /dev/ttyUSB0); if you want 'auto' I'll have to add some logic for that.

HTH (it's difficult because I don't have relevant hardware to test with).
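
Putting these two requirements together, the decision a tty hotplug script has to make can be sketched as below. This is an illustration under stated assumptions and not the script from the PR: `should_chown_tty` is a hypothetical helper, and its arguments stand in for values that a real script would read from the hotplug environment and from the driver section of /etc/config/nut_server.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the tty should be handed to the nut user.
# $1 = hotplug ACTION, $2 = hotplug DEVNAME,
# $3 = the driver's enable_usb_serial value, $4 = the driver's port option
should_chown_tty() {
    [ "$1" = "add" ] || return 1
    [ "$3" = "1" ] || return 1
    # The configured port must name exactly this device.
    [ "$4" = "/dev/$2" ]
}

# Example: the option is enabled and the port matches the device being added.
if should_chown_tty add ttyUSB0 1 /dev/ttyUSB0; then
    echo "would chown /dev/ttyUSB0"
fi
```

Under this sketch, an option placed only in the config upsd section has no effect, because the check is made against each driver's own values.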