Open remyroy opened 3 years ago
NSSM source code can be found on https://git.nssm.cc/nssm/nssm if that can help.
I've experimented with NSSM and I think it's a file permissions issue. I couldn't recreate the exact hang that you got, but I did get this error when I tried starting after importing the key not as the administrator:
$ nssm start lighthousevalidator
lighthousevalidator: Unexpected status SERVICE_STOPPED in response to START control.
I think I also got a similar error importing the key as admin before starting the service (will have to recheck this on Monday). The flow that definitely worked was:
Let me know if that works for you.
The only suspect thing I found in our code was that we call Path::exists, which masks permissions errors. I'll switch it to using Path::metadata so that the permissions error surfaces in open_or_create.
For reference: https://doc.rust-lang.org/std/path/struct.Path.html#method.metadata
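To illustrate the difference, here is a minimal sketch (not the actual Lighthouse change). Path::exists returns false whenever the metadata cannot be read, including on a permission error, while Path::metadata returns the underlying io::Error. The names Probe and probe are hypothetical and only serve the example.

```rust
use std::io;
use std::path::Path;

/// Outcome of probing a file such as the slashing protection database.
/// Hypothetical type, used only for this illustration.
#[derive(Debug, PartialEq)]
pub enum Probe {
    Exists,
    Missing,
}

/// Hypothetical helper: unlike `Path::exists`, which folds every error into
/// `false`, this only treats `NotFound` as "missing" and returns any other
/// I/O error (for example `PermissionDenied`) to the caller.
pub fn probe(path: &Path) -> io::Result<Probe> {
    match path.metadata() {
        Ok(_) => Ok(Probe::Exists),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(Probe::Missing),
        Err(e) => Err(e),
    }
}
```

A caller can then create the file only on Probe::Missing and report the error otherwise, instead of trying to create a file that already exists but is unreadable.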
It really seems like a file permission issue. For some reason the permissions on C:\ethereum\var\lib\lighthouse\validator\validators\slashing_protection.sqlite were not what I expected:
icacls.exe C:\ethereum\var\lib\lighthouse\validator\validators\slashing_protection.sqlite
C:\ethereum\var\lib\lighthouse\validator\validators\slashing_protection.sqlite OWNER RIGHTS:(R,W,D,WDAC,WO)
By adding the SYSTEM account (the account under which services normally run) with full control on the slashing_protection.sqlite file, I got the NSSM service to start correctly.
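For anyone who wants to script that workaround, here is a rough sketch that shells out to icacls.exe with its /grant option and the F (full access) right. The helper names grant_full_control_args and grant_full_control are hypothetical, and this is only one possible way to apply the permission change described above, not something Lighthouse does itself.

```rust
use std::io;
use std::path::Path;
use std::process::{Command, ExitStatus};

/// Build the icacls arguments that grant `account` full control (`F`)
/// on `file`. Hypothetical helper for illustration.
pub fn grant_full_control_args(file: &Path, account: &str) -> Vec<String> {
    vec![
        file.display().to_string(),
        "/grant".to_string(),
        format!("{}:F", account),
    ]
}

/// Run icacls.exe with those arguments. This only works on Windows and
/// needs to be run by an account allowed to change the file's ACL.
pub fn grant_full_control(file: &Path, account: &str) -> io::Result<ExitStatus> {
    Command::new("icacls.exe")
        .args(grant_full_control_args(file, account))
        .status()
}
```

Calling grant_full_control with the slashing_protection.sqlite path and "SYSTEM" corresponds to running icacls.exe on that file with /grant SYSTEM:F from an elevated prompt.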
I just tested importing my validator keystore again with lighthouse.exe account_manager validator import --network prater --datadir C:\ethereum\var\lib\lighthouse\validator --directory validator_keys as a normal user. That process is the one that creates the slashing_protection.sqlite file with these unexpected permissions, and they make the VC block if you run it under a different account than the one that ran the account_manager validator import command.
It would be nice if the VC would error out with a message instead of blocking if it does not have permission to access the slashing_protection.sqlite file on Windows. I'm not sure the default OWNER RIGHTS permission on the slashing_protection.sqlite file is needed. I think those permissions could be relaxed.
It would be nice if the VC would error out with a message instead of blocking if it does not have permission to access the slashing_protection.sqlite file on Windows.
It looks like we have a solution to this over in https://github.com/sigp/lighthouse/pull/2436 :tada:
Seems great! Don't forget to relax the permissions on the slashing_protection.sqlite file. There is no need for them to be as restricted as they are now when created on Windows.
Yeah, let's leave this issue open as a way to track the permissions changes for Windows
I'll add that even the logger sets the log files as owner-only.
This is quite annoying if you want to run lighthouse as a service and then use another user to look at the logs.
Description
VC blocks on SlashingDatabase::open when running with NSSM as a service on Windows. It does not run properly and it cannot attest. It does not leave SlashingDatabase::open.
Version
Lighthouse v1.4.0-rc.0-f6280aa
BLS Library: blst
Specs: mainnet (true), minimal (false), v0.12.3 (false)
Unstable
Windows 10 (10.0.19043 Build 19043)
rustc 1.52.1 (9bc8c42bb 2021-05-09)
commit f6280aa66308bbec590f3c1f2857f46d79b4af94
Microsoft (R) C/C++ Optimizing Compiler Version 19.29.30037 for x64
NSSM 2.24-101-g897c7ad 64-bit 2017-04-26
Present Behaviour
When running with NSSM as a service on Windows, VC starts, displays the following logs:
and stops/blocks. When debugging a little further, it blocks when entering SlashingDatabase::open.
Expected Behaviour
VC should run fine even under NSSM as a service on Windows just like it does when it does not run under NSSM.
Steps to reproduce
[...] (⊞ Win+R, type powershell, press Ctrl+⇧ Shift+↵ Enter and click Yes at the User Account Control window)
[...] ↵ Enter:
[...] (⊞ Win+R, type cmd, press Ctrl+⇧ Shift+↵ Enter and click Yes at the User Account Control window)
[...] ↵ Enter at the end of the line):
[...] Y and press ↵ Enter to run it.
[...] (⊞ Win+R, type cmd, press ↵ Enter).
[...] ↵ Enter at the end of the line):
[...] http://localhost:5051 [...]
[...] ↵ Enter at the end of the line):
[...] C:\ethereum\var\log\lighthousevalidator-service-stderr.log to find out that the last log message is something like: