coreybutler / node-windows

Windows support for Node.JS scripts (daemons, eventlog, UAC, etc).
Other
2.83k stars 356 forks source link

Service fails to start first time on Windows Server 2016 #318

Closed stevebaxter closed 2 years ago

stevebaxter commented 2 years ago

Issue:

We have a weird problem. Our service is installed using an Innosetup installer which installs all the files and then starts the service. On Windows 10 and Windows 11 this works flawlessly - when the installer exits the service is installed and started.

On Windows Server 2016 (unfortunately our target OS), the service is installed, but isn't started. The thing is, node-windows thinks it is started. This is our install code:

const log = new EventLogger({ source: "redacated" });

const args = minimist(process.argv.slice(1), {
  // We pass on anything after -- to the service
  "--": true
});

const serviceConfig = {
  name: "redacted",
  description: "redacted",
  script: path.join(__dirname, "index.js"),
  scriptOptions: args["--"]?.join(" "),
  stopparentfirst: true
};

const service = new Service(serviceConfig);

if (args.install) {
  log.info("Installing as a service");

  service.on("install", () => {
    log.info("Starting service");
    service.start();
  });

  service.on("start", () => {
    log.info("Service is started");
  });

  service.on("error", (err) => {
    log.error(`Error installing or starting service: ${err}`);
  });

  service.install();
} else if (args.uninstall) {
  log.info("Uninstalling service");

  service.on("uninstall", () => {
    log.info("Service uninstalled successfully");
  });

  service.on("error", (err) => {
    log.error(`Error uninstalling service: ${err}`);
  });

  service.uninstall(10);
}

When this is run by the installer we get the following logged in Event Viewer:

  1. Installing as a service
  2. Starting service
  3. Service is started

However the service is not started, it is in the stopped state. If the service is started from the "Services" snap-in, it starts fine.

This isn't a disaster for us, it's just an irritation. If there is a simple explanation it would be good to fix it though.

Expected Behavior: The service is started after install.

Desktop: Windows Server 2016

coreybutler commented 2 years ago

First, consult the wiki: https://github.com/coreybutler/node-windows/wiki#my-script-installs-but-wont-start-with-no-error

If that doesn't help, there are a few places to look for additional logs. First, check the daemon directory where node-windows creates the exe file. There will likely be an error log there with further detail. Alternatively, the Windows Event Viewer may have additional information.

Aside from the path issues described in the wiki, permissions are often a culprit. Sometimes the account used to install the service has different permissions than the account used to run the service. Sometimes accounts don't have access to read/write to a specific directory. Sometimes it's a Node version issue (i.e. stricter security restrictions in a minor release). It's unclear whether any of these permission pitfalls fit this situation, but I'd guess the __dirname reference is more likely to be the culprit.

stevebaxter commented 2 years ago

Thanks so much for the fast reply!

So __dirname is definitely an absolute path (I've logged it to check). Nothing is logged in daemon/, the only files there are the .exe, .config and .xml.

In Event Viewer I get (in reverse order):

<redacted> is started
Starting <redacted>
Installing <redacted> as a service
Script path is C:\Program Files (x86)\<redacted>\dist\index.js

This is all my logging. Nothing is logged by winsw or the wrapper (as would normally happen on install).

So it looks like NET START is succeeding (https://github.com/coreybutler/node-windows/blob/master/lib/daemon.js#L674), but actually it isn't.

I'm suspecting some account or permissions problem. I'll add some more logging to see exactly what account the installer is running my service start script under.

Update: The installer is definitely running the startup script under an admin account (isAdminUser() is returning true).

stevebaxter commented 2 years ago

So I have got to the bottom of the problem. The Windows Server 2016 instance is running as a VM on a fairly slow private cloud - the issue is that the delay here just isn't long enough. If I sleep for 20 seconds rather than 2 then the service starts normally.

I'm guessing that we probably need to increase that value, or find a way of detecting that the service install process has really finished?

stevebaxter commented 2 years ago

For anyone else that has the same problem, this is what I did to work around it:

const service = new Service(serviceConfig);

if (args.install) {
  log.info("Installing service");

  service.on("install", () => {
    log.info("Installed service");

    // On slow machines, we can get the install event before install has really finished,
    // this prevents our service from actually starting. Sleep for a while before trying
    // to start (see https://github.com/coreybutler/node-windows/issues/318#issuecomment-1232801990
    // for more information).
    setTimeout(() => {
      log.info("Starting service");
      service.start();
    }, 10000);
  });

  service.on("start", () => {
    log.info("Service is started");
  });

  service.on("error", (err) => {
    log.error(`Error installing or starting service: ${err}`);
  });

  service.install();
} else if (args.uninstall) {
  log.info("Uninstalling service");

  service.on("uninstall", () => {
    log.info("Service uninstalled successfully");
  });

  service.on("error", (err) => {
    log.error(`Error uninstalling service: ${err}`);
  });

  service.uninstall(10);
}
coreybutler commented 2 years ago

Since this has been resolved, I'm closing. If anyone else runs into this issue, feel free to open a PR. It would make sense to have a configurable sleep time defaulting to 2 seconds.