chrisa / node-dtrace-provider

Native DTrace probes for node.js apps
Other
319 stars 70 forks source link

Improving compilation failure behaviour #96

Closed melloc closed 6 years ago

melloc commented 7 years ago

It seems that there are two primary goals that we want to accomplish:

(a) someone who wants the module available should know when it failed (b) someone who doesn't care about the module shouldn't have to worry about it

Currently, by printing out to stderr, node-dtrace-provider attempts to provide a compromise between the goals, but it's not a particularly great one. Specifically:

It seems that there are several alternatives that we should consider. They're all somewhat gross, and none are perfect, but I'm thinking that number 2, or maybe a combination of some of them is the best option. The alternatives are:

  1. Attempt to compile a test program

    node-dtrace-provider could first try to compile a simple test program on platforms it supports. If the test program successfully compiles then we attempt to build the node addon. If that fails, then the package install hard fails.

    This option assumes that users who don't have a working compiler toolchain don't care about building node-dtrace-provider. If you don't have a compiler installed, or you haven't agreed to the XCode license, you probably don't care.

    This has the drawback that sometimes after you upgrade Mac OS X, you need to agree to the license again, and probably want things to fail so that you'll notice that you need to re-agree. It also means that if you're on FreeBSD and haven't set up /usr/src the install will fail. If you don't have /usr/src installed, it might be because you don't care about building something like node-dtrace-provider.

  2. Environment variable

    node-dtrace-provider could check for an environment variable, REQUIRE_NODE_DTP or similar, during builds and requires. If present, it'll fail the build or the require. If not present, it'll fall back to the stub provider and not print anything. Build environments could then set this to detect build failures.

    This would mean that a user who doesn't care about node-dtrace-provider won't see messages from it on startup. It has the drawback that people who do want it to hard fail will need to put the environment variable into their environment, and is something of a "toggle this switch to make it work right" fix.

  3. Pass an option when constructing providers

    node-dtrace-provider could accept an option to createDTraceProvider() that would request the function to throw if building the addon failed.

    This would require libraries that use node-dtrace-provider to determine whether the consumer cares about DTrace probes, either by making the decision for them, or exposing an option that allows them to say whether they care. For deeply nested dependencies, trying to control this would require all intermediate libraries to expose a switch for an end user to control it. The end result would probably be that most of the time it would be set to off, obviating goal (a).

    This method would also mean that failures wouldn't happen immediately, and might come much later when something finally tries to use a failed build of node-dtrace-provider.

  4. Always fail install when build fails

    node-dtrace-provider could require that the build always succeeds on platforms with DTrace. Given that DTrace probes aren't actually required for program correctness but are there to provide additional observability into the program, this option isn't really nice to users who want to use a library that uses node-dtrace-provider, and don't care about whether DTrace probes are present, failing goal (b). This is then likely to result in libraries making node-dtrace-provideran optional dependency, which then means that compilation failures aren't obvious, failing goal (a).

Let me know if I've missed any pros/cons, or other alternative solutions.

/cc @trentm @davepacheco @jclulow

davepacheco commented 7 years ago

That's a great summary of the discussion we had. To elaborate on the two use cases that come up a lot, I think they're:

  1. A user uses a package that implicitly uses dtrace-provider, and wants the DTrace support, and wants it to fail at build time if it can't build the module and at runtime if it can't load the module. I think that's most DTrace-and-Node.js-using engineers.

  2. A user attempts to use that same package on a system where DTrace should work, but they may not have any of the setup required to actually build the module (e.g., no compiler). They want the install to work, and the program to work, likely without any scary messages (though possibly with a warning indicating that this functionality is missing).

Because the top-level command in both cases is likely npm install [-g] PACKAGE, there's not much room for the user to express their intent. That's how we got to the environment variable idea. I would prefer if case (1) were the default because you shouldn't have to opt into DTrace on a system that supports it, but I'm afraid by construction of use case (2) that we can't assume the user is savvy enough to have set an environment variable to disable the hard fail.

I think what I'm coming to is something like:

DTRACE_PROVIDER_REQUIRE = hard | best-effort

If the value is hard, then failure to build or import is always a failure. If the value is best-effort (the default), then we try to build the provider, but don't fail if we can't. And we try to import that module at runtime, but we don't fail if we can't. We could also support a skip option that would not even try.

I'll be interested to hear others' feedback on the broad plan.

trentm commented 7 years ago

Thanks for writing this up! Falling back to an envvar is painful :) but I don't have a better alternative, so I'm tending to agree with the use of an envvar (#2), plus perhaps explicit options to createDTraceProvider (#3).

Some details (partially to make sure we are on the same page):

ErisDS commented 6 years ago

Is there consensus here, and can we move forward with an implementation? An ENV VAR is an acceptable solution IMO.

I would also put forward that this message should not be printed except in development mode? Or at least not if the NODE_ENV is testing.

melloc commented 6 years ago

I refactored the build system and binding import to honor NODE_DTRACE_PROVIDER_REQUIRE=hard. To test out my changes, I have:

For testing the NODE_DTRACE_PROVIDER_REQUIRE behaviour:

melloc commented 6 years ago

I've also added a NODE_DTRACE_PROVIDER_REQUIRE=skip mode to skip attempting to require the binding.

melloc commented 6 years ago

PR at #112.

melloc commented 6 years ago

Merged in 1474252.

jingtaotan commented 6 years ago

Has node-gyp always been a requirement for building (npm install dtrace-provider) on Windows?

In 0.8.5 it didn't seem like it was, the platform detection was done in scripts/install.js, and the script returned early as soon as it detects Windows and does not try to execute node-gyp.

Our app uses restify which in turn uses dtrace-provider ~0.8. Since node-gyp wasn't actually used prior to 0.8.6 on Windows, our Windows build systems were not set up with Python etc needed by node-gyp. And as soon as 0.8.6 was released, our builds broke because NPM automatically took 0.8.6 to satisfy restify's ~0.8 dependency.

Solving this was not too difficult (just had to add an explicit dependency to dtrace-provider@0.8.5 in our app's package.json), just wanted to mention the implication of not bumping a minor version due to this change.

melloc commented 6 years ago

@jingtaotan Sorry for the trouble. Did the build break or did it just start emitting warnings? I made the "install" script:

node-gyp rebuild || node suppress-error.js

And in my Windows testing found that this would produce warnings about missing node-gyp requirements, but would exit 0 and allow the install to succeed.

melloc commented 6 years ago

I just tried again, from both Powershell and cmd.exe, and while node-gyp prints out an error message, the installation still succeeds and $LASTEXITCODE and %errorlevel% are 0.

dm17 commented 3 years ago

@jingtaotan Sorry for the trouble. Did the build break or did it just start emitting warnings? I made the "install" script:

node-gyp rebuild || node suppress-error.js

And in my Windows testing found that this would produce warnings about missing node-gyp requirements, but would exit 0 and allow the install to succeed.

Same - I'm on Linux though... So many hundreds of nasty confusing threads about node-gyp - why can't it be banished already? Is there a dtrace-provider alternative or way to use it without node-gyp gypping us?