pact-foundation / pact-js-core

Core binaries for pact-js, a Contract Testing Framework. NOTE: If you are looking to do Pact contract testing in node, you almost certainly want pact-js, not pact-node.
https://docs.pact.io
MIT License
150 stars 80 forks source link

Behind a firewall pact-node fails to install #88

Closed vrish88 closed 6 years ago

vrish88 commented 6 years ago

We have a CI environment behind a firewall, so we fail to install pact-node because it tries to download the server binaries from github.

Is there a way to handle an install behind a firewall? I can think of a couple ways we could get around this. There may be a way to specify a location to download the binaries from (like an on-prem artifactory server or something). Alternatively, we could download the binaries and put them into our codebase and (somehow) tell pact-node where they are on an install.

mboudreau commented 6 years ago

Well, first off, I would say that this issue wouldn't be only for pact, but any node modules as they would have to be downloaded from npm regardless.

The best option for you, to reduce maintenance costs, would be to whitelist github releases domain on your network.

The other option would be to do npm install @pact-foundation/pact-node on your work computer, then commit the resulting directory node_modules/@pact-foundation/pact-node. However, this is going to be a pain in the neck moving forward of you want to update the version and have to manually change it, update and commit the resulting files.

I would argue that this workaround makes the CI firewall completely useless since you can still have 3rd party binaries in git, circumventing any security the firewall provides. I would try to change that policy internally since this isn't conducive for quick innovation.

On Wed., 25 Apr. 2018, 1:29 am Michael Lavrisha, notifications@github.com wrote:

We have a CI environment behind a firewall, so we fail to install pact-node because it tries to download the server binaries from github https://github.com/pact-foundation/pact-node/blob/master/standalone/install.ts#L124 .

Is there a way to handle an install behind a firewall? I can think of a couple ways we could get around this. There may be a way to specify a location to download the binaries from (like an on-prem artifactory server or something). Alternatively, we could download the binaries and put them into our codebase and (somehow) tell pact-node where they are on an install.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/pact-foundation/pact-node/issues/88, or mute the thread https://github.com/notifications/unsubscribe-auth/AAjA5AUK6PpKnBDOwKO-Vc20fG-eJfu1ks5tr0TigaJpZM4Th6Pc .

mefellows commented 6 years ago

What we're doing is pretty non-standard though Michel, so I can see why this is coming up. It's not the first time time too (see pact-foundation/pact-js/issues/168). Also see this discussion for some debugging ideas.

Is it that the GitHub URL isn't downloadable (can you download that URL manually yourself?). It might just he that certain environment vars need to be exposed to the process and we need to update our docs to help people.

Behind corporate firewalls, you typically need to either a) go out via an outbound proxy (transparent or explicit) or b) hosts + ports are blocked. It seems unlikely that the latter is blocked, as GitHub is pretty crucial to most workplaces these days.

So it's possible it's just a proxy setting that is required to be passed in. i.e. if you can download the file manually, then pact-node should be able to. If it can't, then perhaps pact-node just needs to be made proxy aware.

E.g. setting the variants of HTTPS_PROXY should be all that is required, on the assumption that (a) is the issue.

vrish88 commented 6 years ago

@mboudreau right now we've got holes poked in the firewall for npm/yarn repositories so we are able to download js dependencies. If we had control over the firewall we totally would whitelist the github releases domain .

We are able to download the npm dependencies just fine in our CI env. We can download the resources from github on our local development machines. Unfortunately it's our CI environment that blocks github. For our current situation we have the ability to upload resources elsewhere. So it would be janky and manual, but we could download them locally, and then upload them somewhere our CI could get them.

To enable that it could be as simple as replacing https://github.com/pact-foundation/pact-node/blob/357d615f687708f55ba57d71dd4f0981d367ca81/standalone/install.ts#L123-L130 with something like

const url = process.env['PACT_STANDALONE_BINARIES_URL'] || `https://github.com/pact-foundation/pact-ruby-standalone/releases/download/v${PACT_STANDALONE_VERSION}/${binary}`
return Promise.resolve({
    url: url,
    filepath: path.resolve(__dirname, binary),
    platform: platform,
    arch: arch,
    isWindows: "win32" === platform,
    platformFolderPath: path.resolve(__dirname, getPlatformFolderName(platform, arch))
});

Thoughts?

mboudreau commented 6 years ago

Sorry for the delayed comment. Can't say I'm a fan of using an environment variable that anyone can hijack to download, extract and use a binary from an unknown source without the user's knowledge; it's a massive security risk and I'm not going to risk our user's security just for a firewall workaround for a singular situation.

I frankly can't think of a plausible solution right now that will work for your situation. I mean, I would personally question the need to lock down a CI environment that strictly since any security issues that would arise from allowing the github.com domain will be the same by allowing npmjs.org, since they're both repositories that can contain binaries, and unless they look at every single dependency/binary being used in the project, it will not stop any malicious code. This firewall rule is simply adding more work on us instead of actually solving a security issue, and I would postulate is actively decreasing productivity without increasing any security value.

@mefellows we're not the only one that does something similar to this, however, most also don't have an OS/architecture specific binary they need to download. If we are to fix this for this case, we'd need to upload the binary for each OS to NPM like we were doing before, just for this one edge case, which seems like a lot of trouble for no benefit.

mefellows commented 6 years ago

@vrish88 I'm hoping we can find you a solution, but I suspect we'll need to find another way. As per before, can you please help us confirm about how GitHub is blocked? I would suggest talking with your security people, it seems more likely that GitHub isn't actually blocked, it just requires access through a proxy. If that's the case, we can help you as there are standard patterns to make this work.

If you could let us know that would be much appreciated.

uldissturms commented 6 years ago

Facing the same issue. Env variables is a sane solution IMO, hence PR #89. @mboudreau - what other options do we have for people that would love to host dependencies themselves? NPM is not the only source for packages - some setups have their internal NPM and artifact registries.

mboudreau commented 6 years ago

@uldissturms it's not sane to create a security hole for all users from a security standpoint, which is kind of the whole reason for this thread in the first place.

Best option if you're going to have arbitrary firewall rules, is to install pact-node locally then install all binaries by doing node node_modules/@pact-foundation/pact-node/postinstall.js darwin; node node_modules/@pact-foundation/pact-node/postinstall.js win32; node node_modules/@pact-foundation/pact-node/postinstall.js linux ia32; node node_modules/@pact-foundation/pact-node/postinstall.js linux x64, and finally, committing the everything into source control. When your CI runs, it will figure out which binary to use appropriately. This is a much more secure way of handling your binaries instead of creating a security hole for all our users.

If your firewall also blocks your development computer as such that you cannot even install the binaries, well, your security practices are problematic for getting your job done.

uldissturms commented 6 years ago

This will not work in the current setup we've got unfortunately - building node modules solely from package.json and lock files in a separate docker container and sharing volume from that container later on. Everything except those two JSON files is ignored.

Couple options we're considering at the moment - all ugly:

Some libraries (nodemon, standard, babel, etc.) provide package config section. What do you think of this option?

mefellows commented 6 years ago

@uldissturms do you mean something like this or this?

So for Pact, it might look like allowing users to configure it with:

package.json

{
  ...
  "pact-node": {
      "standaloneBinary": "/path/to/local/distribution"
  }
}

I don't see a problem with this security-wise (if you can't control your running environment then you have all kinds of other security issues), but I do see an issue with pact-node drifting from local copies of the standalone CLI tools. Right now, we are guaranteed to keep them in sync as we control the distribution. We'll need to implement some logic (as we do in pact-go) to detect the current expected versions of the CLI and what is currently available on the path, and warn/fail if they get out of sync due to undefined behaviour.

If we can deal with the drift issue, I would be happy to accept a PR that takes the configuration from an explicit package.json directive.

uldissturms commented 6 years ago

@mefellows yes, cool, are you happy that this standaloneBinary property can be either local path or URI? Would be good to have it as URI so that we can use the same download logic present in install.ts.

uldissturms commented 6 years ago

Dwelling more on the same issue... Libraries such as PhantomJS and ChromeDriver don't seem to mind having an option to download artifacts from custom CDN / local locations. Would be interesting to see what were they security reasoning.

https://www.npmjs.com/package/phantomjs-prebuilt https://www.npmjs.com/package/chromedriver

Putting something like the following in .npmrc seems particularly appealing: pactnode_cdnurl=...

Mentioned modules come with lots of options:

Deciding Where To Get PhantomJS By default, this package will download phantomjs from our releases. This should work fine for most people.

Downloading from a custom URL If github is down, or the Great Firewall is blocking github, you may need to use a different download mirror. To set a mirror, set npm config property phantomjs_cdnurl.

Alternatives include https://bitbucket.org/ariya/phantomjs/downloads (the official download site) and http://cnpmjs.org/downloads.

npm install phantomjs-prebuilt --phantomjs_cdnurl=https://bitbucket.org/ariya/phantomjs/downloads Or add property into your .npmrc file (https://www.npmjs.org/doc/files/npmrc.html)

phantomjs_cdnurl=https://bitbucket.org/ariya/phantomjs/downloads Another option is to use PATH variable PHANTOMJS_CDNURL.

PHANTOMJS_CDNURL=https://bitbucket.org/ariya/phantomjs/downloads npm install phantomjs Using PhantomJS from disk If you plan to install phantomjs many times on a single machine, you can install the phantomjs binary on PATH. The installer will automatically detect and use that for non-global installs.

mefellows commented 6 years ago

I like this option best, and yes, local or http based seems reasonable to me. @mboudreau if it's good enough for chrome...?

@uldissturms would you be willing to submit a PR that also performs the version check logic? I'm flat out with other bits at the moment

uldissturms commented 6 years ago

@mefellows yes, IMO version check will happen by the fact that we look for a file that contains version in its name. I think it makes sense to be able to provide custom version through the same mechanism as well. Just waiting to hear back if @mboudreau is happy with the change - he was opposing strongly.

mefellows commented 6 years ago

Yep, letter ensure we have considered this from all angles before jumping in, Michel often finds another perspective

mboudreau commented 6 years ago

After some thought and talking to a security expert colleague of mine, we both agree that environment variables are a terrible way to handle binaries since it would be too easy to change it from another package and point to another binary that does something malicious.

What we did end up on is either having a pact.config file at the root of the project or having the config as part of package.json, as per your suggestion, since this would have to be explicitly set and cannot be as easily tampered with from a 3rd party library.

If this sounds acceptable, I can start working on this.

uldissturms commented 6 years ago

Config in package.json would work for us.

Not using an env variable for CDN URL still leaves this module open to other env var attacks: HTTP_PROXY / HTTPS_PROXY that are respected by request module.

I imagine malicious package would just download the bins and execute not worrying too much about altering other modules. Or modify other modules in a ways beyond env variable substitution. (monitor files + sed?)

Integrity checking the download could help a bit. npm have it built in now - maybe publishing archives as npm packages and having them as peer deps is not that bad after all. As far as maintenance - wouldn’t it be automated / scripted anyways?

mefellows commented 6 years ago

@uldissturms the problem with peer dependencies is that we have a separate one for each OS/distro, and use optional deps instead - this just meant more npm package management and publishing and bigger downloads (as npm package-log.json would contain an OS specific package, and then fail on a linux CI, for instance).

It was a mess, and there are a few issues around it.

This approach is 👍 for me.

uldissturms commented 6 years ago

@mefellows - thanks for clarifying, I can see the issue now. The amount of deps could grow to the number of platforms x architectures of contributors. Personally I wouldn't mind if all the tars are in a single package as internet is cheap and we never take testing packages above CI.

Anyways, let me know if you would like a help on PR as the one I put in few days ago would need to be amended only slightly to read URL from package.json instead of env var.

mboudreau commented 6 years ago

You don't, but we've had complaints from others saying the exact opposite since pact-node used to be about 200megs all up. This way we make sure we split it up to just download the minimum required binary.

We could look into doing a checksum on the version for the binary, that way it locks the binary for that particular version needed in pact-node. This would also make the security of the configuration (env vars, config, whatever) a moot point.

On Fri, May 4, 2018 at 7:22 AM Uldis Sturms notifications@github.com wrote:

@mefellows https://github.com/mefellows - thanks for clarifying, I can see the issue now. The amount of deps could grow to the number of platforms x architectures of contributors. Personally I wouldn't mind if all the tars are in a single package as internet is cheap and we never take testing packages above CI.

Anyways, let me know if you would like a help on PR as the one I put in few days ago would need to be amended only slightly to read URL from package.json instead of env var.

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/pact-foundation/pact-node/issues/88#issuecomment-386440684, or mute the thread https://github.com/notifications/unsubscribe-auth/AAjA5D1vgEejdqIkHqPO4lBQIgrHxeLiks5tu3UPgaJpZM4Th6Pc .

mboudreau commented 6 years ago

Yep, I think I've decided to implement a checksum check on the binary with the ability to change the location. This way, we can pretty much guarantee that the correct binary is used.

Sorry this is taking a while and causing you pain. I'll try to get this done this week :)

mboudreau commented 6 years ago

@vrish88 @uldissturms There's a PR currently in play that will fix this issue for you. It's working pretty well when testing it out locally without compromising security of everyone by embedding the checksum within the package of the binaries.

Still a bit of work needed on it, but should go out soon enough.

uldissturms commented 6 years ago

thanks for the update @mboudreau.

mboudreau commented 6 years ago

@vrish88 @uldissturms I've packaged up the latest version and I was hoping you could try it out with your setup (place package at the root of your project, then change the package.json pact-node dependency to point to the package with "@pact-foundation/pact-node":"./pact-foundation-pact-node-6.17.0.tgz"). To configure pact to retrieve the binary from a particular location, set this config object in your package.json:

"config": {
    "pact_binary_location": "/your/binary/path"
}

It can also be a URL if that's how you're doing it. You'll need to download the 1.44.0 binaries from here:

It has to be the 1.44.0 binaries specifically or else it won't work.

Please try this before releasing this version to the wild. When doing my tests locally, this worked really well, but I don't know your setup or if I might have missed an edge case.

Thanks.

uldissturms commented 6 years ago

I can confirm that it works for me with local file path: npm i sumchecker && npm run postinstall from package dir.

Installing Pact Standalone Binary for darwin.
Downloading Pact Standalone Binary v1.44.0 for platform darwin from /Users/usturm/git/spike/pact-1.44.0-osx.tar.gz
Extracting binary from /Users/usturm/git/spike/node_modules/@pact-foundation/pact-node/standalone/pact-1.44.0-osx.tar.gz.
Checksum passed for 'pact-1.44.0-osx.tar.gz'.
Extraction done.

### If you ❤ Pact and want to support us, please donate here: http://donate.pact.io/node

Pact Standalone Binary is ready.

Thanks @mboudreau

mboudreau commented 6 years ago

@uldissturms Thanks for confirming. There's a slight issue on Windows that I need to fix and after that I'll release version 6.17.0. Cheers.

mboudreau commented 6 years ago

@vrish88 @uldissturms Version 6.17.0 is released. Feel free to use the latest version in your production system and let us know of any feedback you may have. Cheers.

vrish88 commented 6 years ago

Thanks a bunch for all the work! And yes, this is a way better solution than just an environment variable 😄

alshdavid commented 4 years ago

I am trying to install pact behind a private nexus npm repository which is inside a VPN.

When using

"config": {
  "pact_binary_location": "dist"
}

The install completes but it hangs on "Pact Standalone Binary is ready." under WSL2 & node 14/12

image

It will not proceed beyond this

EDIT: It works fine under Debian Linux on WSL2

TimothyJones commented 4 years ago

@alshdavid I suspect it might be a WSL problem rather than a pact problem, but running npm install --verbose might shed some light on what is going on.

If it still looks like a pact issue at that point, could you open a new issue with the log and (if possible) a reproducible example? Pact should work successfully in this case.