npm / npm

This repository is moving to: https://github.com/npm/cli
http://npm.community
17.54k stars 3.03k forks source link

npm@2.1.3 ENOENT #6382

Closed dylang closed 9 years ago

dylang commented 9 years ago

npm-debug.log

Since that's truncated I also have the full silly log as well. I'll update this ticket when I have it in a gist, it's big.

Edit: Here's the silly log. npm@2.1.3 ENOENT 264,673-line log.

dylang commented 9 years ago

Ran the same install on Jenkins and my mbp and it only happened on Jenkins.

Got the same error with 2.1.2.

dylang commented 9 years ago

Note, for the silly log above you'll need to git checkout the gist.

git checkout https://gist.github.com/78436df3febaf913ee75.git
othiym23 commented 9 years ago

Exxxcellent. I look forward to analyzing it with avid anticipation. Thank you!

othiym23 commented 9 years ago

Hot damn, the new error-checking I added has already earned its keep:

npm ERR! install trying to install 1.2.12 to /mnt/workspace/x-web-widgets/DYLAN-widget-neighbor-comparison/node_modules/angular-bundle
npm ERR! install but already installed versions [ '1.2.12' ]

Elsewhere:

npm sill install resolved     peerDependencies:
npm sill install resolved      { 'x-web-global-common-bundle': '^1.2.0',
npm sill install resolved        'angular-bundle': '1.2.12' },

This is totally a peerDependency problem. Now I just have to figure out how, and to fix it. This is the greatest night of the last few weeks.

dylang commented 9 years ago

This is the greatest night of the last few weeks.

:sweat_smile:

othiym23 commented 9 years ago

Knowing where a problem like this is makes it so much easier to solve, especially when I have a transcript I know how to read telling me how it got to that state.

othiym23 commented 9 years ago

However your ops team is configuring nginx in front of your npm-registry-couchapp instance, it appears to be stripping the etag headers from the request, preventing CouchDB from sending 304s in response.

dylang commented 9 years ago

I spent about half the night learning how to use and configure chef, vagrant, and nginx to see if I could get the cache settings fixed, but my 1 year old enjoys being up through the night even more than I do, another indeterminate bug I've been trying to solve. :smirk: Hope to get the cache settings fixed today and rolled out early next week. I'll let you know if it helps with the issues.

othiym23 commented 9 years ago

Registry cachebusting (which is what nginx is causing to happen right now) can exacerbate existing race conditions, but it's not the root cause of this bug. The reasons to do this are that you'll have less internal network traffic and builds will run faster.

othiym23 commented 9 years ago

If you get a chance, would it be a tremendous pain to check out the othiym23/inflight-uris-not-specs branch on one of the jenkins instances that's been having trouble and get me another log? That branch changes how it tracks requests in flight and has additional / tweaked logging.

I can see the race condition happening in the log you captured yesterday, but I can't see the cause from the logs itself. It's possible this new version will fix your builds, but my primary goal is to get more insight into what's happening.

dylang commented 9 years ago

sure thing, trying it now! I'm able to get the error on my laptop so it's even easier to report a log to you. Do you want the npm-debug.log and the entire silly log, or is one of the two enough?

othiym23 commented 9 years ago

Just the silly log, please – it's a strict superset of what's in npm-debug.log.

dylang commented 9 years ago

Is the logging the only change? With that branch I just had my first successful npm install in a while with this particular repo.

dylang commented 9 years ago

Also, do you want the log of a successful install?

othiym23 commented 9 years ago

@dylang The change to how inflight tracks which requests are pending will definitely make the caching process less chaotic. I'd be delighted if that fixed your build issues, but also a little suspicious, because it just minimizes the number of requests pending to the registry (i.e. reduces the chances of hitting race conditions) rather than making any deep changes to control flow (that would actually remove the race conditions).

Also, do you want the log of a successful install?

Only if it comes paired with a failed install so I can compare the two. ;)

dylang commented 9 years ago

@othiym23 - I have a delightful pairing for you.

This gist contains two npm install logs using the same package.json

  1. Silly log with ENOENT error. Ran on Jenkins slave.
  2. Silly log without error. Ran on my MBP with SSD on VPN pretty far from the wifi hub.

Also:

othiym23 commented 9 years ago

I like that the build on Jenkins is crashing at the same place on the same dependency. This means that it's a marginally deterministic race condition, which also means that it's probably findable / fixable. I may have another instrumented build for you next week. Enjoy your weekend!

othiym23 commented 9 years ago

@dylang Can you run this on the build server and tell me if it fixes the problem?

$ npm config set cache-lock-stale 3600000
$ npm config set cache-lock-wait 20000
$ npm config set cache-lock-retries 20

Here is a very small excerpt from the logs you sent me. There are no timestamps, so I can't be sure whether what's happening is a bug or a staleness timeout, but:

npm verb lock using /mnt/npm-cache/_locks/angular-bundle-ea67d6865b9164cd.lock for /mnt/workspace/x-web-widgets/DYLAN-test-bill-compare/node_modules/angular-bundle
npm verb lock using /mnt/npm-cache/_locks/angular-bundle-ea67d6865b9164cd.lock for /mnt/workspace/x-web-widgets/DYLAN-test-bill-compare/node_modules/angular-bundle
npm verb unlock done using /mnt/npm-cache/_locks/angular-bundle-ea67d6865b9164cd.lock for /mnt/workspace/x-web-widgets/DYLAN-test-bill-compare/node_modules/angular-bundle

The only way that you could see two acquisitions of the lock in a row without an unlock in between is either a bug in the lockfile module or the lock timing out as stale. Since the default stale timeout is a minute, this is entirely conceivable in a big, complicated build like yours, so I'm just boosting the limits for everything by quite a bit to see what happens.

This still leaves the issue of determining why two installs of the same module are happening to the same place during the same run, but that is less of a pressing issue if the build isn't actually breaking.

othiym23 commented 9 years ago

Found it. Fixed it.

Not only was there a race condition in the installer (and a pretty obvious one), but the problem did in fact involve craziness with peerDependencies. See this comment for details, but the short version is that when you have a peerDependency that isn't being directly depended upon by any of your packages, but is declared as a peerDependency by multiple packages at the same level of the tree, it will be queued up for installation multiple times.

Since we lock on install (mostly to keep multiple install processes from stomping on the same node_modules directory), you'll be protected from this in most cases, but in the case of your build servers, they take long enough that they exceeds the stale lock timeout, which is a minute.

The fix for all of this is pretty simple, it just replaces the broken custom tracking logic for ensuring only one thing is installing to a given folder with the same inflight method the cache uses to avoid wasting time. I came up with a repro case of my own and have gone over it pretty closely, so I'm confident that this fixes this problem, which has just been sitting there waiting to nail people for quite some time now.

I'm confident enough that I've finally dealt with this that I'm going to close this issue; check out othiym23/peer-deps-must-die to test the fix on your build servers. Assuming it does in fact fix the build problems you've been seeing, I'll merge the branch and make a new release early next week.

othiym23 commented 9 years ago

You can also address this issue by adding angular-bundle to the package.json of the build that's been failing, but what's the fun in that?

dylang commented 9 years ago

Wow, awesome! Amazing work!

$ npm config set cache-lock-stale 3600000
$ npm config set cache-lock-wait 20000
$ npm config set cache-lock-retries 20

This fixed a bunch of repos, but caused (or exposed) a new error with a few of them. This is using 2.1.3, I haven't tried your latest build yet.

npm ERR! missing: minimatch@~0.2, required by karma@0.12.24
npm ERR! missing: http-proxy@~0.10, required by karma@0.12.24
npm ERR! missing: mime@~1.2.11, required by karma@0.12.24
npm ERR! missing: source-map@~0.1.31, required by karma@0.12.24
npm ERR! missing: readable-stream@~1.0.2, required by log4js@0.6.21
npm ERR! missing: cookie@0.1.0, required by connect@2.12.0
npm ERR! missing: negotiator@0.3.0, required by connect@2.12.0
npm ERR! missing: multiparty@2.2.0, required by connect@2.12.0
npm ERR! missing: mime@~1.2.9, required by send@0.1.4
npm ERR! missing: range-parser@0.0.4, required by send@0.1.4

These are all things that are in the tree multiple times.

For example

jenkins@ip-10-40-64-233:~/workspace/x-web-widgets/api-neighbor-comparison-MASTER$ npm ls minimatch
npm info it worked if it ends with ok
npm verb cli [ 'node', '/home/jenkins/.npm/bin/npm', 'ls', 'minimatch' ]
npm info using npm@2.1.3
npm info using node@v0.10.26
npm verb node symlink /usr/bin/node
api-neighbor-comparison@1.4.1-0 /mnt/workspace/x-web-widgets/api-neighbor-comparison-MASTER
├─┬ grunt@0.4.5
│ ├─┬ findup-sync@0.1.3
│ │ └─┬ glob@3.2.11
│ │   └── minimatch@0.3.0
│ └── minimatch@0.2.14
├─┬ grunt-cli@0.1.13
│ └─┬ findup-sync@0.1.3
│   └─┬ glob@3.2.11
│     └── minimatch@0.3.0
└─┬ x-web-taskmaster@27.1.1
  ├─┬ glob@3.2.11
  │ └── minimatch@0.3.0
  ├─┬ grunt-build-bubble@1.7.7
  │ └─┬ transform-package@1.1.0
  │   ├─┬ fstream-npm@0.1.8
  │   │ └─┬ fstream-ignore@1.0.1
  │   │   └── minimatch@1.0.0
  │   └─┬ x-web-npm@2.1.5
  │     ├─┬ glob@4.0.6
  │     │ └── minimatch@1.0.0
  │     └─┬ npm@2.1.2
  │       └── minimatch@1.0.0
  ├─┬ grunt-contrib-jshint@0.10.0
  │ └─┬ jshint@2.5.6
  │   └── minimatch@1.0.0
  ├─┬ grunt-contrib-watch@0.6.1
  │ └─┬ gaze@0.5.1
  │   └─┬ globule@0.1.0
  │     └── minimatch@0.2.14
  ├─┬ grunt-jscs-checker@0.4.4
  │ └─┬ jscs@1.4.5
  │   └── minimatch@0.2.14
  ├─┬ grunt-mocha-test@0.10.2
  │ └─┬ mocha@1.18.2
  │   └─┬ glob@3.2.3
  │     └── minimatch@0.2.14
  ├─┬ grunt-object-storage@2.2.1
  │ ├─┬ glob@4.0.6
  │ │ └── minimatch@1.0.0
  │ ├─┬ object-storage-lookup@3.3.0
  │ │ └─┬ glob@3.2.11
  │ │   └── minimatch@0.3.0
  │ └─┬ transform-package@1.1.0
  │   ├─┬ fstream-npm@0.1.8
  │   │ └─┬ fstream-ignore@1.0.1
  │   │   └── minimatch@1.0.0
  │   ├─┬ glob@3.2.11
  │   │ └── minimatch@0.3.0
  │   └─┬ x-web-npm@2.1.5
  │     ├─┬ glob@4.0.6
  │     │ └── minimatch@1.0.0
  │     └─┬ npm@2.1.2
  │       └── minimatch@1.0.0
  ├─┬ grunt-task-test-harness@1.0.6
  │ └─┬ transform-package@1.1.0
  │   ├─┬ fstream-npm@0.1.8
  │   │ └─┬ fstream-ignore@1.0.1
  │   │   └── minimatch@1.0.0
  │   └─┬ x-web-npm@2.1.5
  │     ├─┬ glob@4.0.6
  │     │ └── minimatch@1.0.0
  │     └─┬ npm@2.1.2
  │       └── minimatch@1.0.0
  ├─┬ grunt-theme-firm-job@10.0.5
  │ ├─┬ sassage-factory@3.2.1
  │ │ ├─┬ glob@4.0.6
  │ │ │ └── minimatch@1.0.0
  │ │ ├─┬ node-sass@0.9.4
  │ │ │ └─┬ mocha@1.18.2
  │ │ │   └─┬ glob@3.2.3
  │ │ │     └── minimatch@0.2.14
  │ │ └─┬ stylestats@4.1.4
  │ │   └─┬ stylus@0.48.1
  │ │     └─┬ glob@3.2.11
  │ │       └── minimatch@0.3.0
  │ ├─┬ transform-package@1.1.0
  │ │ └─┬ fstream-npm@0.1.8
  │ │   └─┬ fstream-ignore@1.0.1
  │ │     └── minimatch@1.0.0
  │ └─┬ x-web-npm@2.1.5
  │   ├─┬ glob@4.0.6
  │   │ └── minimatch@1.0.0
  │   └─┬ npm@2.1.2
  │     └── minimatch@1.0.0
  ├─┬ grunt-theme-with-bubble@1.1.0
  │ ├─┬ transform-package@1.1.0
  │ │ └─┬ fstream-npm@0.1.8
  │ │   └─┬ fstream-ignore@1.0.1
  │ │     └── minimatch@1.0.0
  │ └─┬ x-web-npm@2.1.5
  │   ├─┬ glob@4.0.6
  │   │ └── minimatch@1.0.0
  │   └─┬ npm@2.1.2
  │     └── minimatch@1.0.0
  ├─┬ matchdep@0.3.0
  │ └─┬ globule@0.1.0
  │   └── minimatch@0.2.14
  └─┬ mocha@1.21.4
    └─┬ glob@3.2.3
      └── minimatch@0.2.14

npm verb exit [ 0, true ]
npm info ok
dylang commented 9 years ago

I'm trying the same package above with othiym23/peer-deps-must-die and I'll post a silly log if it has the same missing modules problem.

dylang commented 9 years ago

Good news! I re-ran a failed install (one that was missing dependencies) with othiym23/peer-deps-must-die and all dependencies were present.

dylang commented 9 years ago

Doh... I'm seeing missing dependencies with othiym23/peer-deps-must-die too.

missing-packages.

I'm running it again now to see if it happens again.

dylang commented 9 years ago

I ran it two more times and it had the same missing packages.

othiym23 commented 9 years ago

hahahahahahahaha sobs

Well, the good news is that this is a separate issue. The bad news is that it's probably a consequence of the last fix related to just allowing multiple installs to piggyback on one actual write of a module. I'll see what I can sift out from the logs (thanks for uploading them).

othiym23 commented 9 years ago

See #6401 for a continuation of the missing packages discussion.