nodejs / node-v0.x-archive

Moved to https://github.com/nodejs/node

Node's nested node_modules approach is basically incompatible with Windows #6960

Closed jonrimmer closed 10 years ago

jonrimmer commented 10 years ago

Node needs an alternative approach to endless, recursively nested node_modules folders on Windows. Most Windows tools, utilities and shells cannot handle file and folder paths longer than 260 characters. This limit is easily exceeded, and once it is, install scripts start breaking and node_modules folders can no longer be deleted using conventional methods.

This will only get worse as Node packages get more complicated, with bigger and deeper dependency hierarchies. There should be a way to install and run packages that does not use file system recursion to represent the dependency hierarchy.

Mithgol commented 10 years ago

Suggestion: to mitigate the above-mentioned problem, make Node.js version 1.0 (and the corresponding npm) use n_m (3 characters) instead of node_modules (12 characters) by default. This change is expected to double the possible hierarchy depth for modules with names containing nine letters or fewer.

If backwards compatibility is necessary, node_modules could still be supported (though not as the default). For example,

OrangeDog commented 10 years ago

That doesn't solve the problem, just gives you slightly more breathing room.
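
To put a rough number on that breathing room, here is a minimal sketch of the arithmetic. It assumes a 260-character limit, a hypothetical short project root of `C:\g\`, and nine-letter module names at every level; `maxNestingDepth` is an illustrative helper, not part of Node or npm, and it ignores the room needed for file names at the bottom of the tree.

```javascript
// Rough estimate of how many nesting levels fit under a path-length limit.
// All inputs are illustrative assumptions, not measurements.
function maxNestingDepth(limit, rootLength, containerName, moduleNameLength) {
  // Each level adds "<container>\<module>\": both names plus two separators.
  const perLevel = containerName.length + 1 + moduleNameLength + 1;
  return Math.floor((limit - rootLength) / perLevel);
}

const MAX_PATH = 260;   // limit quoted in the thread
const root = 'C:\\g\\'; // hypothetical short project root (5 characters)

const withNodeModules = maxNestingDepth(MAX_PATH, root.length, 'node_modules', 9);
const withShortName = maxNestingDepth(MAX_PATH, root.length, 'n_m', 9);

console.log({ withNodeModules, withShortName });
```

Under these assumptions the shorter name allows about 18 levels instead of 11, which is somewhat less than double, and the ceiling is still there either way.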

The only real solution is going to be to have all modules at a single level, requiring each other, instead of all having private copies of their dependencies. Either the modules would have to be versioned, or module maintainers would need to be more careful about breaking changes (or both). Most package managers work this way, and seem to be getting along nicely.

indutny commented 10 years ago

As an idea: node_modules/module@version

Mithgol commented 10 years ago

The module@version solution eliminates the problem much better than n_m, but not without introducing a bunch of other problems.

Example: if somemodulename@1.2.3 and somemodulename@4.5.6 are both present, what should npm update somemodulename do?

In the same example, what should npm uninstall somemodulename do?

randunel commented 10 years ago

@Mithgol it should probably install somemodule@la.te.st, alongside the others. But npm purge or npm clean should probably be added.

OrangeDog commented 10 years ago

@Mithgol if the latest version is 4.5.6 do nothing, otherwise add the latest version. If you can be clever and determine that a previous version is no longer needed then remove it, otherwise don't worry about it.

@randunel somemodule@latest would be far more sensible if there's an actual need for it, though for backwards-compatibility, I'd guess somemodule would symlink to the latest anyway.

jdalton commented 10 years ago

:+1:, devs are running into this with individual lodash packages too. See https://github.com/lodash/lodash/issues/501, https://github.com/gulpjs/gulp-util/pull/23, & twitter/theporchrat.

There looks to be the start of addressing the issue here, but it may need to be used in other methods. Related to #6891.

jzaefferer commented 10 years ago

Some docs for the code @jdalton linked to: http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#maxpath

Mithgol commented 10 years ago

I guess Node's developers would have to rewrite both require() and npm to use \\?\… paths. If that's possible, the 260-character limitation becomes void, even without resorting to n_m or @version.

(Maybe not exactly require() and npm, but rather the entire underlying fs module, also process.cwd(), etc.)

jonrimmer commented 10 years ago

Even if all of Node and NPM are rewritten to use long paths, so long as NPM supports scripts there is a danger of them randomly blowing up when run on Windows due to finding themselves inside a very deep hierarchy then trying to run utilities that don't support long paths.

Mithgol commented 10 years ago

@JonRimmer Many third-party scripts for NPM (such as Grunt, or Mocha, or node-pre-gyp) are Node.js scripts and thus they are to become automagically fixed when Node.js core modules (such as fs and path) and path-related methods (such as process.cwd()) all start using \\?\… unilaterally.

The other utilities would indeed break and thus the community would have to replace them or demand upgrades. (A similar ruckus once happened when Node.js started supporting Windows and many UN*X-only tools and CLI scripts experienced a lot of incompatibility-related problems. There was much suffering; however, the community eventually fixed the problems or developed some workarounds, sometimes as simple as using path.join instead of a former + '/' +. Or using os.EOL instead of a former \n.)
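
As a small illustration of the two fixes named above, the following sketch contrasts hard-coded separators and line endings with `path.join` and `os.EOL`. The directory and file names are placeholders chosen for the example.

```javascript
const os = require('os');
const path = require('path');

const dir = 'node_modules';  // placeholder directory name
const file = 'package.json'; // placeholder file name

// Fragile: hard-codes the POSIX separator and line ending.
const hardCodedPath = dir + '/' + file;
const hardCodedText = ['first line', 'second line'].join('\n');

// Portable: let Node pick the separator and line ending for the host platform.
const portablePath = path.join(dir, file);
const portableText = ['first line', 'second line'].join(os.EOL);

console.log(hardCodedPath, portablePath);
```

On a POSIX host both variants produce the same strings; the difference only shows up on Windows, which is why this kind of bug tends to go unnoticed until someone runs the script there.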

langri-sha commented 10 years ago

Hello everyone! I was unpleasantly surprised when I found out our module was hitting an unexpected wall in Windows host environments. If anyone needs a quick remedy:

domenic commented 10 years ago

FYI Node (and thus npm) always uses UNC paths internally, before actually calling into the filesystem. It is only third-party tools that have a problem. (Unfortunately one of those third-party tools is Windows Explorer, but, that's Microsoft's bug...)

jdalton commented 10 years ago

> FYI Node (and thus npm) always uses UNC paths internally

I do see the _makeLong() helper used a lot in fs, though. Is there a chance something was missed? Is it handled in require?
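
For readers who want to see what the prefixing looks like, here is a hedged sketch. It uses `path.toNamespacedPath()`, which to my understanding is the public name later given to the `_makeLong()` helper and which postdates this thread. The input path is hypothetical, and `path.win32` is used explicitly so the result does not depend on the host OS.

```javascript
const path = require('path');

// Hypothetical absolute Windows path, used only for illustration.
const ordinary = 'C:\\project\\node_modules\\some-module\\index.js';

// For an absolute drive-letter path, toNamespacedPath() prepends the \\?\
// prefix that opts in to the extended-length path form.
const extended = path.win32.toNamespacedPath(ordinary);

console.log(extended);
```

This only shows the string transformation; whether a given tool accepts the resulting path is exactly the question the rest of the thread argues about.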

jonrimmer commented 10 years ago

@domenic Microsoft have said many times that MAX_PATH limitations are not considered a bug, just a feature of Windows. The file system supports long paths, the OS does not, there is a difference.

If Node's intention is to support NTFS, then long paths are fine. If the intention is to support Windows, then there needs to be an alternative to long paths.

domenic commented 10 years ago

@JonRimmer The OS does support long paths, or at least all API methods that libuv uses support long paths. So the distinction you are trying to make does not apply.

@jdalton require delegates to fs.

jdalton commented 10 years ago

@domenic Cool.

I'm experimenting with this on my Windows 8.1 machine (I haven't hit this issue on my own yet):

jonrimmer commented 10 years ago

@domenic Microsoft: "In the Windows API (with some exceptions discussed in the following paragraphs), the maximum length for a path is MAX_PATH, which is defined as 260 characters." [1]

Some (but not all) Windows APIs support, as an alternative, Unicode paths up to 32,767 chars, but this is an alternative - not the default. It is not the option used in many Windows utilities including core parts of the OS. Claiming, therefore, that Windows supports long paths is like claiming old NT was Unix compatible because there is a POSIX subsystem, or claiming Linux is compatible with the Win32 APIs because you can install Wine. It is an entirely disingenuous attempt to misrepresent the actual reality. Windows as an OS does not support long paths. That it is possible to write Windows compatible software that supports long paths does not change that fact.

[1] http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx

domenic commented 10 years ago

I don't really care about whatever word games you want to play around the word "support." Sure, if you wish, all I'm saying is that Node and npm are Windows compatible software that support long paths.

jonrimmer commented 10 years ago

You are the one playing games - calling core parts of Windows like Explorer "3rd party tools", and suggesting that not supporting long paths is a bug. Microsoft have made it clear repeatedly that non-support for long paths is not a bug, and not something that will change.

A package manager creating paths that do not work with the majority of the software written for an OS, then claiming compatibility with the OS, is playing games, at your users' expense.

edef1c commented 10 years ago

I guess we'll have to stick with just being compatible with node and npm for Windows then. To argue that the node module system should change due to deficiencies in parts of Windows is a dead end.

matt-bernhardt commented 10 years ago

Well, I guess that writes off my ability to use node (or at least some node projects) under a Windows environment, then? I've been attempting to merge a repository elsewhere on Github that uses node, and the merge command fails due to the length of the pathnames.

Maybe there's a workaround I can use (seriously, I'm open to suggestions) - or maybe you can argue that the problem lies with how the Git command line deals with directory paths - but the end result is the same: I can't work with this node project under Windows.

OrangeDog commented 10 years ago

Nested dependencies shouldn't be in git repositories. Each dependency has its own repository, and npm assembles them. You could always open a ticket with git to add support for long paths, just like node does.

dougwilson commented 10 years ago

Or you can check out the git repo directly under C: ?

matt-bernhardt commented 10 years ago

@OrangeDog - I'm looking into ways that I can work with this repository specifically - I'm not sure how it came to be this way, but it is unworkable currently. My completely-off-the-cuff speculation would be trying to .gitignore the node_modules directory, trusting the node system to build the deep directories correctly.

@dougwilson - that was the first thing I tried, but no dice - even putting the repo at c:/g/ the paths are still too long:

c:/g/node_modules/grunt-contrib-imagemin/node_modules/image-min/node_modules/gifsicle/node_modules/bin-wrapper/node_modules/download/node_modules/request/node_modules/form-data/node_modules/combined-stream/node_modules/delayed-stream/test/integration
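
A quick way to inspect a path like this is to measure it. The sketch below is an illustrative helper (`describePath` is not an existing API) that reports a path's length, how many node_modules levels it contains, and how much of an assumed 260-character limit is left for anything below it.

```javascript
// Report how long a path is and how many node_modules levels it contains.
// The 260 default mirrors the MAX_PATH figure quoted earlier in the thread.
function describePath(p, limit = 260) {
  const segments = p.split(/[\\/]/).filter(Boolean);
  return {
    length: p.length,
    nestingDepth: segments.filter((s) => s === 'node_modules').length,
    remaining: limit - p.length, // characters left for anything below this path
  };
}

// Hypothetical short example; substitute any path from your own tree.
const report = describePath('c:/g/node_modules/a/node_modules/b');
console.log(report);
```

Running it on the directory above shows how much of the limit remains for the file names inside it.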

bmeck commented 10 years ago

Just a comment. This appears to be problems with tools that do not support long paths in Windows; is this truly a Node issue or just a common problem with Windows applications?

Mithgol commented 10 years ago

So, Node and npm both work with long paths. That's good. But… what if a module contains a C/C++ addon? Can Python and Visual Studio actually build it if a path is long?

isochronous commented 10 years ago

The most annoying part of this bug to me is that I can't move or delete folders with long path names inside of them. To do so, I either have to crawl the directory tree manually, slicing out portions and pasting them into a temp folder, or resort to a cygwin shell, which will actually move/delete even very long path names.

One of the things that struck me was that people seem to rely strongly on all of these minuscule lodash mini-packages. Is lodash truly so large, and your requirements truly so strict, that you can't just require all of lodash? If you require a certain lodash mini-package, that package will require other lodash mini-packages, which in turn require their own lodash mini-packages, to the point where you've basically included half of lodash, and your startup times are going to be far higher than they would have been with the full lodash package just from having to load so many individual modules.

The gulp-jshint module is a good example of this. Install it and then trace down the node_modules tree, and see how ridiculously far down the rabbit hole goes.

jdalton commented 10 years ago

> One of the things that struck me was that people seem to rely strongly on all of these minuscule lodash mini-packages. Is lodash truly so large, and your requirements truly so strict, that you can't just require all of lodash?

Modules are great and large dep graphs can and do happen. Trying to poo-poo a dev for using small reusable modules is wonky. Things should "just work".

> If you require a certain lodash mini-package, that package will require other lodash mini-packages, which in turn require their own lodash mini-packages, to the point where you've basically included half of lodash, and your startup times are going to be far higher than they would have been with the full lodash package just from having to load so many individual modules.

There are pros & cons to monolithic vs individual module vs bundles of modules.

> The gulp-jshint module is a good example of this. Install it and then trace down the node_modules tree, and see how ridiculously far down the rabbit hole goes.

To avoid this issue in the next release Lo-Dash core is inlining ~60 functions into various modules which greatly reduces the dep graphs (many to 0 deps) at the cost of duplication. It's a bummer of a band-aid but it's the state of things at the moment.

othiym23 commented 10 years ago

> To avoid this issue in the next release Lo-Dash core is inlining ~60 functions into various modules which greatly reduces the dep graphs (many to 0 deps) at the cost of duplication. It's a bummer of a band-aid but it's the state of things at the moment.

From a maintainability and quality perspective this kind of sucks, but from a performance perspective this is a win, because those functions will be defined in the same V8 context and can thus be inlined when the various pieces of lodash get hot, which can't be done when helpers are defined in separate modules.

jdalton commented 10 years ago

> From a maintainability and quality perspective this kind of sucks,

Generally yes, but for Lo-Dash maintainability and quality aren't a problem because it's all generated from the same monolithic reference source and all modules are run through the same unit tests. So for me it was just adding a list of always-inlined modules and adding a heuristic to inline functions with only one dependent. Even though functions are inlined the individual modules for each are still created so devs can still use every bit and bob.

> but from a performance perspective this is a win, because those functions will be defined in the same V8 context

Good to know.

jdalton commented 10 years ago

Moving the thread back on track. About two weeks ago I did some experiments to narrow down the issue. I couldn't come up with a case where Node/npm failed; however, I did find gotchas in the Windows UI/command-line.

For devs running into these issues is there a recommended workaround or best practice? If so, is it documented?

A bigger question: is there another dependency structure that Node could use that would avoid the nesting issue?

phated commented 10 years ago

Haven't read all of this thread, but couldn't this be solved by NPM doing a dedupe on postinstall or at least allowing the npm dedupe command to have a --save option to save the deduped tree?

mstade commented 10 years ago

Aside from being a substantial piece of work, are there any real no-can-do issues with changing the structure to be flat, like the previously suggested node_modules/module@version?

The issues mentioned aren't things that will fundamentally break node, just things that need some thinking to come up with a decent solution. Keeping on with the nested folders approach however is causing real problems, regardless of what one might think of windows explorer or other applications that don't play nice with long paths.

vkurchatkin commented 10 years ago

@mstade what is require('module-name') supposed to do if there are both node_modules/module-name@version1 and node_modules/module-name@version2?

mstade commented 10 years ago

@vkurchatkin node would need a new look-up algorithm of course, presumably making use of package.json in order to figure out version information and such.

I doubt that it's impossible to come up with a new algorithm that works with a flat folder structure. The current algorithm is simple and intuitive, but the assumption that deep folder nesting is A-OK doesn't hold up on a major OS, so it might be worth spending a few brain cells figuring out a more fitting solution.

I don't think this is rocket surgery; but I also don't doubt it'd be a fair bit of work to implement, particularly in making sure existing projects don't get shafted.

bmeck commented 10 years ago

@mstade There are several issues at hand with a naive flattening of all modules with the same version into a single folder. The following is not a complete list, but these are real-world problems that would need to be addressed, as they could/would damage the experience on other OSes.

  1. Shared state expectations in modules are broken. See the following as a simple example:

    // Module-level state: shared by every consumer of this module instance.
    var uid = 0;
    module.exports = function next() {
      return ++uid;
    };

    If this module originally exists as 2 nested copies and is then converted into a single one (name@version), the next() function can no longer be guaranteed to increment by 1 from each consumer's point of view, since it may (very dangerous word) also be consumed by other modules.

  2. This does not fit the current require algorithm, which is very stable; changing it would have dramatic consequences. In particular, you would need to be sure to pin a specific version of each submodule rather than risk possible collisions.
  3. This does not work with how some plugin/nested modules work, where they require('../../..') to interact with a parent module, since the parent module has been flattened.
  4. Collisions become much more of an issue, since you can have multiple possible matches in a node_modules directory with different versions. Imagine require('project_assets/api.js'). Having this resolve to multiple things can introduce much complexity to a simple system.
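
The shared-state concern in item 1 can be sketched without touching the file system. In the snippet below, `loadCounterModule` is a hypothetical stand-in for loading one physical copy of the counter module; Node caches a module per resolved file, so two nested copies behave like two calls to this factory, while a flattened copy behaves like one.

```javascript
// Stand-in for loading the counter module: each call plays the role of one
// physical copy of the module on disk, with its own private `uid`.
function loadCounterModule() {
  let uid = 0;
  return function next() {
    return ++uid;
  };
}

// Nested layout: consumers A and B each get a private copy.
const nextForA = loadCounterModule();
const nextForB = loadCounterModule();
const nested = [nextForA(), nextForB(), nextForA()]; // A, B, A

// Flattened layout: A and B share one copy, so their calls interleave.
const sharedNext = loadCounterModule();
const flatForA = [];
flatForA.push(sharedNext()); // A calls
sharedNext();                // B calls in between
flatForA.push(sharedNext()); // A calls again and sees a gap

console.log({ nested, flatForA });
```

In the nested case A sees 1 then 2; in the flattened case A sees 1 then 3, because B's call consumed a value in between.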

mstade commented 10 years ago

1) will increment by one regardless, but you couldn't make the assumption that next won't be called from another module that shares the same dependency. Arguably, this code is not particularly good, and the current algorithm enables it, but I realize that's not a particularly good argument for flattening the dependencies – breaking existing code never helped anyone.

3) is again bad code making too many assumptions, just as 1). Of course, the current algorithm enables it, so that's unfortunate. Again, I realize pointing fingers at bad code and suggesting they suit themselves isn't a viable option, just stating some obvious things.

I'll assume 2) is a continuation of 1) and you're absolutely right – if this is a widespread pattern then it would have pretty dramatic consequences. I didn't know (and still don't, really) this was a common pattern. I don't use and haven't seen either 1) or 2) before; please forgive my ignorance.

I think 4) has a straightforward solution in using the package.json metadata.

The examples you bring up with code making too many assumptions (enabled by the current algorithm) are a good argument against flattening, since it would indeed introduce some serious complexity. My gut tells me this is a problem that can be overcome, but off hand I have no reasonable suggestions.

Thanks for the insights @bmeck, I learned something.

briandipalma commented 10 years ago

> As an idea: node_modules/module@version

I like this idea. It would be nice, if such a large change were ever to occur, to also rename node_modules to packages; it would nicely complement package.json and remove the word node, which makes certain developers believe npm is a tool for node development and nothing else.

mstade commented 10 years ago

+1 @briandipalma

trevnorris commented 10 years ago

I might be stating the obvious, but isn't this a discussion to be had on https://github.com/npm/npm ?

Mithgol commented 10 years ago

Implementing the above discussed changes would require some Node.js changes outside of npm as well; for example, require. (Might also affect node-gyp, I dunno. My comment https://github.com/joyent/node/issues/6960#issuecomment-41993270 is still unanswered.)

dariuszp commented 10 years ago

> Example: if somemodulename@1.2.3 and somemodulename@4.5.6 are both present, what should npm update somemodulename do?

Nothing. My exdi module as example:

npm install exdi -> install latest exdi in "exdi" directory, npm update will install "exdi"
npm install exdi@1.1 -> install exdi 1.1 in "exdi@1.1" directory and leave it alone

require is a problem, but we could follow the exact same convention: if you require "exdi", it will look for "exdi" and then for "exdi@*", and if you require "exdi@1.1", it will look for the "exdi@1.1" directory.

If there is a package.json with "exdi: 1.1" then require('exdi') will always require version 1.1.
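
The convention described above can be sketched as a small lookup function. This is only an illustration of the proposal, not how Node's require works: `resolveFlat`, `installed` (the directory names in a flat node_modules) and `pinned` (the versions from the requiring package's package.json) are all hypothetical, and scoped package names are ignored.

```javascript
// Sketch of the proposed flat lookup; every name here is hypothetical.
function resolveFlat(request, installed, pinned = {}) {
  // An explicit "name@version" request must match a directory exactly.
  if (request.includes('@')) {
    return installed.includes(request) ? request : null;
  }
  // A version pinned in package.json takes priority for a bare name.
  if (pinned[request]) {
    const wanted = `${request}@${pinned[request]}`;
    return installed.includes(wanted) ? wanted : null;
  }
  // Otherwise prefer the unversioned directory, then any "name@*" directory.
  if (installed.includes(request)) {
    return request;
  }
  return installed.find((dir) => dir.startsWith(`${request}@`)) || null;
}

const installed = ['exdi', 'exdi@1.1'];
const bare = resolveFlat('exdi', installed);
const exact = resolveFlat('exdi@1.1', installed);
const viaPin = resolveFlat('exdi', installed, { exdi: '1.1' });

console.log({ bare, exact, viaPin });
```

The sketch leaves open what should happen when a pinned version is not installed (it returns null here), which is one of the details a real design would have to settle.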

> In the same example, what should npm uninstall somemodulename do?

Again, same thing as above. This way we can have multiple versions of one package without needing a nested structure. When the parent directory is node_modules, require could look into the current directory for deps and not into a "node_modules" subdirectory.

khaledh commented 10 years ago

I'm new to node development on Windows, and this is basically a showstopper. The lodash packages are particularly problematic, and cause some installs to hang silently. The problem is exacerbated when trying to clean up the failed installation manually, which doesn't work because of the long path problem.

This is, IMO, a very bad experience for people starting to develop with node on Windows, which most likely will drive them away. If the node/npm teams think this is a problem in Windows itself, and they're not willing to make it work, fine, but please don't provide Windows installers and don't claim Windows support.

Mithgol commented 10 years ago

This all-or-nothing approach is no good. Of course it is essential to continue providing Windows installers for Node.js.

bmeck commented 10 years ago

@khaledh node/npm itself should work fine. Which part of the install hangs, and can you provide any logs (npm-debug or otherwise) for this? The only things that should fail are other Windows programs which do not support long paths, such as cmd.exe.

hyrmedia commented 10 years ago

I develop on Windows, and this is a huge problem for me right now.

mstade said:

> I doubt that it's impossible to come up with a new algorithm that works with a flat folder structure. The current algorithm is simple and intuitive, but the assumption that deep folder nesting is A-OK doesn't hold up on a major OS, so it might be worth spending a few brain cells figuring out a more fitting solution.
>
> I don't think this is rocket surgery; but I also don't doubt it'd be a fair bit of work to implement, particularly in making sure existing projects don't get shafted.

I couldn't agree more. I can understand concerns about taking the time to design & implement a solution, but from both an architecture and a usability perspective, this seems like it should be a top priority. For the end-developer, maintaining endlessly-nested levels of "node_modules" directories is vastly more difficult to support and maintain, on Windows or otherwise, than a simple flat list of "module@version" entries -- but the blatant brokenness of the situation on Windows is beyond frustrating.

Btw -- I know there are several issues to address, but specifically regarding the question about require() and which module version it should actually be requiring -- isn't that easily determined from the version referenced in each module's own package.json manifest?

edef1c commented 10 years ago

@hyrmedia The only field in package.json that node looks at is main. node's require algorithm is entirely unaware of module versions. It's also a frozen part of node core. As far as I can tell, nobody is particularly enthusiastic about changing the behaviour of npm or node entirely for Windows alone, a platform few of us use, even fewer of us like, and only a couple of us care about. node has done the correct thing by handling long paths correctly, and at this point it feels like we're being asked to compensate for all the software around us on Windows being crap.

isochronous commented 10 years ago

Could you please not use phrases like "a platform few of us use"? The last two jobs I've had have been exclusively Windows shops, and both of them have used node extensively. Don't make generalizations based only on your personal experience. I do like Windows, and care a lot about node compatibility on Windows, and I know many others aside from the multiple people represented in this thread that do too. I'm not sure exactly what you mean by "node has done the correct thing by handling long paths correctly," other than the fact that node supports long paths, which is not so much a node feature as it is a file system feature of the OS you're on. Since Windows continues to hold roughly 90% of the OS market, it seems like increasing node's compatibility with the platform can only benefit node in the long run. In addition, I feel like the module@version approach would have at least two major benefits on any OS compared to the current approach - eliminating duplication and reducing complexity.

notac commented 10 years ago

The "software around us on Windows" that is, rightly or not, being referred to as crap consists of core utilities. Developers on Windows would have to do a lot of extra work to avoid getting jammed up on this.

Yes, in the ideal world we wouldn't have to deal with this. But developers work in an environment that is dirty and messy, and they just need to get stuff done. Refusing to recognize reality is not a viable position. We are being asked to compensate for the good of the project, and we should.

I don't use Windows, but I know a lot of developers who do, and who would like to use node. Please, let's just fix this (or compensate/whatever). Unless we really just want to cut out what is still a majority of the potential market for node.