dominictarr / JSONStream


npmignore #75

Closed TrySound closed 9 years ago

TrySound commented 9 years ago

Please add the tests to the npm ignore, for a smaller package.
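For reference, the usual way to do what this issue asks is either an .npmignore file (which uses the same syntax as .gitignore) or a files whitelist in package.json. A minimal sketch, assuming the tests live under a test/ directory:

```
# .npmignore -- paths matched here are left out of the published tarball
test/
```

Alternatively, a "files" array in package.json includes only what is listed, which is often safer than maintaining an ignore list.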

dominictarr commented 9 years ago

i like having the tests in the package. it doesn't really matter.

TrySound commented 9 years ago

@dominictarr But 380kb is too much for tests.

dominictarr commented 9 years ago

but why is 380k a problem? it's 2015!

TrySound commented 9 years ago

But I still often have slow internet.

TrySound commented 9 years ago

http://www.rudeshko.com/web/2014/05/13/help-people-consume-your-npm-packages.html

dominictarr commented 9 years ago

I feel your pain. I am from New Zealand, so I know what it's like to have slow internet. And I've travelled and experienced bad wifi and 3g all over the world, from Greenland to Morocco to Cambodia.

But, the size of a module does not have that much effect on the speed of the install. There are several reasons for this:

  1. a module like JSONStream does not change very often, so npm caches it and you won't download it again unless it updates (if you have npm logging enabled and you see a 304, this is what happened)
  2. for modules that have had a lot of development (browserify is a good example), the registry data is likely to be larger than the actual code. for example:
curl -s registry.npmjs.com/browserify | wc -c
837186

nearly a megabyte!!! but the actual code for the latest version is over 13x smaller!!!

curl -s http://registry.npmjs.org/browserify/-/browserify-11.0.1.tgz | wc -c
62775

it would be much better if each version of a package was its own document, but this was a design decision made years ago, before there were even 1000 modules in the npm registry, and it seemed reasonable at the time.

  3. something like browserify takes a long time to install because it does a lot of round trips, rather than because of the size of the files. Even if a file isn't downloaded, a request is still made to the npm registry to check whether the version in your cache is still up to date. If you npm install regularly, it's likely everything is already up to date, so those checks waste most of the time.
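As a back-of-the-envelope sketch of this point (with made-up latency and bandwidth numbers, not npm measurements), per-dependency round trips dominate install time once a package has many deps:

```python
# Toy model: why round trips, not tarball size, dominate install time.
# RTT and BANDWIDTH are illustrative assumptions, not measurements.

RTT = 0.2          # assumed round-trip latency to the registry, seconds
BANDWIDTH = 1e6    # assumed download speed, bytes/second

def sequential_install(dep_sizes):
    """One metadata request per dependency (roughly npm's model)."""
    metadata_time = len(dep_sizes) * RTT        # one round trip per dep
    transfer_time = sum(dep_sizes) / BANDWIDTH  # the tarballs themselves
    return metadata_time + transfer_time

def batched_install(dep_sizes):
    """Hypothetical: resolve the whole tree in a single request."""
    return RTT + sum(dep_sizes) / BANDWIDTH

deps = [50_000] * 100  # 100 deps of ~50kb each, a browserify-scale install
print(sequential_install(deps))  # metadata round trips are most of the time
print(batched_install(deps))     # the same bytes, one round trip
```

With these (invented) numbers the sequential model lands at about 25 seconds, close to the 26-second install described above, and almost all of it is latency rather than bandwidth.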

I just installed browserify and it took 26 seconds. It doesn't need to take this long! I am in Berlin currently, with fairly good internet. If I was somewhere more remote this could take minutes.

Really, the slow installs are not because 1 module is large, but because modules have lots of deps. This is only a problem because of the way npm is designed - it makes a request to a central server each time it needs to know whether its deps are up to date.

You could make installs very fast if you replicated the registry metadata (i.e. which packages a package depends on) locally. Then you'd only need to talk to a local database.
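The idea can be sketched in a few lines: once the metadata is replicated locally, the whole dependency tree resolves with zero network requests, and only the tarballs still need fetching. The store contents below are invented for illustration:

```python
# Sketch: resolving a dependency tree against a locally replicated
# metadata store -- no network request per package. The package names
# and deps here are invented, not real registry data.

local_metadata = {
    "browserify": ["module-deps", "syntax-error"],
    "module-deps": ["detective"],
    "syntax-error": [],
    "detective": [],
}

def resolve(name, seen=None):
    """Walk the dep tree using only the local replica."""
    seen = set() if seen is None else seen
    if name in seen:
        return seen
    seen.add(name)
    for dep in local_metadata[name]:
        resolve(dep, seen)
    return seen

print(sorted(resolve("browserify")))
# -> ['browserify', 'detective', 'module-deps', 'syntax-error']
# only the tarballs for these names would then need downloading
```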

I started on a project to do this over here: https://github.com/dominictarr/npmd The replication is disabled right now, but I might get back to it sooner or later.

Another possibility would be for @npm to redesign the registry so that you make one request with the deps you want, it resolves the entire tree and sends it back in one go, and then you can request any tarballs you need immediately. This would also be much faster!
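Such a one-shot resolve endpoint does not exist in npm; as a sketch of the suggested design, with invented field names, versions, and URL shapes, the server side might flatten the tree in one pass and return everything the client needs:

```python
# Hypothetical one-shot resolve endpoint: the client sends the deps it
# wants, the server returns the fully resolved tree plus tarball URLs
# in a single response. All names, versions, and fields are invented.

REGISTRY = {  # server-side metadata store (illustrative only)
    "jsonstream": {"version": "1.0.0", "deps": ["through"]},
    "through": {"version": "2.3.8", "deps": []},
}

def resolve_all(wanted):
    """Server side: flatten the whole dependency tree in one pass."""
    tree, queue = {}, list(wanted)
    while queue:
        name = queue.pop()
        if name in tree:
            continue
        meta = REGISTRY[name]
        tree[name] = {
            "version": meta["version"],
            "tarball": f"/{name}/-/{name}-{meta['version']}.tgz",
        }
        queue.extend(meta["deps"])
    return tree

response = resolve_all(["jsonstream"])
print(sorted(response))  # -> ['jsonstream', 'through']
# the client can now fetch every tarball immediately, in parallel
```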

TrySound commented 9 years ago

@dominictarr Thanks for the answer!

dominictarr commented 9 years ago

no problem!

dominictarr commented 9 years ago

comment from @ceejbot https://twitter.com/ceejbot/status/629796144466432000 who is an engineer at npm, and is aware of this design possibility!

ceejbot commented 9 years ago

Please don't expect progress on this any time soon! I want to get there, however.

reqshark commented 9 years ago

well monolithic shit is almost definitionally slow

btw, that proposal is near the mark. @dominictarr the last part of what you wrote is the answer IMO

to prove it we can just look at how fast npm publish is.

Clients limited by severely disproportionate upload caps are always blazing fast.

dominictarr commented 9 years ago

@reqshark you can't compare install to publish, they are totally different. Publish is slow for me because I use a prepublish script, npm ls && npm test, but it's worth it because then I am less likely to publish broken code.
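For concreteness, a prepublish hook like the one described lives in the scripts section of package.json; a minimal sketch:

```json
{
  "scripts": {
    "prepublish": "npm ls && npm test"
  }
}
```

npm runs this script automatically before publishing, so a failing dep tree or test suite aborts the publish.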

reqshark commented 9 years ago

@dominictarr interesting, but prepublish seems like overkill, since whatever goes into the working copy and up to the remote repo should pass npm test beforehand... eh, assuming things get committed and pushed to a repo first before publishing to npm

anyway that's not what we were talking about at all! Sorry, in fact we can and we should compare install to publish! I think the description of the problem is a good premise to go on:

the slow installs are not because 1 module is large, but because modules have lots of deps. This is only a problem because of the way npm is designed - it makes a request to a central server each time it needs to know whether its deps are up to date.

If that's true, then there could be too many small network connections slowing things down.. obviously that entails a lot of upfront latency, and for what? All that TCP socket handshaking between connect() and accept(), for what? A few bytes of data to learn the version? That stuff seems well known to npm; everything on their org site lists all the link trees anyway, so a simple message could be sent/stored across the tree as things are published up to the registry.

That's why I think the last part is the answer! It's all the intermediary TCP back and forth that slows it down. Do it in one shot and it's way faster. Try doing some npm publish without the prepublish and tell me I'm wrong :smile: