dominictarr / through

simple way to create a ReadableWritable stream that works

Whitelist package files #44

Closed dguettler closed 8 years ago

dguettler commented 8 years ago

As part of ember-cli/ember-cli#6149, this PR aims to reduce the number of files downloaded for this package.
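(The PR diff isn't shown here, but the standard mechanism for this is a `files` whitelist in package.json; a minimal sketch, with the entries below assumed rather than copied from the actual PR:)

```json
{
  "name": "through",
  "files": [
    "index.js"
  ]
}
```

npm always includes package.json, the README, and license files regardless of the whitelist, so typically only the code entry points need listing; the published result can be checked by running `npm pack` and listing the resulting tarball's contents.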

dominictarr commented 8 years ago

Sorry, as a policy I just include everything. 13 files is not a big deal.

Turbo87 commented 8 years ago

@dominictarr any reasons for including everything? once you have a larger application this is starting to become a problem if everybody does it...

stefanpenner commented 8 years ago

@dominictarr I appreciate that your motivations & goals here are different than ours, so thank you for reviewing and confirming that all the published files are intended to be published. We just wanted to make sure these files were not included by mistake, as they do add up, resulting in a "death by a thousand papercuts" style scenario.

Anyways, thank you for maintaining this module!

dominictarr commented 8 years ago

Occasionally I get pull requests for this sort of thing, usually because people want to improve install times. https://github.com/ember-cli/ember-cli/issues/6147#issuecomment-237439212

Do you have data to show it's really download times?

I too wish that npm install could be faster, but I do not agree that this is a very good way to go about it. npm install could be (I reckon) at least an order of magnitude faster, but we need to convince npm that install performance is something their users care about.

One problem is that npm still uses https://www.npmjs.com/package/tar, but https://www.npmjs.com/package/tar-stream is actually much faster. The other problem is round trips: npm does a lot of cache revalidation, each instance of which costs a whole round trip; this is often unnecessary, and a better design could avoid it.
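For reference, tar-stream exposes a streaming extract interface; a minimal sketch of draining a gzipped tarball with it (the file name is hypothetical, and error handling is omitted):

```js
var fs = require('fs')
var zlib = require('zlib')
var tar = require('tar-stream')

var extract = tar.extract()

extract.on('entry', function (header, stream, next) {
  // header.name, header.size etc. describe the entry; a real unpacker
  // would write `stream` to disk instead of just draining it.
  stream.on('end', next)
  stream.resume() // drain this entry so the next one can be emitted
})

fs.createReadStream('package.tgz') // hypothetical tarball path
  .pipe(zlib.createGunzip())
  .pipe(extract)
```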

I'm also sensitive to the problem of npm install time and resource use (I do most of my coding solar-powered, over 3G, because I live on a sailboat in New Zealand, which of course has pretty bad latency), but I don't think the solution is to adopt a culture that inhibits publishing. Publishing friendliness is the best thing about npm; we just need to get npm to take install performance seriously.

stefanpenner commented 8 years ago

> Do you have data to show it's really download times?

You are mistaken about our motivation: we noticed that many of our dependencies contained a very large number of accidental files (tmp directories, tests, benchmarks, etc.). Yes, this may affect download time, but the sheer weight of several large directory trees causes other issues as well. For instance, merely creating the directory tree is quite costly; FS watching on some platforms (OS X) is recursive (not per-file or per-level), which compounds other watcher-related issues; `rm -rf` takes a long time; and so on. This problem tends to get worse on slower platforms, network mounts, etc.
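To illustrate the watcher point with a hedged sketch: on OS X (and Windows) Node's `fs.watch` accepts `{ recursive: true }`, so a single watcher receives an event for every file under the tree, and every accidental published file adds to that load (the path below is hypothetical):

```js
var fs = require('fs')

// One recursive watcher over the whole tree (supported on OS X / Windows);
// every extra file in node_modules means extra events to process.
fs.watch('node_modules', { recursive: true }, function (event, filename) {
  console.log(event, filename)
})

// On platforms without recursive support (e.g. Linux), tools fall back to
// one watcher per directory, so extra directories cost extra watchers.
```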

In addition, working in constrained environments such as Nitrous.IO or Cloud9 becomes painful.

Ultimately it compounds with other things, and sadness prevails.

Needless to say, we are merely doing some spring cleaning, and have shed thousands of files in the process.

I suspect through's file count isn't a big deal; rather, as mentioned before, we are merely double-checking with all of our dependencies.


I also agree that the npm performance issues can be resolved, and agree with your estimate that an order of magnitude improvement is possible. I would venture to say even more is possible, as there are many zero-copy style strategies that would make installs/reinstalls and the local cache work much better together.
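As a hedged sketch of one such zero-copy strategy (hardlinking installs out of a content-addressed cache instead of copying bytes; the cache layout and function below are hypothetical, not npm's actual design):

```js
var fs = require('fs')
var path = require('path')

// Hypothetical content-addressed cache: one file per content hash.
// A hardlink install copies no file data, so reinstalls are nearly free.
function linkFromCache (cacheDir, hash, dest) {
  fs.mkdirSync(path.dirname(dest), { recursive: true }) // requires Node >= 10.12
  fs.linkSync(path.join(cacheDir, hash), dest)
}

// e.g. linkFromCache('/home/me/.pkg-cache', 'abc123', 'node_modules/through/index.js')
```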

Our goal is to tackle this from both sides: reduce the number of extra/accidental files (which has already resulted in thousands fewer files), reduce the number of dependencies, and explore working with npm to help improve the current state.