npms-io / npms-analyzer

The analyzer behind https://npms.io
MIT License

"Tarball has too many files" error causing search results to not update #215

Open jefflembeck opened 6 years ago

jefflembeck commented 6 years ago

If a tarball has too many files, the analyzer fails with a "Tarball has too many files" error and the package's data is never updated.

See: https://github.com/Microsoft/Typescript

(50537 files)

This is, unfortunately, preventing TypeScript from updating in search results:

see: https://npms.io/search?q=typescript or: https://www.npmjs.com/search?q=typescript

vs. https://www.npmjs.com/package/typescript

satazor commented 6 years ago

This is a protection we added to the system. Some packages were saturating the I/O because they had a lot of files in them. I'm not sure what we could do here, //cc @bcoe

kgryte commented 6 years ago

@satazor Saturating in what sense? Too many open file descriptors? Too CPU intensive? Too expensive to analyze?

satazor commented 6 years ago

Too many file descriptors, as well as too many filesystem writes/entries. For instance, it’s usually faster to write a single large file than multiple small files that total the same size.

What happens is that the I/O isn’t fast enough and it causes the whole system to lag.

satazor commented 6 years ago

Moreover, a tarball could contain an almost unlimited number of small files. This would be an attack vector, because a well-crafted tarball could exhaust the filesystem's inodes. We can revisit the threshold for the maximum number of files, but it was already quite generous.
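For illustration only, here is a minimal sketch of this kind of guard: it counts a tarball's entries by streaming the headers and stops before anything is extracted to disk. It is written in Python with the standard `tarfile` module and is not the analyzer's actual implementation. The names `count_entries`, `TooManyFilesError`, `make_tarball`, and `MAX_FILES` are hypothetical.

```python
import io
import tarfile

# Assumed limit for this sketch; not taken from the analyzer's code.
MAX_FILES = 32000


class TooManyFilesError(Exception):
    """Raised when a tarball holds more entries than the allowed limit."""


def count_entries(fileobj, max_files=MAX_FILES):
    """Count tarball entries by reading headers only, without extracting.

    Raises TooManyFilesError as soon as the count exceeds max_files, so a
    hostile archive is rejected before any file or inode is created.
    """
    count = 0
    # "r|*" opens the archive as a forward-only stream and detects the
    # compression (gzip, bz2, xz, or none) transparently.
    with tarfile.open(fileobj=fileobj, mode="r|*") as archive:
        for _member in archive:
            count += 1
            if count > max_files:
                raise TooManyFilesError(
                    f"Tarball has too many files (limit: {max_files})"
                )
    return count


def make_tarball(n_files):
    """Build a small in-memory .tar.gz with n_files one-byte files (demo helper)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for i in range(n_files):
            data = b"x"
            info = tarfile.TarInfo(name=f"package/file{i}.txt")
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(count_entries(make_tarball(5), max_files=10))
    try:
        count_entries(make_tarball(11), max_files=10)
    except TooManyFilesError as err:
        print(err)
```

Counting while streaming keeps the check cheap: only headers are inspected, so the cost of rejecting an oversized tarball does not depend on creating any filesystem entries.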

satazor commented 6 years ago

The current limit on the total number of files is 32000. We could increase it; what value do you think would be reasonable?