nodejs / node-gyp

Node.js native addon build tool
MIT License
9.91k stars 1.79k forks source link

[8.0.0] published npm package contains unused files (tests; repo config) #2372

Closed AviVahl closed 12 months ago

AviVahl commented 3 years ago

Published npm package contains repo configuration/scripts, "test" folder, etc. May want to use the "files" field in package.json and specify only the actually used files/folders.

$ tree node_modules/node-gyp/ -a
node_modules/node-gyp/
├── addon.gypi
├── bin
│   └── node-gyp.js
├── CHANGELOG.md
├── CONTRIBUTING.md
├── .github
│   ├── ISSUE_TEMPLATE.md
│   ├── PULL_REQUEST_TEMPLATE.md
│   └── workflows
│       └── tests.yml
├── gyp
│   ├── AUTHORS
│   ├── CHANGELOG.md
│   ├── CODE_OF_CONDUCT.md
│   ├── CONTRIBUTING.md
│   ├── data
│   │   └── win
│   │       └── large-pdb-shim.cc
│   ├── .flake8
│   ├── .github
│   │   └── workflows
│   │       ├── node-gyp.yml
│   │       ├── nodejs-windows.yml
│   │       ├── Python_tests.yml
│   │       └── release-please.yml
│   ├── gyp
│   ├── gyp.bat
│   ├── gyp_main.py
│   ├── LICENSE
│   ├── pylib
│   │   └── gyp
│   │       ├── common.py
│   │       ├── common_test.py
│   │       ├── easy_xml.py
│   │       ├── easy_xml_test.py
│   │       ├── flock_tool.py
│   │       ├── generator
│   │       │   ├── analyzer.py
│   │       │   ├── android.py
│   │       │   ├── cmake.py
│   │       │   ├── compile_commands_json.py
│   │       │   ├── dump_dependency_json.py
│   │       │   ├── eclipse.py
│   │       │   ├── gypd.py
│   │       │   ├── gypsh.py
│   │       │   ├── __init__.py
│   │       │   ├── make.py
│   │       │   ├── msvs.py
│   │       │   ├── msvs_test.py
│   │       │   ├── ninja.py
│   │       │   ├── ninja_test.py
│   │       │   ├── xcode.py
│   │       │   └── xcode_test.py
│   │       ├── __init__.py
│   │       ├── input.py
│   │       ├── input_test.py
│   │       ├── mac_tool.py
│   │       ├── msvs_emulation.py
│   │       ├── MSVSNew.py
│   │       ├── MSVSProject.py
│   │       ├── MSVSSettings.py
│   │       ├── MSVSSettings_test.py
│   │       ├── MSVSToolFile.py
│   │       ├── MSVSUserFile.py
│   │       ├── MSVSUtil.py
│   │       ├── MSVSVersion.py
│   │       ├── ninja_syntax.py
│   │       ├── simple_copy.py
│   │       ├── win_tool.py
│   │       ├── xcode_emulation.py
│   │       ├── xcode_ninja.py
│   │       ├── xcodeproj_file.py
│   │       └── xml_fix.py
│   ├── README.md
│   ├── requirements_dev.txt
│   ├── setup.py
│   ├── test_gyp.py
│   └── tools
│       ├── emacs
│       │   ├── gyp.el
│       │   ├── gyp-tests.el
│       │   ├── README
│       │   ├── run-unit-tests.sh
│       │   └── testdata
│       │       ├── media.gyp
│       │       └── media.gyp.fontified
│       ├── graphviz.py
│       ├── pretty_gyp.py
│       ├── pretty_sln.py
│       ├── pretty_vcproj.py
│       ├── README
│       └── Xcode
│           ├── README
│           └── Specifications
│               ├── gyp.pbfilespec
│               └── gyp.xclangspec
├── lib
│   ├── build.js
│   ├── clean.js
│   ├── configure.js
│   ├── find-node-directory.js
│   ├── find-python.js
│   ├── Find-VisualStudio.cs
│   ├── find-visualstudio.js
│   ├── install.js
│   ├── list.js
│   ├── node-gyp.js
│   ├── process-release.js
│   ├── rebuild.js
│   ├── remove.js
│   └── util.js
├── LICENSE
├── macOS_Catalina_acid_test.sh
├── macOS_Catalina.md
├── package.json
├── README.md
├── src
│   └── win_delay_load_hook.cc
├── test
│   ├── common.js
│   ├── fixtures
│   │   ├── ca-bundle.crt
│   │   ├── ca.crt
│   │   ├── server.crt
│   │   ├── server.key
│   │   ├── test-charmap.py
│   │   ├── VS_2017_BuildTools_minimal.txt
│   │   ├── VS_2017_Community_workload.txt
│   │   ├── VS_2017_Express.txt
│   │   ├── VS_2017_Unusable.txt
│   │   ├── VS_2019_BuildTools_minimal.txt
│   │   ├── VS_2019_Community_workload.txt
│   │   └── VS_2019_Preview.txt
│   ├── process-exec-sync.js
│   ├── simple-proxy.js
│   ├── test-addon.js
│   ├── test-configure-python.js
│   ├── test-download.js
│   ├── test-find-accessible-sync.js
│   ├── test-find-node-directory.js
│   ├── test-find-python.js
│   ├── test-find-visualstudio.js
│   ├── test-install.js
│   ├── test-options.js
│   └── test-process-release.js
└── update-gyp.py

20 directories, 126 files
cclauss commented 3 years ago

How much space would such a move save. I would be a bit worried about unintended consequences.

AviVahl commented 3 years ago

In terms of raw size in kb? or percent of the published package? Those depend on which files/folders should be included.

I can guess that test and .github should probably stay out. I'm not certain about folders like node-gyp/gyp/tools/emacs though.

AviVahl commented 3 years ago

oh, and tree ignored dot files/folders... I'll update the list in OP with -a.

DeeDeeG commented 3 years ago

How much space would such a move save

I was able to go from 2.1 MB with 146 files/folders, to 1.3 MB with 89 files/folders.

(There are still a lot of files which I'm not sure what they do, which I left alone. And that's not really tested for breakage yet, but it's about how much I expect we could save. Carefully deleting more gyp generators would yield the most file size savings, if possible.)


In addition to @AviVahl's suggestions, there are also some generators as part of gyp which we do not use in node-gyp at the moment. The gyp folder seems to be the heaviest part of the module. We could, for example, exclude the ninja generator, and a few others (android, cmake perhaps).

We could do this selectively with .gitignore (affects git tracking) or .npmignore (doesn't affect git tracking).

Here are the folder contents by size on disk:

Output of `ncdu`, a text-based disk usage analyzer (click to expand) The `node-gyp` v8.0.0 tarball, gotten via `npm pack node-gyp` and extracted: ``` 1.7 MiB [##########] /gyp 228.0 KiB [# ] /test 112.0 KiB [ ] /lib 68.0 KiB [ ] CHANGELOG.md 12.0 KiB [ ] README.md 12.0 KiB [ ] /.github 8.0 KiB [ ] macOS_Catalina.md 8.0 KiB [ ] addon.gypi 4.0 KiB [ ] /bin 4.0 KiB [ ] update-gyp.py 4.0 KiB [ ] CONTRIBUTING.md 4.0 KiB [ ] package.json 4.0 KiB [ ] LICENSE 4.0 KiB [ ] /src 4.0 KiB [ ] macOS_Catalina_acid_test.sh ``` The `gyp` subfolder: ``` 1.3 MiB [##########] /pylib 276.0 KiB [## ] /tools 16.0 KiB [ ] /.github 8.0 KiB [ ] test_gyp.py 8.0 KiB [ ] CHANGELOG.md 4.0 KiB [ ] LICENSE 4.0 KiB [ ] setup.py 4.0 KiB [ ] CONTRIBUTING.md 4.0 KiB [ ] gyp_main.py 4.0 KiB [ ] /data 4.0 KiB [ ] AUTHORS 4.0 KiB [ ] README.md 4.0 KiB [ ] gyp 4.0 KiB [ ] CODE_OF_CONDUCT.md 4.0 KiB [ ] gyp.bat 4.0 KiB [ ] .flake8 4.0 KiB [ ] requirements_dev.txt ``` `gyp/pylib/gyp` ``` 620.0 KiB [##########] /generator 136.0 KiB [## ] xcodeproj_file.py 128.0 KiB [## ] input.py 80.0 KiB [# ] xcode_emulation.py 76.0 KiB [# ] MSVSSettings_test.py 56.0 KiB [ ] msvs_emulation.py 48.0 KiB [ ] MSVSSettings.py 32.0 KiB [ ] mac_tool.py 24.0 KiB [ ] __init__.py 24.0 KiB [ ] common.py 20.0 KiB [ ] MSVSVersion.py 16.0 KiB [ ] win_tool.py 16.0 KiB [ ] MSVSNew.py 12.0 KiB [ ] xcode_ninja.py 12.0 KiB [ ] MSVSUtil.py 8.0 KiB [ ] MSVSProject.py 8.0 KiB [ ] ninja_syntax.py 8.0 KiB [ ] MSVSUserFile.py 8.0 KiB [ ] easy_xml.py 4.0 KiB [ ] easy_xml_test.py 4.0 KiB [ ] input_test.py 4.0 KiB [ ] xml_fix.py 4.0 KiB [ ] common_test.py 4.0 KiB [ ] flock_tool.py 4.0 KiB [ ] MSVSToolFile.py 4.0 KiB [ ] simple_copy.py ``` `gyp/pylib/gyp/generator` ``` 148.0 KiB [##########] msvs.py 116.0 KiB [####### ] ninja.py 100.0 KiB [###### ] make.py 68.0 KiB [#### ] xcode.py 52.0 KiB [### ] android.py 52.0 KiB [### ] cmake.py 32.0 KiB [## ] analyzer.py 20.0 KiB [# ] eclipse.py 8.0 KiB [ ] compile_commands_json.py 4.0 KiB [ ] gypd.py 4.0 KiB [ ] dump_dependency_json.py 4.0 KiB [ ] ninja_test.py 4.0 KiB [ ] gypsh.py 4.0 KiB [ ] msvs_test.py 4.0 KiB [ ] xcode_test.py 0.0 B [ ] __init__.py ``` `gyp/tools` (The `emacs` subfolder here seems to be something you would install into your local `emacs` text editor config somehow, so not really relevant to `node-gyp`): ``` 224.0 KiB [##########] /emacs 16.0 KiB [ ] /Xcode 12.0 KiB [ ] pretty_vcproj.py 8.0 KiB [ ] pretty_sln.py 8.0 KiB [ ] pretty_gyp.py 4.0 KiB [ ] graphviz.py 4.0 KiB [ ] README ```
david-golightly-leapyear commented 3 years ago

This is more than a space-saving move. An automated security audit tool found and flagged the presence of https://github.com/nodejs/node-gyp/blob/master/test/fixtures/server.key which, although probably harmless, creates noise for the auditor and makes for an awkward conversation. Is there an estimated time to remove these files from the build?

rvagg commented 3 years ago

Here's my take: Tests etc. represent part of the documentation and are components of the complete project which we occasionally snapshot and publish as a version to npm. On that basis, packages in npm should include stand-alone snapshots of a project. I know this is not a view held by everyone who publises to npm and some want to obsessively remove everything except that which touches the execution path--which is arguably reasonable in a world of trivial dependencies that bloat our development folders. I just think maybe the problem of needless dependency tree bloat should be tackled with greater priority. So I'm personally not in favour of removing things from a publish unless we're dealing with unreasonably large items. 2Mb is not unreasonable for us to be able to ship a complete snapshot of the package. If we have extraneous things in the repo then we should be able to add them to .gitignore.

Re server.key, if this is a critical problem for you then perhaps you could help out by contributing a change that shuffles it out of view of low quality auto-audits? Either generate it as required for tests, or maybe more practically, embed it as a string in a test file that can be extracted and removed on each run? Alternatively find a better auditor that isn't upset by test fixtures?