duniul / clean-modules

🧹 Clean up/prune unnecessary files and reduce the size of your node_modules directory.
ISC License

Autoreport detected issues to offending repositories #34

Open mcmxcdev opened 7 months ago

mcmxcdev commented 7 months ago

First of all, this is an amazing tool! It broke my SvelteKit application because some dev server commands depend on *.md files, but it did what it's supposed to (removed 100+ MB) and otherwise worked without any issues.

I generated a clean-modules-result.json and would probably start reporting issues to GitHub repos, but then had the idea that there must be some way to do this automatically, at least for findings that, with 100% certainty, shouldn't be in node_modules, e.g. coverage or test files.

Did you consider this yet?

duniul commented 7 months ago

Glad you like the tool!

I'm not sure I understand your question correctly. Do you have an example of how it would be used? It's not really possible for clean-modules to determine whether removing a file breaks the dependency 💭

mcmxcdev commented 7 months ago

Sorry for the misunderstanding. The first paragraph was solely about my great experience with the tool, not a request in any shape or form!

My question was more about a specific piece of functionality: clean-modules-result.json gives back a useful list of issues with repos, but it's manual effort to report these issues to repo maintainers (which I did for some yesterday).

I would imagine that clean-modules could integrate with the GitHub API to create draft tickets in offending repos (which should be selectable/filterable from a list based on clean-modules-result.json) and prepare a text template. This way, users could easily help with cleaning up the JS ecosystem. It's probably a lot of effort to make this work well and I'm not sure it's realistic, but I just wanted to put my thoughts down here.
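To make the "prepare a text template" part concrete, here is a minimal sketch of how an issue text could be assembled per package. The `Finding` shape and the `buildIssueText` helper are hypothetical and not part of clean-modules; the wording of the template is only a placeholder.

```typescript
// Hypothetical shape: one entry per file that would be removed.
interface Finding {
  packageName: string;
  filePath: string; // path relative to the package root
}

// Build a plain-text issue body for a single package from its findings.
function buildIssueText(packageName: string, findings: Finding[]): string {
  const files = findings
    .filter((f) => f.packageName === packageName)
    .map((f) => `- ${f.filePath}`)
    .sort();

  return [
    `The published ${packageName} package contains files that are not needed at runtime:`,
    "",
    ...files,
    "",
    'Consider excluding them via the "files" field in package.json or an .npmignore file.',
  ].join("\n");
}

// Example data (made up for illustration).
const findings: Finding[] = [
  { packageName: "some-package", filePath: "test/index.test.js" },
  { packageName: "some-package", filePath: "coverage/lcov.info" },
  { packageName: "some-other-package", filePath: "CHANGELOG.md" },
];

console.log(buildIssueText("some-package", findings));
```

The text is only generated, not submitted anywhere, so the user stays in control of what actually gets reported.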

sdavids commented 1 month ago

The idea sounds cool ... but:

"create draft tickets in offending repos"

GitHub does not have a draft ticket concept; only draft PRs.

https://docs.github.com/en/rest/issues/issues?apiVersion=2022-11-28#create-an-issue

Not every npm package uses GitHub.


Blindly opening issues would lead to spam and I would wager that a lot of repo owners do not want spam/duplicate issues.


One possibility would be to just list the bugs field of each affected package's package.json.

Something akin to jq -r .bugs.url package.json

Either with a new command or as an additional (optional) field in clean-modules analyze.

[
  {
    "filePath": "/private/tmp/v/node_modules/abort-controller/LICENSE",
    "bugUrl": "https://github.com/mysticatea/abort-controller/issues",
    "includedByDefault": true,
    "includedByGlobs": [
      {
        "original": "licen@(s|c)e",
        "derived": "/private/tmp/v/node_modules/**/licen@(s|c)e"
      }
    ]
  },
...
  {
    "filePath": "/private/tmp/v/node_modules/no-bugs-set/LICENSE",
    "bugUrl": null,
    "includedByDefault": true,
    "includedByGlobs": [
      {
        "original": "licen@(s|c)e",
        "derived": "/private/tmp/v/node_modules/**/licen@(s|c)e"
      }
    ]
  },
...
$ clean-modules bug-urls --only-urls
https://github.com/some-owner/some-repo/issues
https://gitlab.com/some-owner/some-repo/-/issues
$ clean-modules bug-urls
https://github.com/some-owner/some-repo/issues

Something to paste into the issue template

--

https://gitlab.com/some-owner/some-repo/-/issues

Something to paste into the issue template
$ clean-modules bug-urls --json
[
  {
    "packageName": "some-package",
    "bugsUrl": "https://github.com/some-owner/some-repo/issues",
    "bugText": "Something to paste into the issue template"
  },
  {
    "packageName": "some-other-package",
    "bugsUrl": "https://gitlab.com/some-owner/some-repo/-/issues",
    "bugText": "Something to paste into the issue template"
  }
]

If one is using a good terminal application, one can click on the URLs in the output.
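As a rough sketch of the lookup behind such a field or command: per the npm documentation, bugs in package.json may be either a plain URL string or an object with optional url and email keys, so both forms need handling. The `getBugsUrl` and `listBugUrls` helpers below are hypothetical names, not existing clean-modules functions.

```typescript
// Subset of package.json relevant here. "bugs" may be a URL string
// or an object with optional "url" and "email" keys.
interface PackageJson {
  name?: string;
  bugs?: string | { url?: string; email?: string };
}

// Return the issue tracker URL, or null when the package declares none
// (this includes packages that only list a bugs email address).
function getBugsUrl(pkg: PackageJson): string | null {
  if (typeof pkg.bugs === "string") return pkg.bugs;
  return pkg.bugs?.url ?? null;
}

// Hypothetical helper: unique, sorted tracker URLs for a set of packages.
function listBugUrls(pkgs: PackageJson[]): string[] {
  const urls = pkgs
    .map(getBugsUrl)
    .filter((u): u is string => u !== null);
  return urls.filter((u, i) => urls.indexOf(u) === i).sort();
}

const packages: PackageJson[] = [
  {
    name: "abort-controller",
    bugs: { url: "https://github.com/mysticatea/abort-controller/issues" },
  },
  { name: "no-bugs-set" },
];

console.log(listBugUrls(packages));
```

Packages without a tracker URL map to null here, matching the "bugUrl": null entry in the example output above.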

mcmxcdev commented 1 month ago

Yeah, my idea definitely wasn't well-thought-out!

Including the bugs URL by default in the analyze output sounds reasonable to me.

duniul commented 1 month ago

How about changing the output of analyze to be grouped by package by default?

{
  "node_modules/.pnpm/chai@5.1.1/node_modules/chai": {
    "package": {
      "name": "chai",
      "version": "5.1.1",
      "repository": {
        "type": "git",
        "url": "https://github.com/chaijs/chai"
      },
      "homepage": "http://chaijs.com",
      "bugs": {
        "url": "https://github.com/chaijs/chai/issues"
      }
    },
    "files": [
      {
        "filePath": "/Users/daniel/Projects/private/clean-modules/node_modules/.pnpm/chai@5.1.1/node_modules/chai/CODE_OF_CONDUCT.md",
        "includedByDefault": true,
        "includedByGlobs": [
          {
            "original": "*.@(md|mkd|markdown|mdown)",
            "derived": "/Users/daniel/Projects/private/clean-modules/node_modules/**/*.@(md|mkd|markdown|mdown)"
          }
        ]
      }
    ]
  },
  "node_modules/.pnpm/check-error@2.1.1/node_modules/check-error": {
    "package": {
      "name": "check-error",
      "version": "2.1.1",
      "repository": {
        "type": "git",
        "url": "git+ssh://git@github.com/chaijs/check-error.git"
      }
    },
    "files": [
      {
        "filePath": "/Users/daniel/Projects/private/clean-modules/node_modules/.pnpm/check-error@2.1.1/node_modules/check-error/LICENSE",
        "includedByDefault": true,
        "includedByGlobs": [
          {
            "original": "licen@(s|c)e",
            "derived": "/Users/daniel/Projects/private/clean-modules/node_modules/**/licen@(s|c)e"
          }
        ]
      }
    ]
  }
}

That would avoid repeating the bugs URL for each file and make the output a bit easier to scan.

The flat file output could still be kept around behind a --flat flag or similar.
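A minimal sketch of how the flat list could be regrouped, assuming POSIX-style paths and that a file belongs to the package named right after the last node_modules/ segment in its path. `packageDir` and `groupByPackage` are hypothetical names, and the heuristic would misattribute files in fixture directories that themselves contain a node_modules folder.

```typescript
// Flat entry, reduced to the one field this sketch needs.
interface FlatEntry {
  filePath: string;
}

// Derive the package directory: everything up to and including the package
// name after the last "node_modules/". Scoped packages (@scope/name) span
// two path segments.
function packageDir(filePath: string): string | null {
  const marker = "node_modules/";
  const idx = filePath.lastIndexOf(marker);
  if (idx === -1) return null;
  const start = idx + marker.length;
  const rest = filePath.slice(start).split("/");
  const nameSegments = rest[0].charAt(0) === "@" ? 2 : 1;
  // The file itself must sit below the package directory.
  if (rest.length <= nameSegments) return null;
  return filePath.slice(0, start) + rest.slice(0, nameSegments).join("/");
}

// Group flat entries under their package directory.
function groupByPackage(entries: FlatEntry[]): Record<string, FlatEntry[]> {
  const grouped: Record<string, FlatEntry[]> = {};
  for (const entry of entries) {
    const dir = packageDir(entry.filePath) ?? "(unknown)";
    (grouped[dir] = grouped[dir] || []).push(entry);
  }
  return grouped;
}

const flat: FlatEntry[] = [
  { filePath: "/private/tmp/v/node_modules/abort-controller/LICENSE" },
  { filePath: "/private/tmp/v/node_modules/abort-controller/README.md" },
  { filePath: "/private/tmp/v/node_modules/no-bugs-set/LICENSE" },
];

console.log(JSON.stringify(groupByPackage(flat), null, 2));
```

Package metadata such as name, version and bugs could then be read once per group from the package.json in each package directory.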

sdavids commented 1 month ago

Sounds good.

That would be a breaking API change though…

duniul commented 1 month ago

Yeah, I guess doing it the other way around, keeping the flat output as the default and adding a --package or --group-by=package flag (or similar), would be better to avoid a breaking change.