lirantal / lockfile-lint

Lint an npm or yarn lockfile to analyze and detect security issues
Apache License 2.0
776 stars 35 forks source link

Feat: Validate missing integrity fields #196

Open bozdoz opened 1 month ago

bozdoz commented 1 month ago

Is your feature request related to a problem? Please describe. Please describe the problem you are trying to solve.

I have lockfiles that occasionally get their integrity fields removed and npm installs the packages regardless (missing integrity is OK). Also, I see you have a test that succeeds when there is a missing integrity: https://github.com/lirantal/lockfile-lint/blob/main/packages/lockfile-lint-api/__tests__/validators.integrityHashType.test.js#L68

Describe the solution you'd like

I want to be able to catch these issues and fix them, to guarantee that the packages we download are the same; this is in-line with this open npm-cli issue: https://github.com/npm/cli/issues/4460#issuecomment-1050025391

Describe alternatives you've considered

I would just write my own script to basically do the same parsing as this library :)

lirantal commented 3 weeks ago

The integrity validation checks that the integrity hash is of specific type (sha512) and not whether it exists or not:

--validate-integrity, -i - validates the integrity field is a sha512 hash

We could perhaps add a new CLI flag like --validate-integrity-strict which will require that all deps have an integrity field and fail if not. It's worth to note that not all dependency sources have integrity attached. For example, a Git-sourced dependency doesn't:

"metalsmith-permalinks": {
      "version": "github:XhmikosR/metalsmith-permalinks#432843d5823a292b2e47397ba46fd761d03eb9d3",
      "from": "github:XhmikosR/metalsmith-permalinks#master",
      "requires": {
        "debug": "^4.1.1",
        "moment": "^2.24.0",
        "slugify": "^1.3.5",
        "substitute": "https://github.com/segment-boneyard/substitute/archive/0.1.0.tar.gz"
      },
      "dependencies": {
        "substitute": {
          "version": "https://github.com/segment-boneyard/substitute/archive/0.1.0.tar.gz",
          "integrity": "sha512-2ccHCDdgHt0ZrXAFM/7G/3zEwMhsUfOKfwzSAv2vqO2JQcPpNAlp10e8F5uMa6zpGFYRrdy7BGLvBZsthHrLag=="
        }
      }
    },

or another example:

{
  "name": "babas",
  "version": "1.0.0",
  "lockfileVersion": 1,
  "requires": true,
  "dependencies": {
    "debug": {
      "version": "4.1.1",
      "resolved": "https://registry.npmjs.org/debug/-/debug-4.1.1.tgz",
      "integrity": "sha512-pYAIzeRo8J6KPEaJ0VWOh5Pzkbw/RetuzehGM7QRRX5he4fPHx2rdKMB256ehJCkX+XRQm16eZLqLNS8RSZXZw==",
      "requires": {
        "ms": "^2.1.1"
      }
    },
    "ms": {
      "version": "git+https://github.com/zeit/ms.git#adf1eb282d29fe3c405d205a3854177b86a97c1f",
      "from": "git+https://github.com/zeit/ms.git#master"
    }
  }
}

Do you still see value in adding a strict integrity field? If so, how would you validate it? simply that it exists doesn't provide any extra trust because it can have some dummy value.

bozdoz commented 3 weeks ago

It looks like integrity should be included for all dependencies other than "bundled dependencies": https://docs.npmjs.com/cli/v6/configuring-npm/package-lock-json#integrity

v10 says similar:

integrity: A sha512 or sha1 Standard Subresource Integrity string for the artifact that was unpacked in this location. For git dependencies, this is the commit sha.

https://docs.npmjs.com/cli/v10/configuring-npm/package-lock-json

Monorepos/workspaces might be the outstanding question mark for me, but I think this can be resolved by checking for:

  1. "node_modules" in name, should have integrity
  2. "linked": true, should not have integrity
  3. "version" starts with "file:", should not have integrity

Testing a monorepo, those are the three indicators that I saw:

3rd party (node_modules, and not linked, and version doesn't start with "file:"):

"node_modules/typescript": {
      "version": "5.4.5",
      "resolved": "https://registry.npmjs.org/typescript/-/typescript-5.4.5.tgz",
      "integrity": "sha512-vcI4UpRgg81oIRUFwR0WSIHKt11nJ7SAVlYNIu+QpqeyXP+gpQJy/Z4+F0aGxSE4MqwjyXvW/TzgkLAx2AGHwQ==",
      "bin": {
        "tsc": "bin/tsc",
        "tsserver": "bin/tsserver"
      },
      "engines": {
        "node": ">=14.17"
      }
    }

mono repo package (v3) (has link: true):

"node_modules/@my-scope/eslint-config": {
            "resolved": "packages/components/EslintConfig",
            "link": true
        },

duplicate entry (v3) (doesn't have "node_modules"):

"packages/components/EslintConfig": {
            "name": "@my-scope/eslint-config",
            "version": "7.3.0"
}

mono repo (v2) (version starts with "file:"):

"@my-scope/eslint-config": {
            "version": "file:packages/components/EslintConfig"
}

Also, if you wanted the workflow to remove the integrity and resolved fields:

mkdir test-lockfile
cd test-lockfile
npm init -y
npm i typescript
rm package-lock.json && rm node_modules/.package-lock.json
npm i

you should get a lockfile like this:

{
  "name": "test-lockfile",
  "version": "1.0.0",
  "lockfileVersion": 3,
  "requires": true,
  "packages": {
    "": {
      "name": "test-lockfile",
      "version": "1.0.0",
      "license": "ISC",
      "dependencies": {
        "typescript": "^5.4.5"
      }
    },
    "node_modules/typescript": {
      "version": "5.4.5",
      "license": "Apache-2.0",
      "bin": {
        "tsc": "bin/tsc",
        "tsserver": "bin/tsserver"
      },
      "engines": {
        "node": ">=14.17"
      }
    }
  }
}

I think we've seen this happen accidentally a few times, probably due to moving from lockfile v2 to v3.

lirantal commented 3 weeks ago

The point is that there are cases in which packages that get installed don't have an integrity field and if we have a strict check for it then it's going to always fail. One example I gave before is when you install packages from git.

For example if you installed the ms package from the github repository directly like this:

npm install --save https://github.com/vercel/ms

And you inspect the lockfile you'll see that it doesn't have an integrity field:

image

I'm not saying this is a blocker. We can choose to ignore git repos for the strict integrity check. What I'm saying is that we need to map out all of these cases. Do you want to check what they are and add here?