Open andrewjschuang opened 3 months ago
I reported this issue via support, however it is labeled as a feature request here. I believe it unfortunately needs to be categorized as a major security issue in the Pipedream ecosystem.
Especially given how short-lived and sporadic some workflow executions are, and with no control over whether the code has network access, it would be close to impossible to determine after the fact whether an attack has happened and what data was affected.
I'd love a statement on https://pipedream.com/docs/privacy-and-security about the current design and what it means for data compliance. It seems to me that steps/workflows have a very obvious supply chain issue that is opaque to the user: as a user of Pipedream, I can't see which dependency graph or versions are executed at all, let alone control them.
Controlling the complete dependency graph of an npm dependency closure is crucial because it ensures security, stability, and reliability of the software by preventing malicious code, minimizing vulnerabilities, and avoiding unexpected disruptions due to changes or removals in dependencies.
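To make the "no version" problem concrete, here is a minimal sketch that flags dependency specs which are not pinned to a single exact version. The helper name `findUnpinned` and the sample dependency map are hypothetical and only for illustration.

```javascript
// Hypothetical helper: reports dependency specs that are not one exact version.
// Matches plain semver such as "1.3.0" or "1.0.0-beta.1"; ranges and tags fail.
const EXACT_VERSION = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/;

function findUnpinned(dependencies) {
  return Object.entries(dependencies)
    .filter(([, spec]) => !EXACT_VERSION.test(spec))
    .map(([name, spec]) => ({ name, spec }));
}

// Sample input (assumed values, not taken from any real workflow).
const deps = { axios: "^1.6.0", "ua-parser-js": "latest", "left-pad": "1.3.0" };
console.log(findUnpinned(deps));
```

Pinning direct dependencies this way still leaves transitive dependencies floating, which is why a lock file covering the whole graph is needed.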
Common problems with specifying a dependency without any version and/or without controlling the whole dependency graph are:
To give three prominent examples where this happened in the recent past on a massive scale:
In October 2021, versions of the popular npm package ua-parser-js were found to contain malicious code. This package, widely used for parsing user-agent data, was compromised to include malware that allowed attackers to gain control over infected systems and steal sensitive information.
The event-stream package was tampered with to include malicious code via the flatmap-stream dependency, targeting cryptocurrency wallets like Copay to steal Bitcoin.
The left-pad package was unpublished by its author, causing widespread disruption in the JavaScript ecosystem due to its use as a dependency in many projects. (This incident should not recur, as npm has since made it impossible to unpublish public packages that other packages depend on, but it shows nicely how a super simple and mundane leaf package of the dependency tree caused massive disruption.)
Whilst I think that fixing this issue out of the box is crucial, one possible (but mildly tedious) workaround comes to mind:
For each package used in any Pipedream workflow, create a proxy package with a package-lock.json, e.g. if you depend on axios you could create @my-scope/axios, define axios as the dependency, re-export all symbols and lock the graph. If you then refer to this package with a fixed version, @my-scope/axios@0.0.1, theoretically it should only pull in the locked graph.
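A rough sketch of what such a proxy package could contain, assuming a CommonJS entry point. The helper names `buildProxyManifest` and `buildProxyEntry` are hypothetical, and the axios version is a placeholder; only @my-scope/axios and 0.0.1 come from the example above.

```javascript
// Hypothetical helpers that generate the two files of a proxy package.
function buildProxyManifest(scope, name, pinnedVersion, proxyVersion) {
  return {
    name: `${scope}/${name}`,
    version: proxyVersion,
    main: "index.js",
    // Exact version, no range operator, for the wrapped package.
    dependencies: { [name]: pinnedVersion },
  };
}

function buildProxyEntry(name) {
  // index.js content: re-export everything the wrapped package exposes.
  return `module.exports = require(${JSON.stringify(name)});\n`;
}

// "1.6.8" is a placeholder version chosen for illustration.
const manifest = buildProxyManifest("@my-scope", "axios", "1.6.8", "0.0.1");
console.log(JSON.stringify(manifest, null, 2));
console.log(buildProxyEntry("axios"));
```

One caveat on the locking part: npm does not publish package-lock.json with a package and ignores it for installed dependencies, whereas npm-shrinkwrap.json is honored in that position, so the proxy would likely need a shrinkwrap file for the transitive graph to stay locked.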
There are a few assumptions being made:
Pipedream uses npm in their production environment.
An import would automatically pull in the dependency with the Pipedream logic. So creating one big proxy package with all versions and then importing the proxied package directly would not guarantee that the version from the locked proxy package is used (depending on how dependency management in workflows is implemented).
Describe the solution you'd like
I would like to provide a package.json and package-lock.json file with the following behavior:
If package.json is supplied: Pipedream should install the NPM packages exactly as specified in my package.json file (equivalent to running npm install).
If package-lock.json is supplied: Pipedream should install the NPM packages precisely as locked in my package-lock.json file (equivalent to running npm ci).
If I supply package.json or package-lock.json, I don't want Pipedream to automatically install latest packages from the import declarations in my Node.js steps.
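The requested behavior could be sketched as a small decision function. This is a hypothetical illustration of the rules above, not Pipedream's actual implementation; the function name `chooseInstallCommand` and its option names are assumptions.

```javascript
// Hypothetical sketch: pick the install step from the files the user supplied.
function chooseInstallCommand({ hasPackageJson, hasPackageLock }) {
  if (hasPackageLock) {
    return "npm ci"; // install exactly what the lock file records
  }
  if (hasPackageJson) {
    return "npm install"; // resolve from the ranges in package.json
  }
  // Nothing supplied: keep the current behavior of resolving packages
  // from the import declarations in the step code.
  return null;
}

console.log(chooseInstallCommand({ hasPackageJson: true, hasPackageLock: true }));
console.log(chooseInstallCommand({ hasPackageJson: true, hasPackageLock: false }));
console.log(chooseInstallCommand({ hasPackageJson: false, hasPackageLock: false }));
```

Note that npm ci expects a package.json next to the lock file, so in practice the lock-file case would mean supplying both files.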