Closed JamieMagee closed 3 years ago
Thanks for this suggestion @JamieMagee -- those runtime improvements are awesome! Do you know the general availability of jq in most standard node/npm base images? That would be the only issue I could see with using jq to parse instead of npm. In the meantime, let me get to work on a potential PR to incorporate and test this. Thanks!
Unfortunately, it doesn't look like jq is available as a default utility in the node base image, and if it's not guaranteed to be available, we can't use it for our parsing. When I mount the base image using Tern's debug functionality:
(ternenv) root:# cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.11.6
PRETTY_NAME="Alpine Linux v3.11"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://bugs.alpinelinux.org/"
(ternenv) root:# jq
/bin/sh: jq: not found
I agree that a run time of 25-30 minutes for npm package parsing is not helpful. We'll have to think about how to address this better. If you have any ideas, let me know.
Yep, I realised soon after I opened this issue that jq isn't a usual component of node-based images.
However, I have an alternative that works on node:alpine.
And best of all, it's still fast 🎉
real 0m 0.77s
user 0m 0.70s
sys 0m 0.07s
I also attempted importing npm and calling it that way, but with the latest major release of npm (v7) they changed the signature of the list method. Given we can't know the version of npm ahead of time, it's easier to use child_process and call npm that way.
Here's a gist with the TypeScript code.
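As a rough sketch of the child_process approach described above (an illustration, not the code from the gist), shelling out to the npm CLI from Node could look like the following. The names Runner, runWithChildProcess and listGlobalPackages are made up for this example, the flags are the ones quoted in the issue description, and the 64 MiB buffer size is an arbitrary assumption.

```typescript
import { execFileSync } from "node:child_process";

// A command runner takes a program and its arguments and returns stdout.
// It is injectable so the JSON handling can be exercised without npm installed.
type Runner = (cmd: string, args: string[]) => string;

// Default runner: calls the npm CLI as a child process instead of importing
// npm as a module, so it does not depend on npm's internal JavaScript API.
const runWithChildProcess: Runner = (cmd, args) => {
  try {
    return execFileSync(cmd, args, {
      encoding: "utf8",
      maxBuffer: 64 * 1024 * 1024, // assumption: large trees can exceed the default buffer
    });
  } catch (err) {
    // npm list can exit non-zero (for example on extraneous packages) while
    // still printing JSON; the thrown error carries the captured stdout.
    const stdout = (err as { stdout?: unknown }).stdout;
    if (typeof stdout === "string" && stdout.trim().length > 0) {
      return stdout;
    }
    throw err;
  }
};

// Hypothetical helper: returns the parsed tree of globally installed packages.
function listGlobalPackages(run: Runner = runWithChildProcess): unknown {
  const raw = run("npm", ["list", "-g", "--json", "--long", "--depth=1000"]);
  return JSON.parse(raw);
}
```

Because the runner is a parameter, the parsing step can be checked with a canned JSON string on a machine that has no npm at all.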
@JamieMagee Instead of JavaScript, another thought would be to use the host system to parse npm packages (similar to your proposal for distroless) and add jq as a requirement of Tern. jq has no runtime dependencies so we wouldn't be introducing a long dependency chain.
Sure, that also works for me. In that case we'll have to wait for #889 to land, as it has support for running commands on the host machine.
@JamieMagee Now that #889 has been merged, do you have time to take this up or do you want me to take a look?
I was planning to take a look tomorrow. I'll reach out if I hit any blockers or need some pointers.
Describe the Feature
The current implementation for parsing npm packages is rather slow. This is due to using npm directly, which has the overhead of spinning up Node.
Instead, Tern should get the complete output using
npm list -g --json --long --depth=1000
and parse that output using jq.
Use Cases
In tests, run by @jcfiorenzano, we saw runtimes of 23-25 minutes for the following Dockerfile
Implementation Changes
The current commands used to scan npm, defined here:
https://github.com/tern-tools/tern/blob/66822ef16cb09d45db929b453a5d9f04d0a5e838/tern/analyze/default/command_lib/base.yml#L384-L410
should be replaced by equivalent jq commands. A sample jq command is
With a significantly smaller runtime:
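Whichever tool ends up doing the parsing, the work on the npm list JSON is the same: walk the nested dependencies objects and emit one record per package. A minimal TypeScript sketch of that walk follows. It is not a translation of the jq filter from this issue; the field names (version, license, dependencies) are assumptions about the --json --long output, and flattenPackages is a hypothetical helper.

```typescript
// Assumed shape of one node in `npm list --json --long` output.
// Only the fields read below are declared; real output carries more.
interface NpmNode {
  version?: string;
  license?: unknown;
  dependencies?: Record<string, NpmNode>;
}

interface PackageRecord {
  name: string;
  version: string;
  license: string;
}

// Returns the direct children of a node as [name, node] pairs.
function childrenOf(node: NpmNode): Array<[string, NpmNode]> {
  return Object.entries(node.dependencies ?? {});
}

// Walks the dependency tree iteratively and returns one record per unique
// name@version pair, sorted by name and then version.
function flattenPackages(root: NpmNode): PackageRecord[] {
  const seen = new Map<string, PackageRecord>();
  const stack = childrenOf(root);
  while (stack.length > 0) {
    const item = stack.pop();
    if (!item) break;
    const [name, node] = item;
    const version = node.version ?? "";
    const key = `${name}@${version}`;
    if (!seen.has(key)) {
      seen.set(key, {
        name,
        version,
        // Non-string license values are reported as an empty string here.
        license: typeof node.license === "string" ? node.license : "",
      });
    }
    stack.push(...childrenOf(node));
  }
  return Array.from(seen.values()).sort(
    (a, b) => a.name.localeCompare(b.name) || a.version.localeCompare(b.version),
  );
}
```

Doing the whole walk in one pass over a single npm list invocation is what avoids paying Node's startup cost once per package.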