Open ledsun opened 3 years ago
@ledsun thanks for the report! I'll take a look when I'm able - is this an urgent issue for you?
I'm in no hurry. I have a workaround.
I assume that the cause of this issue is that yarn workspaces creates symbolic links in the node_modules directory.
~ ls -lah node_modules/
total 8
drwxr-xr-x 17 shigerunakajima staff 544B 3 30 18:06 .
drwxr-xr-x 8 shigerunakajima staff 256B 3 30 17:39 ..
drwxr-xr-x 5 shigerunakajima staff 160B 3 30 18:06 .bin
-rw-r--r-- 1 shigerunakajima staff 2.4K 3 30 18:06 .yarn-integrity
drwxr-xr-x 7 shigerunakajima staff 224B 3 30 18:06 cross-env
drwxr-xr-x 9 shigerunakajima staff 288B 3 30 18:06 cross-spawn
drwxr-xr-x 6 shigerunakajima staff 192B 3 30 18:06 is-windows
drwxr-xr-x 10 shigerunakajima staff 320B 3 30 17:37 isexe
drwxr-xr-x 6 shigerunakajima staff 192B 3 30 18:06 lru-cache
drwxr-xr-x 7 shigerunakajima staff 224B 3 30 17:45 path-key
drwxr-xr-x 8 shigerunakajima staff 256B 3 30 18:06 pseudomap
drwxr-xr-x 6 shigerunakajima staff 192B 3 30 18:06 shebang-command
drwxr-xr-x 6 shigerunakajima staff 192B 3 30 18:06 shebang-regex
drwxr-xr-x 8 shigerunakajima staff 256B 3 30 18:06 which
lrwxr-xr-x 1 shigerunakajima staff 14B 3 30 18:06 workspace-a -> ../workspace-a
lrwxr-xr-x 1 shigerunakajima staff 14B 3 30 18:06 workspace-b -> ../workspace-b
drwxr-xr-x 7 shigerunakajima staff 224B 3 30 18:06 yallist
I think licensed is getting the directory under node_modules in the next line:
https://github.com/github/licensed/blob/2.15.1/lib/licensed/sources/yarn.rb#L79
But Dir.glob(config.pwd.join("node_modules/**/package.json"))
does not follow symbolic links.
My workaround is a patch to replace this with Dir.glob(config.pwd.join('node_modules/**{,/*/**}/package.json'))
.
@ledsun
I assume that the cause of this issue is that yarn workspaces creates symbolic links in the node_modules directory. My workaround is a patch to replace this with Dir.glob(config.pwd.join('node_modules/*{,//**}/package.json')).
👍 that makes sense. That would be an easy patch to put in but I'd like to learn more about how you'd expect licensed to work with them before adding that as a general fix.
I'm also not super up to speed on yarn workspaces themselves. I've read a few docs on them and they appear to be a way to manage dependencies among multiple projects/packages in a monorepo setup when one project takes a dependency on another project in the repo... is that correct?
I have a concern that a package with the same name and version but different licensing or other metadata might exist in different locations under node_modules
which would make the logic for the yarn source indeterminate. Whichever duplicate name-version
package is parsed last would be used for all projects. I'm not sure how realistic this scenario is but I think it might be possible if a package is forked and/or multiple package registries are used.
Can you set up a reproduction environment for the problem, removing any sensitive and non-public content? I'm wondering whether building knowledge about workspaces into the yarn source would be more appropriate. Something like enumerating workspaces, then changing dependency_paths
to make use of them like Dir.glob(config.pwd.join("node_modules/**{,/${workspace}/**}/package.json"))
Sorry. I can't provide a reproducible environment.
I'm running into this as well, specifically with package paths specified in nohoist
. Here's a reproduction repo and licensed results.
Site note: I didn't want to store cached licenses in my repo, so each push runs licensed cache
and licensed status
. Hoping that's expected usage when not storing cached files.
@jonabc Please let me know if that helps or if I can provide anything else related to this!
@jgeurts :wave: sorry I've been pretty busy and haven't had a chance to take a look. I'll try to get to it within the next 24 hrs
@jguerts thanks for the example repo. the repo along with some additional yarn documentation for the nohoist
syntax helped me wrap my head around the problem. This is an interesting problem space in that the workspaces functionality is solving for monorepos with multiple built packages. If each of these packages are distributed and used separately they should probably be tracked as individual "apps" from licensed's POV. I'm thinking the best way to handle this is
.licensed.yml
and package.json
get out of sync. automagically handling workspaces detected from package.json
will require some refactoring of how licensed apps are detected. I'll have to think on this one a bit more and see if there's an ideal solutionOff the top of my head this sounds like a fairly significant chunk of work, and as I mentioned things are pretty busy for me at the moment. Is this blocking you from using licensed and something you need fixed ASAP?
Thank you for looking into this!
It’s a blocker for me, but I’ll try working around it by removing the nohoist part of package.json before installing npms and running licensed. That should force all npms to install to the root. I’ve been holding off on that in case this was a relatively easy/quick update to licensed.
I tried removing the nohoist
part of the package.json file, but licensed failed to get paths for a couple dependencies still. So, I ended up "manually" doing your suggestion from #1:
npm install
in each workspace directorynpm
source type with each workspace as an explicit licensed app. That mostly worked, but I had some troubles with licenses for some of the dependencies not coming through accurately. I'll open a separate issue for that, if I can get it reproducing.
@jgeurts FYI I've started looking at a fix. There is some complexity in the solution but I'm hopeful to have this fixed and able to put out a release in the next few days.
@jgeurts I've hit a wall in how to get useful information out of yarn's CLI that perhaps you might be more knowledgable on?
When I call yarn list
from a package defined as a yarn workspace, it still gives full information on dependencies used by everything in the repo, and not information specific to that workspace. That means that licensed has no way to distinguish which dependencies are specific to that workspace package or not. That is pretty consequential if someone is publishing individual packages out of a monorepo - it would not be correct to say that every dependency from the repo applies to all packages.
Even worse is that, at least with the version of yarn I have locally (1.22.10), yarn's hoisting mechanism is making packages available to workspaces that don't call them out as dependencies 😢 . I am able to use packages listed as a dependencies
of workspace A inside of files in workspace B without any direct dependency linkage of workspace A -> B.
TBH I'm not sure how to support yarn workspaces given that I can't accurately tell what dependencies a project uses. For a tool that's purpose relates to licensing compliance I tend to err towards the safer option and am inclined to explicitly deny support for workspaces. I'll try with yarn v2 tomorrow and see if maybe they've fixed things in later versions.
If you have any ideas on how to support this I'd be very interested in hearing them 😆
@jonabc This seems to also be an issue with npm workspaces:
Results:
# npm install --ignore-scripts --no-audit --production
added 65 packages in 3s
# npm ls callbackify
@grano/mygrano-serverless@1.0.0 /Users/ville.lahdenvuo/Documents/Grano/repot/mygrano-serverless
└─┬ @grano/antivirus@1.0.0 -> ./antivirus
└── callbackify@1.1.0
...
Caching dependency records for antivirus
* Errors:
* antivirus.npm.callbackify
- missing: callbackify@^1.1.0, required by @grano/antivirus@1.0.0
* antivirus.npm.dotenv
- missing: dotenv@^16.0.0, required by @grano/antivirus@1.0.0
* antivirus.npm.es6-promisify
- missing: es6-promisify@^7.0.0, required by @grano/antivirus@1.0.0
* antivirus.npm.superagent
- missing: superagent@^7.1.1, required by @grano/antivirus@1.0.0
I assume it's because npm de-dupes dependencies in the root node_modules folder, but licensed does not traverse into parent folders.
For people having problems with npm workspaces I added this script before running licensed to work around the issue:
echo "::group::Workspaces workaround"
WORKSPACES=$(jq -r '.workspaces // [] | join(" ")' package.json)
if [[ -n "$WORKSPACES" ]]; then
for package in $WORKSPACES; do
echo "Copying node_modules to $package"
rsync -r --exclude ".bin" --exclude "@my-org/*" ./node_modules/ "./$package/node_modules"
done
fi
echo "::endgroup::"
thanks for the repro - taking a look!
@villelahdenvuo in this case, the errors being reported by licensed are actually coming from npm. I set up a quick repro for the above setup and got the same errors running npm list
from the individual workspace.
There's an issue that was opened on the npm cli about this where npm ls
run from a workspace isn't aware that it's dependencies are installed elsewhere. It looks like the latest versions of npm (8.5.0+) enable automatic workspace root detection which will solve this with a little extra logic in licensed to strip the current workspace from the list of detected dependencies
Are you able to upgrade to using npm 8.5+? I'd like to make that a requirement if possible for using npm workspaces because it allows the closest parity to existing licensed usage without requiring any additional configuration options.
@jonabc Yeah I'm already using 8.5+ locally, I just need to make sure it's being used on the Github Actions also. Thanks for the investigation!
I am using two workspaces. And each workspace depends on the same package with different versions. In this case, the package is installed under the workspace directory.
This is the output of the
yarn list
:In this case, licensed is unable to get the license information of the packages installed under the workspace directory. The following is the output of the
licensed cache
: