Inedo / pgscan

Dependency scanner for ProGet.
MIT License

Support solution folders #28

Closed crotondo-dap closed 1 year ago

crotondo-dap commented 1 year ago

Hi,

We are working with solutions that contain several projects grouped into collections using Visual Studio solution folders. We would like to call pgscan for a specific solution folder only, because such a folder sometimes contains the projects for a certain tool, and we are interested in the dependencies of that tool alone. Calling pgscan for the whole solution would be too much, and calling it for a single .csproj wouldn't be enough.

Let me know if you would be able to support solution folders.

rhessinger commented 1 year ago

Hi @crotondo-dap,

The easiest way would probably be to write a script that scans for .csproj files and/or package-lock.json files and have it call pgscan a number of times.

Here is an example PowerShell script to run pgscan on each csproj in a directory:

    Get-ChildItem -Path "c:\path\to\solution" -Filter *.csproj -Recurse -ErrorAction SilentlyContinue -Force |
        ForEach-Object {
            # Use $($_.FullName): inside double quotes, "$_.FullName" expands only $_ and appends the literal text ".FullName".
            & pgscan identify --input="$($_.FullName)" --proget-url=http://proget.myserver.com --version=1.0.0 --project-name=MyProject --type=nuget --api-key=apikey1234
        }

Please note that you will need to upgrade to ProGet 2022.23 or later to run pgscan multiple times against the same release of a ProGet project. Prior to that version, subsequent runs would not add newly detected dependencies.

This is also something I will discuss with our products team to see if we can come up with a good solution to include directly in pgscan.

Thanks, Rich

crotondo-dap commented 1 year ago

Hi @rhessinger,

What if I am not able to select the necessary .csproj files using wildcard filters? Let's assume that I have a solution that is physically structured like this:

[image]

I have a solution and some folders containing .csproj files of the same name. But within Visual Studio, I use solution folders to structure my projects virtually:

[image]

If I now want to use pgscan to gather the information for "Tool1", for example, how would I do that? Physically, there is no folder that contains the .csproj files of Tool1. But the .sln file contains the information about which project belongs to which virtual folder. Furthermore, it is possible to include projects from any location; they do not have to be within the solution folder.

This is just a small example, and it would be possible to create explicit pgscan calls. But our solution folders can become quite big and change a lot. Therefore, it would mean a lot of effort to adjust all pgscan calls accordingly.

Since pgscan is already parsing the .sln-file for projects, it would be an improvement to support solution folders.
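To illustrate the point about the .sln file, here is a minimal Python sketch of how the virtual folder membership could be read from a solution file. It is not pgscan's implementation. It assumes the usual text layout of .sln files (`Project(...)` lines plus a `GlobalSection(NestedProjects)` block that maps child GUIDs to parent GUIDs), and the function name, sample solution, and GUIDs are hypothetical.

```python
import re

# Project type GUID that Visual Studio uses for solution folders.
SOLUTION_FOLDER_TYPE = "{2150E333-8FDC-42A3-9474-1A3956D46DE8}"

# Project("{type}") = "Name", "relative\path", "{guid}"
PROJECT_RE = re.compile(
    r'^Project\("(?P<type>\{[^}]+\})"\)\s*=\s*'
    r'"(?P<name>[^"]*)",\s*"(?P<path>[^"]*)",\s*"(?P<guid>\{[^}]+\})"'
)
# {child guid} = {parent guid}
NESTED_RE = re.compile(r'^(?P<child>\{[^}]+\})\s*=\s*(?P<parent>\{[^}]+\})$')


def projects_in_solution_folder(sln_text, folder_name):
    """Return paths of projects nested, directly or indirectly, under folder_name."""
    entries = {}    # guid -> (type guid, name, path)
    parent_of = {}  # child guid -> parent guid
    in_nested = False
    for raw in sln_text.splitlines():
        line = raw.strip()
        m = PROJECT_RE.match(line)
        if m:
            entries[m["guid"].upper()] = (m["type"].upper(), m["name"], m["path"])
        elif line.startswith("GlobalSection(NestedProjects)"):
            in_nested = True
        elif line.startswith("EndGlobalSection"):
            in_nested = False
        elif in_nested:
            n = NESTED_RE.match(line)
            if n:
                parent_of[n["child"].upper()] = n["parent"].upper()

    def is_under(guid):
        # Walk up the parent chain; 'seen' guards against malformed cycles.
        seen = set()
        while guid in parent_of and guid not in seen:
            seen.add(guid)
            guid = parent_of[guid]
            kind, name, _ = entries.get(guid, ("", "", ""))
            if kind == SOLUTION_FOLDER_TYPE and name == folder_name:
                return True
        return False

    return sorted(
        path
        for guid, (kind, _, path) in entries.items()
        if kind != SOLUTION_FOLDER_TYPE and is_under(guid)
    )


# Hypothetical solution: "Core" sits in the virtual folder "Tool1", "Tests" does not.
SAMPLE_SLN = r'''
Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "Tool1", "Tool1", "{11111111-1111-1111-1111-111111111111}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Core", "FolderA\Core\Core.csproj", "{22222222-2222-2222-2222-222222222222}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Tests", "FolderB\Tests\Tests.csproj", "{33333333-3333-3333-3333-333333333333}"
EndProject
Global
	GlobalSection(NestedProjects) = preSolution
		{22222222-2222-2222-2222-222222222222} = {11111111-1111-1111-1111-111111111111}
	EndGlobalSection
EndGlobal
'''

print(projects_in_solution_folder(SAMPLE_SLN, "Tool1"))
```

Because membership comes from the GUID mapping rather than from directories, the physical location and file name of each project do not matter, which is the case that wildcard filters cannot handle.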

rhessinger commented 1 year ago

Hi @crotondo-dap,

Thanks for the explanation. How often do projects get added and removed from this solution? I think your best bet would be to write a script that just calls pgscan on each project you want to scan explicitly (and in the order you want) and just have each scan push to the same project and release.
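As a rough sketch of that approach, the following Python script calls pgscan once per explicitly listed project and pushes every scan to the same project and release. The command-line flags are the ones used in the PowerShell example earlier in this thread; the function names and the project paths in the usage note are hypothetical, and ProGet 2022.23 or later is assumed so that repeated runs add to the same release.

```python
import subprocess


def build_pgscan_command(project_path, proget_url, project_name, version,
                         api_key, scan_type="nuget"):
    """Build the argument list for one 'pgscan identify' call."""
    return [
        "pgscan", "identify",
        f"--input={project_path}",
        f"--proget-url={proget_url}",
        f"--version={version}",
        f"--project-name={project_name}",
        f"--type={scan_type}",
        f"--api-key={api_key}",
    ]


def scan_projects(project_paths, run=subprocess.run, **options):
    """Run pgscan for each project, in the given order, against one release.

    'run' is injectable so the loop can be exercised without pgscan installed.
    """
    for path in project_paths:
        # check=True makes a failing scan raise instead of being silently ignored.
        run(build_pgscan_command(path, **options), check=True)
```

A call might look like `scan_projects(["FolderA/Core/Core.csproj", "FolderB/Cli/Cli.csproj"], proget_url="http://proget.myserver.com", project_name="MyProject", version="1.0.0", api_key="apikey1234")`. The explicit list is what has to be maintained by hand whenever projects are added or removed.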

I'll talk with our team about adding solution folder filters to pgscan, but whatever solution we come up with will need to work for npm and python as well. Our implementation of pgscan has always been geared at being a lightweight alternative to CycloneDX with direct ProGet integration, so the complexity of adding these filters will also play into that decision.

Thanks, Rich

szimmer-dap commented 1 year ago

Hi there!

Let me add some more detail on why we believe we need this feature. We recently noticed that the SBOM files (or the corresponding reports in ProGet, to be more accurate) for some of our products would include packages that we cannot redistribute. Looking into this, we noticed that those packages were referenced by unit tests or internal tools (which is fine), not by the product itself. That's why we figured we need to separate SBOM files: one for the actual product and one for everything around it.

As we usually have one solution per product to build everything, we figured that we needed a way to filter the projects. Solution folders seemed like a natural way to do this, as this would require very little change to our build infrastructure. To give you an idea: we currently have 200+ "Projects"/products listed in ProGet's SCA feature. Not every "Project" is an actual product, but there is a build pipeline including a call of pgscan for every one of them. The solution file of the biggest of those 200+ projects currently contains 400+ .csproj files. I'm positive that any wildcard filter we could come up with would have a 99.9% chance of yielding incorrect results, and listing every single project by hand simply would not be feasible. Plus, our build pipelines are usually maintained by a different team than the actual project team, so there would always be a gap there. In terms of maintainability and comprehensibility, solution folders seemed like our best bet.

We actually considered implementing this ourselves (we have made adjustments to pgscan before and offered them as pull requests); we just figured we'd ask first to see what you guys think about the idea. If we were to implement this at some point, would you be interested in merging it into the main repository?

rhessinger commented 1 year ago

Hi @szimmer-dap,

We are not opposed to the idea. If you want to submit a pull request for adding it, we will definitely review it.

Thanks, Rich

apxltd commented 1 year ago

@szimmer-dap wow, that's a lot of projects!

In my experience, trying to generalize handling of complex, mega-solutions like that can be really difficult.

It's worth noting that there is an Inedo.DependencyScan library that pgscan uses, and you may find that building a tool just for your mega-solution is a better way to go, so you can capture the complex relationship between projects etc.

No idea if this is helpful, but here is how we use Inedo.DependencyScan inside of BuildMaster. We don't run pgscan because we prefer having better logging, and want to be able to count unstable dependencies.

    this.LogDebug("Beginning dependency scan....");
    // Pick a scanner for the project file and resolve dependencies on the remote agent.
    var scanner = DependencyScanner.GetScanner(context.ResolvePath(this.ProjectFilePath!), this.ScanType, new RemoteFileSystem(await context.Agent.GetServiceAsync<IFileOperationsExecuter>()));
    var projects = await scanner.ResolveDependenciesAsync();

    int depends = 0, unstables = 0;
    if (projects.Count > 0)
    {
        // De-duplicate dependencies across all projects by (Group, Name, Version).
        var query = (from p in projects
                     from d in p.Dependencies
                     group d by (d.Group, d.Name, d.Version) into grouped
                     select grouped);
        foreach (var g in query)
        {
            depends++;
            // A '-' in the version marks a prerelease, counted here as unstable.
            if (g.Key.Version.Contains('-'))
                unstables++;
        }

        this.LogInformation($"Found {depends} dependencies across {projects.Count} projects.");
        if (depends == 0)
            this.LogWarning("No dependencies were detected. Make sure that you're using the right project file, and are running this operation after building (e.g. dotnet build/publish)  or restoring (e.g. pip install)");

        // Generate the SBOM and upload it to the ProGet project/release.
        this.LogDebug($"Publishing to ProGet project \"{this.ProjectName}\" (Release \"{this.ProjectVersion}\")...");
        var client = new ProGetClient(this.ProGetUrl);
        await client.PublishSbomAsync(
            projects,
            new PackageConsumer { Name = this.ProjectName, Version = this.ProjectVersion },
            "application",
            scanner.Type.ToString().ToLowerInvariant(),
            this.ApiKey
        );

        this.LogInformation("Dependencies successfully published.");
    }
    else
    {
        this.LogWarning("No projects found.");
    }

It's basically just DependencyScanner.GetScanner().ResolveDependenciesAsync() to get the list of projects/dependencies based on files, and then ProGetClient...PublishSbomAsync to generate and upload the SBOM file.
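The counting step in the snippet above can be shown on its own. This is a small Python sketch, independent of Inedo.DependencyScan: dependencies are de-duplicated by (group, name, version), and a version containing a hyphen is counted as unstable, mirroring the `Contains('-')` check. The dictionary-based data shape, the function name, and the package entries are assumptions made for illustration only.

```python
def count_dependencies(projects):
    """Return (unique dependency count, unstable count) across all projects."""
    unique = {
        (dep["group"], dep["name"], dep["version"])
        for project in projects
        for dep in project["dependencies"]
    }
    # A '-' in the version string marks a prerelease, e.g. "2.0.0-beta1".
    unstable = sum(1 for (_, _, version) in unique if "-" in version)
    return len(unique), unstable


# Hypothetical scan result: two projects sharing one dependency.
sample_projects = [
    {"dependencies": [
        {"group": None, "name": "LibA", "version": "1.0.0"},
        {"group": None, "name": "LibB", "version": "2.0.0-beta1"},
    ]},
    {"dependencies": [
        {"group": None, "name": "LibA", "version": "1.0.0"},
    ]},
]

total, unstable = count_dependencies(sample_projects)
print(f"Found {total} dependencies, {unstable} unstable.")
```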

szimmer-dap commented 1 year ago

Hi there!

We just added a pull request that would add support for solution folders. Feel free to have a look at it and let us know what you think of it.

@apxltd: To avoid confusion: the mega solution I mentioned in my previous post is for one product (it's obviously our biggest product). In general, each product has its own solution, and pgscan does a good job producing SBOMs for them. It's just that we want to be able to differentiate between those projects within a solution that are actually part of the product, and those that are part of unit tests or related tools. The reason we need this is that we have some "unwanted" licenses popping up in our SCA reports, and upon taking a closer look at the issue we found that the libraries in question are not used by the product itself, but by an internal tool (which is OK). By putting the respective projects in corresponding solution folders we can avoid pgscan reporting libraries that are not part of the actual product.

gdivis commented 1 year ago

@szimmer-dap thanks! We'll review and get back to you shortly!

gdivis commented 1 year ago

Merged! I like the refactoring, and the solution folder handling looks general enough to me to be potentially useful to others.

szimmer-dap commented 1 year ago

Hi @gdivis, glad you liked it. I guess this issue can be closed then. Are you planning to publish a new release of pgscan?

gdivis commented 1 year ago

@szimmer-dap just released now as v1.5.0.