DependencyTrack / dependency-track

Dependency-Track is an intelligent Component Analysis platform that allows organizations to identify and reduce risk in the software supply chain.
https://dependencytrack.org/
Apache License 2.0

Filter Projects Metrics by tag #238

Open msymons opened 5 years ago

msymons commented 5 years ago

Enhancement: In Dependency-Track v3.3.1, clicking on "projects" provides a summary of metrics at the top of the screen:

When clicking on a tag, the list of projects is filtered by the tag. Thus, in the attached screenshot one sees 3 out of the 6 projects in the portfolio.

As an enhancement, the metrics should be filtered by the tag. eg in this example...

[Screenshot: metrics-per-tag]

stevespringett commented 5 years ago

Running real-time queries to filter this would introduce major performance issues. With a relational database, per-tag metrics would have to be calculated and stored continuously alongside the existing portfolio, project, and dependency metrics (but likely not the component metrics). Doing this would not introduce any real-time performance issues, but it would introduce a bottleneck every hour when metrics are calculated. The impact will obviously vary with the number of tags in use. If only a handful of tags are used, I don't think that would be an issue, but if hundreds or thousands of tags are used, this would impact delivery of all metrics.
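To make the precomputation idea concrete, here is a minimal sketch of rolling per-project metrics up into per-tag totals during a scheduled run. The project records, field names, and the aggregate_by_tag helper are all hypothetical and are not Dependency-Track's schema or code. The work grows with the number of (project, tag) pairs, which is roughly why a very large number of tags could slow the hourly calculation.

```python
from collections import defaultdict

# Hypothetical per-project metrics; the field names are illustrative only
# and are not Dependency-Track's actual schema.
projects = [
    {"name": "app-a", "tags": ["web", "java"], "vulnerabilities": 12, "dependencies": 80},
    {"name": "app-b", "tags": ["web"], "vulnerabilities": 3, "dependencies": 40},
    {"name": "app-c", "tags": [], "vulnerabilities": 7, "dependencies": 25},
]


def aggregate_by_tag(projects):
    """Sum each numeric metric over the projects that carry a given tag."""
    totals = defaultdict(lambda: {"projects": 0, "vulnerabilities": 0, "dependencies": 0})
    for project in projects:
        for tag in set(project["tags"]):  # guard against a tag listed twice
            row = totals[tag]
            row["projects"] += 1
            row["vulnerabilities"] += project["vulnerabilities"]
            row["dependencies"] += project["dependencies"]
    return dict(totals)


per_tag = aggregate_by_tag(projects)
for tag in sorted(per_tag):
    print(tag, per_tag[tag])
```

A project with several tags contributes to each of them, so per-tag totals are not expected to add up to the portfolio total.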

While I agree that this feature request is valid and beneficial, I'll have to think about the best way to implement it. Some colleagues at a large database vendor will be open sourcing some really interesting dependency modeling visualizations, all based on a graph database (can't remember which one). There will likely be integration with Dependency-Track. A feature like this could benefit tremendously from a graph, as could a lot of other data-related questions that could be asked.

msymons commented 5 years ago

@stevespringett, I have been thinking more about this. As tags only exist as a property of projects and not (say) components, I struggle to think of a use-case that would use thousands of tags. mvnrepository.com has a tag cloud with 250 tags.

If "more tags = bigger performance hit", then perhaps it would help to first implement tag management, making it easier for the administrator to stop tags getting out of control. For instance...

Want me to log a new enhancement for the above?

stevespringett commented 5 years ago

mvnrepository is not a good measure, as it's Java-only and limited to libraries and frameworks. It omits applications, which have much greater functionality and thus a wider range of tags. The Dependency-Track repo has 20 GitHub topics (tags) so that it's easily discoverable. That's a pretty big ratio.

One use case that I think will be the most widely used misuse case would be as an alternate identifier. An organization that has 10K applications (assets) in their enterprise would track each application and version. Most orgs assign a unique identifier for each asset in their inventory. If they use the tags feature as an alternate identifier so they could query for an asset based on its name/version or tag, then this is a legitimate misuse of the tags feature. It does what the org wants, but is misusing a feature in order to achieve it. In this case, the org should be using project properties, not tags.

Do you have an example of an application (screenshots) that does tag management?

msymons commented 5 years ago

Thanks for the use-cases.

So far, my usage of tags is really basic.

Although tags documentation is sparse, I saw them in a Dependency-Track screenshot and thought "that's got to be how to group together my 13 projects and then add more projects over time". For starters, I need to make absolutely sure that every single component that is a dependency of a project with the tag has valid license info (yes, a lot of manual updating will be needed). Then comes the report generation.

I have not yet thought in detail about how to generate a report that is useful for the legal folk, but have started to play with the REST API using a Swagger browser extension plus curl... and am scratching my head at the moment, wondering whether the requests I need are supported.

msymons commented 5 years ago

Some more thoughts... although these are not to do with metrics but simply filtering the raw data...

Looking at the REST API, we have relevant GET requests for BOMs:

/v1/bom/cyclonedx/project/{uuid}      Returns dependency metadata for a project in CycloneDX format
/v1/bom/cyclonedx/components          Returns dependency metadata for all components in CycloneDX format
/v1/bom/cyclonedx/component/{uuid}    Returns dependency metadata for a specific component in CycloneDX format

It would be useful to be able to generate a BOM that filters by tag (a BOM from which duplicates have been removed). I argue that doing so would fulfil the BOM "Best Practices" documentation:

The ability for an organization to generate a complete bill-of-material during continuous integration is one of many maturity indicators. BOMs are increasingly required for various compliance, regulatory, legal, or economic reasons.

...where complete = "complete wrt customer x or bid-team y (etc)".

Currently one can only produce a single master listing (using ../cyclonedx/components), and that means including projects that are not relevant. Or one can produce dozens of separate BOMs (using ../cyclonedx/project/{uuid}), which means that there is a lot of duplication to wade through, e.g. 22 occurrences of jackson-databind v2.9.7.
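As a rough sketch of the consolidation described above, the following merges component lists from several per-project BOMs and keeps one entry per (group, name, version). It assumes the BOMs have already been fetched and parsed into plain dictionaries. The merge_components helper and the sample data are hypothetical (only jackson-databind 2.9.7 comes from the discussion), and no Dependency-Track API is called.

```python
def merge_components(boms):
    """Merge component lists, keeping one entry per (group, name, version)."""
    merged = {}
    for components in boms:
        for component in components:
            key = (component.get("group", ""), component["name"], component["version"])
            merged.setdefault(key, component)  # first occurrence wins
    return [merged[key] for key in sorted(merged)]


# Hypothetical component lists, one per project BOM.
bom_project_1 = [
    {"group": "com.fasterxml.jackson.core", "name": "jackson-databind", "version": "2.9.7"},
    {"group": "org.slf4j", "name": "slf4j-api", "version": "1.7.25"},
]
bom_project_2 = [
    {"group": "com.fasterxml.jackson.core", "name": "jackson-databind", "version": "2.9.7"},
    {"group": "junit", "name": "junit", "version": "4.12"},
]

consolidated = merge_components([bom_project_1, bom_project_2])
for c in consolidated:
    print(f'{c["group"]}:{c["name"]}:{c["version"]}')
```

Keying on group, name, and version treats two versions of the same library as distinct entries, which is probably what a license report needs. A tag filter would only change which per-project BOMs are passed in.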

I guess that all this means that tags added to a project should be inherited by the components in that project. In v3.3.1, one can filter projects by tag within the UI. eg:

/projects/?tag=abc

...but it is not possible to filter components in the same way (useful for day-to-day usage of DT).

msymons commented 5 years ago

@stevespringett, I ran into a problem today that may be a tag use-case or may be a defect. You decide! =)

As explained previously, I have to provide comprehensive information on license usage for legal purposes. I might not be able to spit out a report, but I hoped to produce a consolidated BOM.

In order to work around the problem, I decided to delete the 2 projects in my DT v3.3.1 that were not part of the set that I need to report on (remembering also to delete the Publish BOM steps in the respective Jenkins jobs).

This resulted in the following change to project metrics:

[Screenshot: metrics-change-1]

Relevant Dashboard Stats changed from:

Dependencies (total)         5120
Vulnerable Dependencies       173
Components (total)           1140
Vulnerable Components         112
Portfolio Vulnerabilities     522

to:

Dependencies (total)         4717
Vulnerable Dependencies       144
Components (total)           1140
Vulnerable Components         112
Portfolio Vulnerabilities     462

Note that the number of "Components (total)" is unchanged at 1140. That means two things...

1) The components are all listed in the Components tab in the UI. Inspecting one (that is from a deleted project) and clicking on "Projects" returns an empty list.
2) Using the REST API /api/v1/bom/cyclonedx/components gives the BOM for the 1140 components.

This last bit is a bit problematic. The BOM contains the desired "truth" for everything I need to know about, but it is not useful because of all the extra stuff, and it's going to be hard work to clean it up. I can generate separate BOMs for each project, but merging will require removing a lot of duplicates (i.e., components that are used in multiple projects).

I hope that this makes sense.

stevespringett commented 5 years ago

Yes, you've discovered the difference between a component and a dependency. A component becomes a dependency once a project 'depends' on a component. There are individual metrics for dependencies as well as a unique REST endpoint for dependencies.

The REST endpoint for dependencies does not provide the ability to list them all, because the definition of a dependency is (project + component). And a component may be used by multiple projects. So we don't provide a way to retrieve all dependencies.

However, I've been toying with the idea of adding an 'inuse' flag to the components API that would return only the components that are 'in use' (dependencies) by projects. This may be a partial solution to #197, where the components aren't actually purged but rather filtered out by default so the focus is only on the components that are 'in use'. There are legit use cases for tracking components that are not associated with a project, so I can't completely get rid of them.
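The component/dependency distinction, and what an 'in use' filter might do, can be illustrated with a small sketch. It models a dependency as a (project, component) pair, as described above, and keeps only the components referenced by at least one pair. The identifiers, the data, and the in_use_components helper are hypothetical. This is not how Dependency-Track implements or exposes such a flag.

```python
def in_use_components(components, dependencies):
    """Return the components referenced by at least one (project, component) pair."""
    used = {component_id for _project_id, component_id in dependencies}
    return [c for c in components if c["id"] in used]


# Hypothetical data: three known components, two projects.
components = [
    {"id": "c1", "name": "jackson-databind"},
    {"id": "c2", "name": "slf4j-api"},
    {"id": "c3", "name": "orphaned-lib"},  # left behind by a deleted project
]
# Each dependency is a (project_id, component_id) pair.
dependencies = [("p1", "c1"), ("p2", "c1"), ("p1", "c2")]

active = in_use_components(components, dependencies)
print([c["name"] for c in active])
```

Here c1 counts as two dependencies but one component, which mirrors why deleting projects lowers the dependency total while the component total stays the same.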

msymons commented 5 years ago

The "in use" flag would definitely have fixed my query...

/api/v1/bom/cyclonedx/components

...reducing the output of 1140 components by +/- 200 (the projects I had deleted really are so totally different in component usage from the ones that remained).

Of course, I need to add back those deleted projects at some point (plus a couple of hundred new projects). So my workaround won't work for long, even after an "in use" flag is implemented.

I understand what you say about the dependencies endpoint and the reason that it works the way that it does. However, I think that still shows the utility of having a way to use tags (or something similar) in the REST API in order to extract consolidated information.

iamrahul127 commented 2 years ago

Is there any plan to implement improvements related to tag management?

stevespringett commented 2 years ago

@iamrahul127 Eventually, yes. However, there are plans to separate out metric updates into their own deployable service due to the performance impact on the DT server. So it likely would not happen until that has been broken out.

iamrahul127 commented 1 year ago

@stevespringett Sorry to trouble you again. Is there any progress in this case?

margusanvelt commented 8 months ago

This would be a good functionality to have.

iman4000 commented 3 months ago

It's a critical feature for medium or large companies.