clj-holmes / clj-watson

clojure deps SCA
Eclipse Public License 2.0
77 stars 8 forks source link

Consider Providing Feedback During Nvd Db Download/Update #68

Closed lread closed 1 month ago

lread commented 1 month ago

Currently

There is no feedback during the download of the nvd database.

Example output:

Downloading/Updating database.
Download/Update completed.

But..

Downloads can take a few to several minutes; and recently much longer due to some issues. So it would be nice to get some idea that work is being done and how far along that work is.

Inspiration

For inspiration, nvd-clojure emits feedback like so while downloading:

[2024-07-29 22:21:48.254] INFO Engine - Checking for updates
[2024-07-29 22:21:49.951] INFO NvdApiDataSource - NVD API has 258,341 records in this update
[2024-07-29 22:21:55.182] INFO NvdApiDataSource - Downloaded 10,000/258,341 (4%)
[2024-07-29 22:22:01.305] INFO NvdApiDataSource - Downloaded 20,000/258,341 (8%)
[2024-07-29 22:22:06.300] INFO NvdApiDataSource - Downloaded 30,000/258,341 (12%)
[2024-07-29 22:22:12.480] INFO NvdApiDataSource - Downloaded 40,000/258,341 (15%)
...
[2024-07-29 22:24:16.209] INFO NvdApiDataSource - Downloaded 240,000/258,341 (93%)
[2024-07-29 22:24:24.486] INFO NvdApiDataSource - Downloaded 250,000/258,341 (97%)
[2024-07-29 22:24:34.249] INFO NvdApiDataSource - Downloaded 260,000/258,341 (101%)
[2024-07-29 22:24:34.267] INFO NvdApiDataSource - Downloaded 258,341/258,341 (100%)

And more feedback while applying the updates:

[2024-07-29 22:24:34.268] INFO NvdApiDataSource - Completed processing batch 1/130 (1%) in 3,202ms
[2024-07-29 22:24:34.268] INFO NvdApiDataSource - Completed processing batch 2/130 (2%) in 3,841ms
[2024-07-29 22:24:34.268] INFO NvdApiDataSource - Completed processing batch 3/130 (2%) in 3,481ms
[2024-07-29 22:24:34.268] INFO NvdApiDataSource - Completed processing batch 4/130 (3%) in 3,445ms
[2024-07-29 22:24:34.268] INFO NvdApiDataSource - Completed processing batch 5/130 (4%) in 1,382ms
...

And closes with:

[2024-07-29 22:24:34.508] INFO KnownExploitedDataSource - Updating CISA Known Exploited Vulnerability list: https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json
[2024-07-29 22:24:34.929] INFO CveDB - Begin database defrag
[2024-07-29 22:24:37.901] INFO CveDB - End database defrag (2975 ms)
[2024-07-29 22:24:37.903] INFO Engine - Check for updates complete (169649 ms)

I haven't verified yet, but it seems like nvd-clojure is just using existing logging from depedency-check.

Proposal

Let's look at our options for providing more feedback here and do something about it. Perhaps using dependency-check logging is the best way, but it's probably worth a quick glance to see if there are other ways.

For example, the status of applying the updates is on the noisy side for something that happens so quickly.

Next steps

I am happy to discuss further, and if/when we agree, I can create a PR.

lread commented 1 month ago

Oh yeah, it is just simple logging from dependency-check.

There are other interesting/important things dependency-check logs, so turning on info-level logging would make sense, at least for certain packages. (Clj-watson currently disables all dependency-check logging via org.slf4j/slf4j-nop

seancorfield commented 1 month ago

Sounds like a reasonable approach to enable logging (although it's going to make it real chatty!).

Do you think a way to set the logging level should be provided, so folks can reduce the chattiness if they don't care?

lread commented 1 month ago

Thanks Sean,

Do you think a way to set the logging level should be provided, so folks can reduce the chattiness if they don't care?

Yeah, we should probably think about this and allow for this while adding in our logging configuration defaults. But we probably don't need to document how to change logging config until folks ask for it?

A bit annoying that the chatty bit (Completed processing batch 1/130 (1%) in 3,202ms) is in the same namespace and also at info log level. But personally, I'd rather this kind of noise than nothing here.

lread commented 1 month ago

If we were super interested in entirely controlling logging to our logging whims during this phase, we could re-implement dependency-check's processApi in clj-watson. It doesn't look like rocket science. But also, another thing to maintain and compare against changes in dependency-check, which is not attractive.

lread commented 1 month ago

And to be clear, my examples above were for a complete db download. Here's a complete example of an update:

[2024-08-01 00:30:30.040] INFO Engine - Checking for updates
[2024-08-01 00:30:33.018] INFO NvdApiDataSource - NVD API has 201 records in this update
[2024-08-01 00:30:33.183] INFO NvdApiDataSource - Downloaded 201/201 (100%)
[2024-08-01 00:30:33.496] INFO NvdApiDataSource - Completed processing batch 1/1 (100%) in 365ms
[2024-08-01 00:30:33.616] INFO KnownExploitedDataSource - Updating CISA Known Exploited Vulnerability list: https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json
[2024-08-01 00:30:34.132] INFO CveDB - Begin database defrag
[2024-08-01 00:30:37.923] INFO CveDB - End database defrag (3794 ms)
[2024-08-01 00:30:37.925] INFO Engine - Check for updates complete (7884 ms)

Which I personally do not find overly noisy.

seancorfield commented 1 month ago

Is this closed by #73 as well, or is there more to do? The level of logging shown above in your recent comment looks good to me.

lread commented 1 month ago

I think we can close, @seancorfield. We'll likely come back to logging for other reasons, but this issue is, in my opinion, addressed.

seancorfield commented 1 month ago

Thanks.