stevespringett / nist-data-mirror

A simple Java command-line utility to mirror the CVE JSON data from NIST.
Apache License 2.0

Download and uncompression are slower than expected. #115

Closed: trandersen-ufst closed this issue 2 years ago

trandersen-ufst commented 2 years ago

Downloading and uncompressing from scratch takes almost 3 minutes on my machine. Initial profiling showed that InputStreams and OutputStreams are not buffered.

trandersen-ufst commented 2 years ago

I am creating a pull request from my fork, where I cannot create issues. I was expecting to be able to create a properly named branch after creating this issue. Apparently this is not how it is done. I will submit a proper pull request in a few days.

trandersen-ufst commented 2 years ago

"master" as of now takes 50 minutes on my Windows 10 machine. Initial experiments with appropriate buffering give a run time of 2-3 minutes.
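For readers unfamiliar with the problem, here is a minimal sketch of what "appropriate buffering" can look like when uncompressing a .gz file in Java. It is not the code from the pull request: the class name `BufferedGunzip`, the method names, and the 64 KiB buffer size are all assumptions made for illustration. The idea is simply to wrap the raw file streams in `BufferedInputStream` and `BufferedOutputStream` and to copy in large chunks rather than byte by byte.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of nist-data-mirror.
final class BufferedGunzip {

    // Assumed buffer size; tune for your disk and JVM.
    private static final int BUFFER_SIZE = 64 * 1024;

    private BufferedGunzip() {
    }

    /** Decompresses the gzip file at source into target; returns bytes written. */
    static long gunzip(Path source, Path target) throws IOException {
        try (InputStream in = new GZIPInputStream(
                     new BufferedInputStream(Files.newInputStream(source), BUFFER_SIZE),
                     BUFFER_SIZE);
             OutputStream out = new BufferedOutputStream(
                     Files.newOutputStream(target), BUFFER_SIZE)) {
            return copy(in, out);
        }
    }

    /** Copies in to out in BUFFER_SIZE chunks instead of one byte at a time. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] chunk = new byte[BUFFER_SIZE];
        long total = 0;
        int read;
        while ((read = in.read(chunk)) != -1) {
            out.write(chunk, 0, read);
            total += read;
        }
        return total;
    }
}
```

A call such as `BufferedGunzip.gunzip(Paths.get("feed.json.gz"), Paths.get("feed.json"))` (file names are placeholders) would then replace an unbuffered read/write loop. How much this helps depends on the disk and operating system, so measure before and after.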

trandersen-ufst commented 2 years ago

Got down to 44 seconds. Pull request at https://github.com/stevespringett/nist-data-mirror/pull/116

sellersj commented 2 years ago

I noticed a performance issue too. After moving from one VM to another, running the mirror went from about 5 minutes to 45 minutes. The disk must be configured differently, and the slowdown seems to be mostly due to the lack of I/O buffering.

Thanks @trandersen-ufst for your PR. I came here to submit the same thing.

trandersen-ufst commented 2 years ago

@sellersj You're welcome. Now, a month later, we have ended up not using this. Instead, we save the dataDirectory of a normal dependency-check:check run in a SNAPSHOT Maven artifact, which can then be unpacked when needed for "almost-offline" usage on our CI/CD build server. As the download+update step is then decoupled from the actual check, we expect this to be very robust.

sellersj commented 2 years ago

PR https://github.com/stevespringett/nist-data-mirror/pull/116 has been merged, so I think this is resolved. I tested it with the 1.6.0 release and it was much faster.

stevespringett commented 2 years ago

Fantastic! Thanks for testing.