whitesource-ps / ws-nexus-integration

WhiteSource Nexus integration tool
Apache License 2.0

[FR] [ws-nexus-integration] Clean up workspace after scan is complete #38

Closed danielnbalasoiu closed 2 years ago

danielnbalasoiu commented 2 years ago

Is your feature request related to a problem? Please describe.
The current implementation works as expected: it scans Nexus container images from the configured repository. The main problem is that when you scan a large Nexus repository, every container image downloaded by the WhiteSource UA stays on the server where the scan runs. This can consume a lot of disk space and even fill the disk entirely.

Describe the solution you'd like
The UA should clean up container images after they have been scanned, as https://github.com/whitesource/unified-agent-distribution does by default.

Describe alternatives you've considered
A workaround would be to configure a cron job that removes all containers, but you might end up removing containers that are used by another application.

Additional context

danielnbalasoiu commented 2 years ago

Hi! Can you please give me an estimate of when I can test a possible fix? Thanks!

rammatzkvosky commented 2 years ago

Hi @danielnbalasoiu, this feature is part of our roadmap. We have started examining it, along with the FR from issue 39.

Hopefully, I will have additional details next week.

rammatzkvosky commented 2 years ago

Hi @danielnbalasoiu, what about the following approach (which can also cover issue 39)?

For each image listed from each repo:

  1. Pull the image (this sample is on Docker Hub, but the principle is the same):
    nexus_integration: Pulling Docker image: localhost:8082/alpine:3.14
  2. Tag the image with IMAGE_NAME_tag (this step is needed because of the way the scanning agent currently includes images):
C:\tmp\ws-ws-nexus>docker images
REPOSITORY                           TAG        IMAGE ID       CREATED        SIZE  
localhost:8082/alpine                3.14       a33ac4f1069a   5 days ago     5.59MB

C:\tmp\ws-ws-nexus>docker tag a33ac4f1069a localhost:8082/alpine_3.14:3.14

C:\tmp\ws-ws-nexus>docker images
REPOSITORY                           TAG        IMAGE ID       CREATED        SIZE  
localhost:8082/alpine                3.14       a33ac4f1069a   5 days ago     5.59MB
localhost:8082/alpine_3.14           3.14       a33ac4f1069a   5 days ago     5.59MB
  3. Scan the _tag image.

  4. Force remove the _tag image and the pulled image (unless that image existed prior to the pull from the Nexus repo).

  5. Rename* the project in the WhiteSource UI: from localhost:8082/alpine_3.14 3.14 (a33ac4f1069a) to localhost:8082/alpine 3.14 (a33ac4f1069a)

*The rename will only work if that name is not already allocated to a project (scanned image) under the same product.
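The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the integration's actual code: the names `scan_image_workflow`, `alias_for`, the injectable `run` callable and the `scan` callback are hypothetical, and the image reference is assumed to always carry an explicit tag. Only the `docker pull`, `docker tag`, `docker images -q` and `docker rmi -f` commands are real Docker CLI calls. The project rename in the WhiteSource UI (the last step) is left out.

```python
import subprocess
from typing import Callable, List


def run_cmd(args: List[str]) -> str:
    """Run a command and return its stdout (raises on a non-zero exit code)."""
    return subprocess.run(args, check=True, capture_output=True, text=True).stdout


def alias_for(image: str) -> str:
    """Build the temporary IMAGE_NAME_tag alias.

    Assumes the reference ends with an explicit ':tag'.
    """
    name, _, tag = image.rpartition(":")
    return f"{name}_{tag}:{tag}"


def scan_image_workflow(
    image: str,
    scan: Callable[[str], None],            # hypothetical scan callback
    run: Callable[[List[str]], str] = run_cmd,
) -> None:
    # Remember whether the image was already present before we pull it.
    existed_before = bool(run(["docker", "images", "-q", image]).strip())
    alias = alias_for(image)

    run(["docker", "pull", image])          # pull the image
    run(["docker", "tag", image, alias])    # tag it as IMAGE_NAME_tag
    try:
        scan(alias)                         # scan the _tag image
    finally:
        # Always drop the temporary alias; drop the pulled image only
        # if it was not on this machine before the pull.
        run(["docker", "rmi", "-f", alias])
        if not existed_before:
            run(["docker", "rmi", "-f", image])
```

Passing `run` as a parameter keeps the sketch easy to dry-run: a fake `run` that records its arguments shows the exact Docker commands without touching the local daemon.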

danielnbalasoiu commented 2 years ago

Hi!

I'm sorry, I missed the notifications. I'll try this workaround tomorrow and I'll get back to you.

danielnbalasoiu commented 2 years ago

Hi @rammatzkvosky ,

It's not clear to me how I can use ws-nexus-integration only to pull container images from the Nexus repository (step 1). 🤔 I'm guessing you used the WhiteSource Unified Agent to do this, right?

rammatzkvosky commented 2 years ago

Hi @danielnbalasoiu, sorry for not being clear. The suggested steps are for us to implement in ws-nexus-integration.

My intention was to get your feedback on this approach (especially steps 2-5).

danielnbalasoiu commented 2 years ago

Hi @rammatzkvosky ,

I understand, but it's not clear to me how I can use ws-nexus-integration to pull or scan only specific containers.

I'm using the following config

params.config
```
[Nexus Settings]
NexusBaseUrl=nexus.host:18080
NexusAuthToken=
NexusUser=whitesource
NexusPassword=*****
NexusRepositories=ax-docker
NexusAltDockerRegistryAddress=nexus.host:8085

[WhiteSource Settings]
WSApiKey=*****
WSUserKey=*****
WSProductName=*****
WSCheckPolicies=False
WSUrl=https://saas-eu.whitesourcesoftware.com
WSLang=

[General Settings]
ThreadCount=1
WorkDir=
JavaBin=
```
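For readers unfamiliar with this INI layout, here is a minimal sketch of how such a file could be read with Python's standard `configparser`. It is not taken from ws-nexus-integration: the `load_settings` helper and the returned dictionary keys are hypothetical, and treating `NexusRepositories` as a comma-separated list is an assumption.

```python
import configparser

# Trimmed-down copy of the config above, used only for illustration.
SAMPLE = """\
[Nexus Settings]
NexusBaseUrl=nexus.host:18080
NexusRepositories=ax-docker
NexusAltDockerRegistryAddress=nexus.host:8085

[WhiteSource Settings]
WSCheckPolicies=False

[General Settings]
ThreadCount=1
"""


def load_settings(text: str) -> dict:
    # interpolation=None keeps literal '%' characters (e.g. in passwords) intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve the CamelCase option names
    parser.read_string(text)

    nexus = parser["Nexus Settings"]
    return {
        "base_url": nexus["NexusBaseUrl"],
        # Assumption: several repositories would be separated by commas.
        "repositories": [r.strip() for r in nexus["NexusRepositories"].split(",") if r.strip()],
        "registry": nexus.get("NexusAltDockerRegistryAddress", ""),
        "check_policies": parser.getboolean("WhiteSource Settings", "WSCheckPolicies"),
        "thread_count": parser.getint("General Settings", "ThreadCount"),
    }


settings = load_settings(SAMPLE)
```

With this config, the only selector for what gets pulled is the repository list; there is no per-image option.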

and run ws-nexus-integration with the following command:

DEBUG=1 ws_nexus_integration

This lists the container images from the Nexus Docker Registry, then pulls and scans them. I'm not aware of any option to only pull or only scan a specific container image. Can you clarify this, please?

danielnbalasoiu commented 2 years ago

ping @rammatzkvosky

rammatzkvosky commented 2 years ago

Hi @danielnbalasoiu, the integration doesn't support scanning specific containers. It only supports pulling based on repository (Nexus Docker Registry) names (NexusRepositories).

The sample I provided was an alpine:3.14 image that I manually pulled from Docker Hub and pushed to my Nexus Docker registry, to be pulled and scanned later by ws-nexus-integration.

danielnbalasoiu commented 2 years ago

Hi @rammatzkvosky

I see. This makes sense, but in my case the Docker registry in the Nexus repository has thousands of container images, so pulling, tagging, and scanning all of them (see issue 39) is not an option.

If the feature from issue 39 were available, I could write a script that does the cleanup after the first batch is scanned, but with the current implementation I haven't found a workaround. Can you please take a look at my feature request?

Thank you!

rammatzkvosky commented 2 years ago

Hi @danielnbalasoiu ,

Following up on my previous comment: we will implement the cleanup so that each image is removed once its scan has completed (as long as the image did not already exist in the local environment, to avoid removing containers used by other applications).

I will close this issue and continue it as part of issue #39.
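One way to picture this rule is a snapshot taken before the run: anything already present locally is left alone, and only images that appeared during the scan are removed. The sketch below is an assumption about how this could look, not the implemented behavior; `local_images`, `remove_if_new` and the injectable `run` callable are hypothetical names, while `docker images --format` and `docker rmi -f` are real Docker CLI commands.

```python
import subprocess
from typing import Callable, List, Set


def run_cmd(args: List[str]) -> str:
    """Run a command and return its stdout (raises on a non-zero exit code)."""
    return subprocess.run(args, check=True, capture_output=True, text=True).stdout


def local_images(run: Callable[[List[str]], str] = run_cmd) -> Set[str]:
    """Return the 'repository:tag' names currently present on this machine."""
    out = run(["docker", "images", "--format", "{{.Repository}}:{{.Tag}}"])
    return {line.strip() for line in out.splitlines() if line.strip()}


def remove_if_new(
    image: str,
    preexisting: Set[str],
    run: Callable[[List[str]], str] = run_cmd,
) -> bool:
    """Remove a scanned image unless it was in the pre-scan snapshot.

    Returns True if the image was removed, False if it was kept.
    """
    if image in preexisting:
        return False  # something else on this host may depend on it
    run(["docker", "rmi", "-f", image])
    return True
```

In use, `local_images()` would be called once before the first pull, and `remove_if_new(image, snapshot)` after each scan completes.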