Cisco-Talos / cvdupdate

ClamAV Private Database Mirror Updater Tool
Apache License 2.0
93 stars 35 forks source link

adjust Dockerfile & entrypoint and add Dependabot, CI & docker release #51

Open monotek opened 1 year ago

monotek commented 1 year ago

This PR adjusts the Dockerfile & adds Dependabot, CI and Docker release.

Our usecase of this image is to run a database mirror inside a kubernetes cluster.

monotek commented 1 year ago

@micahsnyder

would be nice this could get a review and some feedback :)

micahsnyder commented 1 year ago

Hi @monotek thanks for the pull-request, and thanks for the reminder. Sorry about the delay. I will try to review it next week.

monotek commented 1 year ago

@micahsnyder

do you had any chance to look into it? can i support you in any way?

monotek commented 1 year ago

Friendly ping :)

micahsnyder commented 1 year ago

@monotek I'm sorry for dropping this and failing to notice the first major issue with this PR. There is an existing work in progress to add a Dockerfile to the project: https://github.com/Cisco-Talos/cvdupdate/pull/47

I intend to try to help finish cleanup on PR #47 and get it merged. After that, if you can base your work on that completed effort to extend the features you've added such as dependabot, CI, docker hub publishing - I think that would be best.

One thing to note is that your Dockerfile runs cvd update once, and then aims to serve a private mirror with the cvd serve feature. Meanwhile PR #47's Dockerfile aims to run cvd update on a schedule and does not offer the cvd serve feature.

I think running cvd update regularly makes the most sense for most people. Using cvd serve to host the mirror is less common, and others may have better/different webservers than the shoddy one used by cvd serve. And some people may not wish to host with a webserver at all.

My recommendation when updating this PR is to make it so the cvd serve feature is optional and is default-off. Perhaps it could be enabled by setting an environment variable before starting the container.

monotek commented 1 year ago

On best practice for docker containers is to be immutable. This means a container should not change it contents while running. Most environments nowdays also enforcing read only filesystems in containers so download of new virus definitions might fail anyway. The corrrect way to use new definitions would be to build a new container.

micahsnyder commented 1 year ago

And how often would you make a new container? The whole point of cvdupdate is to only update if an update is needed. If you're making a new container from scratch any time you want the latest definitions, you're misusing cvdupdate and evading the system we created intended to reduce data costs.

monotek commented 1 year ago

You could still do it by using a writable volume in a container ;-)

I would still prefer to rebuild my container (at least on a daily basis). This will also update all container dependencies like container image OS, all packages and libs. As every build get it's own version i also always know wich exact version of the container is running in my cluster.

Such builds / pushes to the registry could be done automatically. This is supported by Github actions via: https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#schedule

micahsnyder commented 1 year ago

You could still do it by using a writable volume in a container ;-)

Yup! So that's how https://github.com/Cisco-Talos/cvdupdate/pull/47 does it. 👍

I would still prefer to rebuild my container (at least on a daily basis). This will also update all container dependencies like container image OS, all packages and libs. As every build get it's own version i also always know wich exact version of the container is running in my cluster.

Such builds / pushes to the registry could be done automatically. This is supported by Github actions via: https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#schedule

Rebuilding the image daily certainly makes it so you don't have to worry much about CVE's in the base image. That would be nice. 😄
Although ideally we would only have to push up a new image if there were actually new versions of the dependencies though. Best not to waste Docker Hub's bandwidth if there is no real difference in the image.

micahsnyder commented 1 year ago

Ok @monotek merged #47.

If you're still interested in working on this, now is a good time to update your PR with the volume mount model in mind.

monotek commented 1 year ago

Thanks for the hint. I'll add some changes later this or next week.

At least the user creation in the entrypoint is wrong and should be moved to the dockerfile itself, as user creation will fail anyway on readonly fs...

ismvru commented 1 year ago

At least the user creation in the entrypoint is wrong and should be moved to the dockerfile itself

Hello! I agree that this is not correct, I just tried to make the UserID dynamic so that I could adapt to any infrastructure. I would be happy to see the corrected Dockerfile and gain some more experience.

monotek commented 1 year ago

I've adjusted the container (see updated pr description above).

Can be tested via:

Cron mode:

docker run -it --rm monotek/cvdupdate:0.0.2

Serve mode:

docker run -it --net host --rm monotek/cvdupdate:0.0.2 serve
curl http://localhost:8000

The release actions docker repo has still to be adjusted. Just tell me where it should be pushed...

monotek commented 1 year ago

@ismvru & @micahsnyder did you have time to have a look?

monotek commented 1 year ago

i updated the pr to use ghcr.io, as no extra github secrets are needed for this.

monotek commented 1 year ago

friendly ping :)

uudecode commented 6 months ago

tbh, a little problem exists with file "/root/.cvdupdate/state.json" it changes every update, and create new docker layer for this changes. Better to move state.json into mounted directory.

monotek commented 6 months ago

You mean "/cvdupdate/.cvdupdate/state.json"? Is this not wanted as it contains the "last modified" & "last checked" fields with the timestamps? If we would remove the file the information about the state of the clamav database in the container would be lost.

uudecode commented 6 months ago

You mean "/cvdupdate/.cvdupdate/state.json"? Is this not wanted as it contains the "last modified" & "last checked" fields with the timestamps? If we would remove the file the information about the state of the clamav database in the container would be lost.

Yes, mea culpa, "/cvdupdate/.cvdupdate/state.json" will be mounted from host system, when run in docker, so no problem, sorry.