puppetlabs / puppetserver-helm-chart

The Helm Chart for Puppet Server

Is it mandatory that var-dir is shared between masters and compilers? #220

Open cpiment opened 3 months ago

cpiment commented 3 months ago

Use Case

I'm trying to deploy a multi-master, multi-compiler Puppet environment. To achieve that, some PVCs must be shared between masters and compilers. The shared paths are:

By default, these paths are created in different PVCs, and those PVCs are shared between masters and compilers, so for them to work correctly they must be deployed on a RWX PV. In my environment, the only storage option for RWX PVs is very slow when handling many small files, so I run into problems when I try to share the vardir directory between many nodes, because every pod copies the vendored-jruby directory into the PVC during startup.

Is it really necessary for that directory to be shared between instances? Could the vardir directory be a local directory inside every pod, not shared between masters and compilers?

Describe the Solution You Would Like

Make the vardir directory local to each instance instead of shared.

Describe Alternatives You've Considered

I'm currently testing Puppet with a single master and a RWO PVC to avoid this issue.

Additional Context

N/A

Xtigyro commented 3 months ago

@cpiment That hasn't been tested. Could you please test it yourself and come back with the results?

cpiment commented 3 months ago

I'll try to test it and get back to you with the results, but when the chart was developed, extracting the vardir directory to a PVC was surely a deliberate decision, wasn't it? Are there any files in vardir that must be shared between Puppet Server instances? I have searched the documentation on multi-master Puppet installations and I cannot find anything regarding that directory.

ldaneliukas commented 3 months ago

I have a similar, albeit bigger, issue: all my volumes have to be RWO, rendering any multi-master/compiler setup unusable because of the shared PVs. I'm planning to look into different options for running the chart in such scenarios, but that'll be a little while later.

I'm not sure why one would need to share the var-dir between the compilers. I'm currently running a Docker-host-based setup with multiple compilers, and these servers share nothing between them; we just ensure that all environments are synced immediately on all compilers whenever code changes are pushed to Git in the control repository. I do see why the environment directory would be shared (if sync is slower, e.g. using r10k without webhooks), but not var-dir.

All in all, I believe the only thing that must be shared is the CA directory of the master/CA server. Everything else should be able to run independently. I'll test this out with the chart, but as I said, that'll be later on, as I'm short on time right now.

If you do test this, I'm eager to see the results.

cpiment commented 3 months ago

I have been testing a new value in .Values.puppetserver.persistence.data named enabled, with a default value of true to keep the current behaviour. I changed it to false and modified all the references to puppet-serverdata-storage to look like this:

        {{- if eq .Values.puppetserver.persistence.data.enabled true }}
        - name: puppet-serverdata-storage
          persistentVolumeClaim:
            claimName: {{ template "puppetserver.persistence.data.claimName" . }}
        {{- end }}

That way, the volumes are only created when enabled is true. I have deployed a multi-master setup after this change and everything seems to be working fine, although I will keep monitoring it over the next few days.
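
The matching volumeMount in the puppetserver container presumably needs the same guard, since a mount that references a removed volume would make the pod spec invalid. A sketch along the same lines:

        {{- if eq .Values.puppetserver.persistence.data.enabled true }}
        # Mount the shared data PVC only when persistence is enabled;
        # otherwise the pod keeps vardir on its own local storage.
        - name: puppet-serverdata-storage
          mountPath: /opt/puppetlabs/server/data/puppetserver/
        {{- end }}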

Xtigyro commented 3 months ago

Sounds about right, thank you, @cpiment!

ldaneliukas commented 3 months ago

Great, but what about the other volumes? IMHO, if this is optional for one of them, it should be optional for all volumes that can support not being shared. Then we might think of something more global, i.e. enabling/disabling PV usage for the master/compiler in general instead of selecting individual volumes, as sketched below. I know there are various K8s distributions that don't rely on public cloud offerings and might not necessarily have PVs and/or RWX PV support. WDYT?
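
For example, a hypothetical values.yaml layout for such a global switch (the key names here are purely illustrative, not existing chart values):

    puppetserver:
      persistence:
        # Hypothetical global switch: when false, no PVCs are created at all
        # and every directory lives on ephemeral pod-local storage
        enabled: true
        # Per-volume overrides for the volumes that can run unshared
        data:
          enabled: true
        confd:
          enabled: true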

cpiment commented 3 months ago

Yeah, sure, I was using the vardir storage as a sample, but I think we have to decide which data must be stored in a PV and which not. For example:

  • (code) /etc/puppetlabs/code/

    • Must be persisted:

    • Case A: r10k as sidecar -> must use a PV to share data between the sidecar container and the puppetserver (master/compiler) container

    • Case B: r10k as standalone -> must use a PV to share data between the r10k pod and the puppetserver (master/compiler) pods

  • (puppet) /etc/puppetlabs/puppet/

    • Must be persisted? I think so, because there is certificate data that cannot be regenerated at every run
  • (vardir) /opt/puppetlabs/server/data/puppetserver/

    • Must be persisted? This is the change I have made and I'm still testing it, but it looks good so far.
  • (confd) /etc/puppetlabs/puppetserver/conf.d/

    • Must be persisted? I don't think so, because the entrypoint scripts try to regenerate the contents of this directory at startup time.
  • (ca) /etc/puppetlabs/puppetserver/ca

    • Must be persisted in a PV to avoid losing certificates on a restart. In a multi-master setup, it has to be persisted and shared between master nodes (RWX access)

What do you think? Any comments? I think persistence can be disabled only for vardir and confd

ldaneliukas commented 3 months ago

Yeah, sure, I was using the vardir storage as a sample, but I think we have to decide which data must be stored in a PV and which not. For example:

  • (code) /etc/puppetlabs/code/

    • Must be persisted:

    • Case A: r10k as sidecar -> must use a PV to share data between the sidecar container and the puppetserver (master/compiler) container

    • Case B: r10k as standalone -> must use a PV to share data between the r10k pod and the puppetserver (master/compiler) pods

  • (puppet) /etc/puppetlabs/puppet/

    • Must be persisted? I think so, because there is certificate data that cannot be regenerated at every run
  • (vardir) /opt/puppetlabs/server/data/puppetserver/

    • Must be persisted? This is the change I have made and I'm still testing it, but it looks good so far.
  • (confd) /etc/puppetlabs/puppetserver/conf.d/

    • Must be persisted? I don't think so, because the entrypoint scripts try to regenerate the contents of this directory at startup time.
  • (ca) /etc/puppetlabs/puppetserver/ca

    • Must be persisted in a PV to avoid losing certificates on a restart. In a multi-master setup, it has to be persisted and shared between master nodes (RWX access)

What do you think? Any comments? I think persistence can be disabled only for vardir and confd

For vardir and confd - Agreed, I don't think there is any need for these to be persisted or shared between pods.

For code - Agreed, this needs persistence and RWX, as it is shared between pods to keep the Puppet environments in sync, at least with the current r10k implementation that the chart offers. IMHO we should still work on offering a webhook-based approach that lets GitLab/GitHub/etc. notify a sidecar container in every required pod to deploy the data individually, but that's another topic.

For puppet - As per the docs, the /etc/puppetlabs/puppet/ssl directory contains the certificates that were signed for the machine in question, so it should be persisted. But this should be RWO, since each master/compiler has unique certs of its own?

For ca - Yes, all of the certs signed by the CA are here. IMHO this should be RWO in a single-master setup and RWX in multi-master.
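
To make the access-mode point concrete, here is an illustrative PVC for the ca directory in a multi-master setup (the name and size are placeholders, not what the chart renders):

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: puppetserver-ca-claim   # placeholder name
    spec:
      accessModes:
        # RWX so every master can mount it; ReadWriteOnce (RWO)
        # would suffice in a single-master setup
        - ReadWriteMany
      resources:
        requests:
          storage: 100Mi            # placeholder size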

Xtigyro commented 3 months ago

@cpiment @ldaneliukas Sounds about right, yes.

I've tried hard to make the chart as good as possible, but at some point I simply didn't have enough free hours each week to keep doing so.

Thank you both!

@cpiment Sounds about right that you should be an official maintainer. ;-)

cpiment commented 3 months ago

Do you agree, then, that we should create these two parameters:

.Values.puppetserver.persistence.data.enabled
.Values.puppetserver.persistence.confd.enabled

They would default to true to keep the current chart behaviour, but when changed to false they would disable the use of persistent volumes for:

/opt/puppetlabs/server/data/puppetserver/
/etc/puppetlabs/puppetserver/conf.d/

If you are OK with that approach, I can submit a PR with these changes in a few days' time.
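
For reference, the resulting defaults in values.yaml would look roughly like this (a sketch; both flags default to true to preserve today's behaviour):

    puppetserver:
      persistence:
        data:
          # false = keep /opt/puppetlabs/server/data/puppetserver/ on
          # pod-local storage instead of a PVC
          enabled: true
        confd:
          # false = let the entrypoint regenerate
          # /etc/puppetlabs/puppetserver/conf.d/ at startup instead of persisting it
          enabled: true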

Xtigyro commented 3 months ago

@cpiment Absolutely!