Closed D1StrX closed 2 weeks ago
Thanks for filing this issue, @D1StrX.
And why does Netbox-worker need access to Netbox-media?
That's a good point. Would removing this mount resolve the issue you're facing?
Wondering this as well. I'd think you'd still have problems if you're running multiple nb replicas and they end up on different nodes.
As long as we don't use scriptsPersistence and reportsPersistence ... but this wouldn't resolve the main issue. When scaling up the replicas this would indeed create the same issue. A couple of solutions/directions I can think of:
FileStore, which is way too expensive for what we're trying to achieve.
Netbox-worker wouldn't resolve the issue, since multiple replicas reintroduce the problem.
Makes sense.
Then what's blocking the use of the proper ReadWriteMany access mode?
That would be the exact use case for this.
GKE doesn't support RWX at all. And trying to get this to work is not succeeding: https://github.com/netbox-community/netbox-chart/issues/394
As far as I understand, it does, just not when using Compute Engine disks. I might be wrong, but if so, do you have any reference?
Several reference points:
Error I am getting: failed to provision volume with StorageClass "<storageclass>": rpc error: code = InvalidArgument desc = VolumeCapabilities is invalid: specified multi writer with mount access type
In Google Cloud Platform, the default storage class uses gce-persistent disk as the provisioner. However gce-persistent disk does not allow RWX mode. By default, gcePersistentDisk volume only permits readonly for multiple consumers.
https://www.googlecloudcommunity.com/gc/Google-Kubernetes-Engine-GKE/pod-failed-to-use-pvc-with-standard-rwx-storageclass/m-p/796156 Solution is going straight for FileStore.
I honestly don't know what can be done in this repository for this case.
The fact that GKE doesn't support ReadWriteMany volumes in some contexts is quite outside our expertise/ability to fix.
And NetBox is not even special in that case, as an app with a database backend. All the others I've checked define the exact same behavior.
I'm not even sure I see the use case where ReadWriteOnce, or even ReadOnlyMany, won't cover the needs.
The "active" replication should be left to the database only; I'm not sure NetBox would be very suitable for full active replicas across nodes.
Databases are not relevant in this context. "External" databases use StatefulSets, where each pod has its own PVC/PV. If only the Netbox container (not Worker or Housekeeping) attached to the Media, Scripts and Reports PVCs, the issue would be fixed. When you want to run Netbox HA, use external data sources like Git or S3 for Media, Scripts and Reports.
Please give version 5.0.0-beta.137 (or above) a try.
A new option has been added to allow read-only volume mounts (housekeeping.readOnlyPersistence, worker.readOnlyPersistence).
I believe ReadOnlyMany should then be an adequate option.
Tested, and unfortunately this again requires more work on GKE, since ReadOnlyMany isn't 100% supported.
https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/readonlymany-disks#create-rom-pv
I am going to close this issue, since there seems to be no easy/good way to solve this in the Cloud (GKE). The only suggestion I can offer is to consider whether Housekeeping and Worker actually need to attach to the three optional PVCs, or if they can be removed in the Chart, leaving only the Netbox container attached. It's better to have the functionality than 2 Netbox replicas IMHO.
The Helm chart version
5.0.0-beta.112
Environment Versions
Custom chart values
Current Behavior & Steps to Reproduce
Regarding the storage behavior mentioned in https://github.com/netbox-community/netbox-chart/issues/357, I took a deeper look into the issue. Even RWO is an issue for us when you run a cluster with multiple K8s Worker nodes. The PVC, Netbox and Netbox-worker must reside on the same Worker node, otherwise you get Multi-Attach error for volume <pvc> Volume is already used by pod(s) netbox-worker-xxx. RWX isn't available on GKE, because pd.csi.storage.gke.io doesn't support it.
And why does Netbox-worker need access to Netbox-media?
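To make the Multi-Attach reasoning above concrete, here is a minimal Python sketch of how the three Kubernetes access modes constrain pod placement. It is only a model for illustration, not chart or Kubernetes code; the function name, pod names and node names are hypothetical.

```python
from enum import Enum


class AccessMode(Enum):
    RWO = "ReadWriteOnce"  # read-write, but only from a single node
    ROX = "ReadOnlyMany"   # read-only, from many nodes
    RWX = "ReadWriteMany"  # read-write, from many nodes


def multi_attach_conflict(mode, pod_nodes):
    """Return True if pods sharing one PVC span more nodes than the mode allows.

    pod_nodes maps a pod name to the node it is scheduled on
    (hypothetical names, for illustration only).
    """
    nodes = set(pod_nodes.values())
    if mode is AccessMode.RWO:
        # RWO tolerates several pods only while they share one node.
        return len(nodes) > 1
    # ROX and RWX both permit attachment from multiple nodes.
    return False


same_node = {"netbox-0": "node-a", "netbox-worker-0": "node-a"}
split_nodes = {"netbox-0": "node-a", "netbox-worker-0": "node-b"}

print(multi_attach_conflict(AccessMode.RWO, same_node))    # False
print(multi_attach_conflict(AccessMode.RWO, split_nodes))  # True
print(multi_attach_conflict(AccessMode.RWX, split_nodes))  # False
```

The second case corresponds to the error reported here: Netbox and Netbox-worker land on different nodes while sharing an RWO volume. The sketch ignores whether a given storage driver actually offers ROX or RWX, which is the GKE-specific part of the problem.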
Expected Behavior
An alternative or perhaps improved documentation.
NetBox Logs
No response