stackabletech / nifi-operator

A kubernetes operator for Apache NiFi

Make repository-sizes dynamic based on pvc sizes #354

Closed sbernauer closed 1 year ago

sbernauer commented 2 years ago

While improving some demos, I noticed that my NiFi instance suddenly stopped working.

The reason is that the provenance repository ran out of space. It only has 5Gi, but the operator hard-codes nifi.provenance.repository.max.storage.size=10 GB. IMHO something like the following would make sense:

nifi.provenance.repository.max.storage.size=<set to pvc size. If it causes problems maybe pvc-size - 500MB or so>
nifi.provenance.repository.max.storage.time=<unlimited>
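A minimal sketch of the proposal above, in Python for illustration only (it is not the operator's code). The helper names `quantity_to_bytes` and `provenance_max_storage_line` are hypothetical, the 500 MB headroom is the rough figure suggested above, and the sketch assumes NiFi reads `MB` as 1024 * 1024 bytes.

```python
import re

# Binary suffixes used by Kubernetes quantities such as "5Gi".
_BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}


def quantity_to_bytes(quantity: str) -> int:
    """Convert a quantity like "5Gi" to bytes.

    Only integer values with a Ki/Mi/Gi/Ti suffix are handled in this sketch.
    """
    match = re.fullmatch(r"(\d+)(Ki|Mi|Gi|Ti)", quantity.strip())
    if match is None:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    return int(match.group(1)) * _BINARY_SUFFIXES[match.group(2)]


def provenance_max_storage_line(pvc_capacity: str, headroom_mb: int = 500) -> str:
    """Render the property line from the PVC capacity minus some headroom.

    Assumption: one NiFi "MB" equals 1024 * 1024 bytes.
    """
    capacity_mb = quantity_to_bytes(pvc_capacity) // 1024**2
    limit_mb = capacity_mb - headroom_mb
    if limit_mb <= 0:
        raise ValueError(f"PVC capacity {pvc_capacity!r} is smaller than the headroom")
    return f"nifi.provenance.repository.max.storage.size={limit_mb} MB"


print(provenance_max_storage_line("5Gi"))
# prints: nifi.provenance.repository.max.storage.size=4620 MB
```

With the 5Gi provenance PVC from this issue, the limit ends up below the volume size instead of at the hard-coded 10 GB.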

DoD

Details

This is the current config

    nodes:
      config:
        resources:
          cpu:
            max: "4"
            min: 500m
          memory:
            limit: 6Gi
          storage:
            contentRepo:
              capacity: 10Gi
            databaseRepo:
              capacity: 5Gi
            flowfileRepo:
              capacity: 5Gi
            provenanceRepo:
              capacity: 5Gi
            stateRepo:
              capacity: 5Gi

These are the PVC usages

kubectl df-pv | grep -P "(NAME|nifi)"
 PV NAME                                   PVC NAME                                   NAMESPACE  NODE NAME           POD NAME                      VOLUME MOUNT NAME      SIZE   USED   AVAILABLE  %USED  IUSED  IFREE   %IUSED 
 pvc-8641ca89-ed87-4c1d-b954-6ba7a16651b0  content-repository-nifi-node-default-0     default    default-5n5cbfoy4o  nifi-node-default-0           content-repository     9Gi    5Gi    4Gi        52.00  5588   649772  0.85   
 pvc-df2633f6-efac-45d0-9517-a6a4cd85011d  flowfile-repository-nifi-node-default-0    default    default-5n5cbfoy4o  nifi-node-default-0           flowfile-repository    4Gi    12Mi   4Gi        0.25   15     327665  0.00   
 pvc-21329c7f-87f6-4ace-a621-6bb05e6f21cf  provenance-repository-nifi-node-default-0  default    default-5n5cbfoy4o  nifi-node-default-0           provenance-repository  4Gi    4Gi    272Mi      94.51  916    326764  0.28   
 pvc-2d585950-23dd-4f72-bd09-30266b3182e8  state-repository-nifi-node-default-0       default    default-5n5cbfoy4o  nifi-node-default-0           state-repository       4Gi    156Ki  4Gi        0.00   45     327635  0.01   
 pvc-a9f14349-b88f-4450-b41b-0ab902a14881  database-repository-nifi-node-default-0    default    default-5n5cbfoy4o  nifi-node-default-0           database-repository    4Gi    168Ki  4Gi        0.00   15     327665  0.00

Config

cat nifi.properties 
nifi.administrative.yield.duration=30 sec
nifi.authorizer.configuration.file=/stackable/nifi/conf/authorizers.xml
nifi.cluster.flow.election.max.candidates=
nifi.cluster.flow.election.max.wait.time=1 mins
nifi.cluster.is.node=true
nifi.cluster.node.address=nifi-node-default-0.nifi-node-default.default.svc.cluster.local
nifi.cluster.node.protocol.port=9088
nifi.cluster.protocol.is.secure=true
nifi.components.status.repository.buffer.size=1440
nifi.components.status.repository.implementation=org.apache.nifi.controller.status.history.VolatileComponentStatusRepository
nifi.components.status.snapshot.frequency=1 min
nifi.content.claim.max.appendable.size=1 MB
nifi.content.repository.always.sync=false
nifi.content.repository.archive.enabled=true
nifi.content.repository.archive.max.retention.period=7 days
nifi.content.repository.archive.max.usage.percentage=50%
nifi.content.repository.directory.default=/stackable/data/content
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
nifi.content.viewer.url=../nifi-content-viewer/
nifi.database.directory=/stackable/data/database
nifi.documentation.working.directory=./work/docs/components
nifi.flow.configuration.archive.dir=/stackable/nifi/conf/archive/
nifi.flow.configuration.archive.enabled=true
nifi.flow.configuration.archive.max.count=
nifi.flow.configuration.archive.max.storage=500 MB
nifi.flow.configuration.archive.max.time=30 days
nifi.flow.configuration.file=/stackable/data/database/flow.xml.gz
nifi.flowcontroller.autoResumeState=true
nifi.flowcontroller.graceful.shutdown.period=10 sec
nifi.flowfile.repository.always.sync=false
nifi.flowfile.repository.checkpoint.interval=20 secs
nifi.flowfile.repository.directory=/stackable/data/flowfile
nifi.flowfile.repository.implementation=org.apache.nifi.controller.repository.WriteAheadFlowFileRepository
nifi.flowfile.repository.retain.orphaned.flowfiles=true
nifi.flowfile.repository.wal.implementation=org.apache.nifi.wali.SequentialAccessWriteAheadLog
nifi.flowservice.writedelay.interval=500 ms
nifi.h2.url.append=;LOCK_TIMEOUT=25000;WRITE_DELAY=0;AUTO_SERVER=FALSE
nifi.login.identity.provider.configuration.file=/stackable/nifi/conf/login-identity-providers.xml
nifi.nar.library.autoload.directory=./extensions
nifi.nar.library.directory=./lib
nifi.nar.working.directory=./work/nar/
nifi.provenance.repository.always.sync=false
nifi.provenance.repository.buffer.size=100000
nifi.provenance.repository.compress.on.rollover=true
nifi.provenance.repository.concurrent.merge.threads=2
nifi.provenance.repository.directory.default=/stackable/data/provenance
nifi.provenance.repository.implementation=org.apache.nifi.provenance.WriteAheadProvenanceRepository
nifi.provenance.repository.index.shard.size=500 MB
nifi.provenance.repository.index.threads=2
nifi.provenance.repository.indexed.attributes=
nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, ProcessorID, Relationship
nifi.provenance.repository.max.attribute.length=65536
nifi.provenance.repository.max.storage.size=10 GB
nifi.provenance.repository.max.storage.time=30 days
nifi.provenance.repository.query.threads=2
nifi.provenance.repository.rollover.size=100 MB
nifi.provenance.repository.rollover.time=10 mins
nifi.queue.swap.threshold=20000
nifi.security.allow.anonymous.authentication=false
nifi.security.keystore=/stackable/keystore/keystore.p12
nifi.security.keystorePasswd=secret
nifi.security.keystoreType=PKCS12
nifi.security.truststore=/stackable/keystore/truststore.p12
nifi.security.truststorePasswd=secret
nifi.security.truststoreType=PKCS12
nifi.security.user.authorizer=authorizer
nifi.security.user.login.identity.provider=login-identity-provider
nifi.sensitive.props.algorithm=NIFI_ARGON2_AES_GCM_256
nifi.sensitive.props.key=jfk6AnWozMfkQlJ
nifi.sensitive.props.key.protected=
nifi.state.management.configuration.file=./conf/state-management.xml
nifi.state.management.embedded.zookeeper.start=false
nifi.state.management.provider.cluster=zk-provider
nifi.state.management.provider.local=local-provider
nifi.status.repository.questdb.persist.component.days=3
nifi.status.repository.questdb.persist.location=./status_repository
nifi.status.repository.questdb.persist.node.days=14
nifi.swap.manager.implementation=org.apache.nifi.controller.FileSystemSwapManager
nifi.templates.directory=./conf/templates
nifi.ui.autorefresh.interval=30 sec
nifi.ui.banner.text=
nifi.web.https.host=nifi-node-default-0.nifi-node-default.default.svc.cluster.local
nifi.web.https.network.interface.default=
nifi.web.https.port=8443
nifi.web.jetty.threads=200
nifi.web.jetty.working.directory=./work/jetty
nifi.web.max.header.size=16 KB
nifi.web.proxy.context.path=
nifi.web.proxy.host=85.215.194.27:30671,default-5n5cbfoy4o:30671,85.215.233.125:30671,default-6cljbrq2at:30671,85.215.194.120:30671,default-k4qezqd2uq:30671,85.215.160.5:30671,default-u6rswpvdas:30671,nifi.default.svc.cluster.local
nifi.zookeeper.connect.string=zookeeper-server-default-0.zookeeper-server-default.default.svc.cluster.local:2282
nifi.zookeeper.root.node=/znode-4fb22e61-13bc-46cc-8b24-bbdd865742dc
nightkr commented 1 year ago

I agree that focusing on size retention makes more sense for us since PVCs are isolated based on that anyway. I think for now it'd probably be ok to let time retention be configured through config overrides if it is actually desired by the user?

nightkr commented 1 year ago

Cross-referenced which repository types would need to be configured this way, but I think otherwise the original ticket did a pretty good job.

lfrancke commented 1 year ago

I'm fine with only having size based retention but the description is still a bit too unclear for me. This will require a CRD change, right? I think what you're suggesting is that the PVC size is split up between all repositories automatically or should the user select a percentage of size?

e.g. flow: 50%, archive: 30%, provenance: 20% ?

sbernauer commented 1 year ago

Currently this is a bug, as nifi.provenance.repository.max.storage.size=10 GB is hard-coded. The demo used a PVC size of 5Gi for provenance and bad things happened. Each repository gets its own PVC. No CRD change is needed IMHO. We only remove the hard-coded nifi.provenance.repository.max.storage.size=10 GB setting and put in the actual PVC size.

sbernauer commented 1 year ago

The

e.g. flow: 50%, archive: 30%, provenance: 20% ?

part is decided by the user. They need to specify the PVC sizes for every repository individually.

lfrancke commented 1 year ago

My comments were based on a misunderstanding of the proposal. I thought that the CRD snippet above is the proposed new one and didn't realize that this already exists.

Time based retention is out of scope for now. Please make sure to leave a safety buffer of at least 100MB (which means we also need to validate that a storage PVC is at least 101MB in size :)
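The buffer rule and the size validation it implies could look roughly like the following Python sketch. The function name `usable_repository_bytes` is hypothetical, and "100MB" is read here as 100 * 1024 * 1024 bytes, which is an assumption; the operator may pick a different unit or a larger buffer.

```python
# Assumption: "100MB" is interpreted as 100 MiB.
SAFETY_BUFFER_BYTES = 100 * 1024**2


def usable_repository_bytes(pvc_bytes: int, buffer_bytes: int = SAFETY_BUFFER_BYTES) -> int:
    """Return how many bytes a repository may use after reserving the buffer.

    Raises ValueError when the PVC is not strictly larger than the buffer,
    which is the validation mentioned in the comment above.
    """
    if pvc_bytes <= buffer_bytes:
        raise ValueError(
            f"PVC of {pvc_bytes} bytes must be larger than the {buffer_bytes} byte safety buffer"
        )
    return pvc_bytes - buffer_bytes


# A 5Gi PVC leaves 5020 MiB for the repository.
print(usable_repository_bytes(5 * 1024**3) // 1024**2)
# prints: 5020
```

Rejecting PVCs that are too small up front avoids generating a zero or negative storage limit.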

lfrancke commented 1 year ago

Almost none of the checkboxes are ticked, neither in the PR nor here. Can you make sure that everything is done?

sbernauer commented 1 year ago

Checked the implementation and ticked the boxes.