rdlopes / nifi-open-nlp

A set of NiFi processors to implement NLP flows using Apache OpenNLP
https://rdlopes.github.io/nifi-open-nlp/

Trained models live in memory #1

Closed rdlopes closed 4 years ago

rdlopes commented 5 years ago

At the moment, when a processor is added to the NiFi workflow, its trained models are loaded and evaluated in memory. The more processors you add, the more memory the JVM requires.

Currently, I've raised the memory limits of NiFi in nifi-local-data/conf/bootstrap.conf so the showcase can fit inside the Dockerized NiFi.
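For reference, the JVM heap settings live in the `java.arg.*` properties of bootstrap.conf; an excerpt of the relevant lines (the values shown here are illustrative, not the ones actually committed):

```properties
# nifi-local-data/conf/bootstrap.conf (excerpt)
# JVM memory settings -- raised so the in-memory models fit
java.arg.2=-Xms2g
java.arg.3=-Xmx2g
```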

This cannot be a long-term architecture.

rdlopes commented 5 years ago

So, I'm thinking about a lifecycle that could work:

  1. Processor is added to the workflow.
  2. Validation checks that all parameters are correct and tries to train the model from them.
  3. If the processor validates, the trained model is stored under $NIFI_HOME/model-store, named after the processor identifier, something like <uuid>.bin.
  4. When scheduled, the processor loads the trained model stored under its identifier in the model store.
  5. When triggered, it evaluates the flow file content against the trained model, then unloads the model until the next flow file.
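The lifecycle above can be sketched as a small, dependency-free class. This is a hedged illustration only: the class and method names (`ModelLifecycleSketch`, `store`, `load`, `evaluate`) are hypothetical and stand in for the real NiFi processor hooks (validation, @OnScheduled, onTrigger), and the "evaluation" is a stub rather than an actual OpenNLP call.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Illustrative sketch of the proposed lifecycle: the trained model is
// persisted to $NIFI_HOME/model-store/<uuid>.bin, loaded when the
// processor is scheduled, evaluated on trigger, then unloaded so it
// does not stay resident in memory between flow files.
public class ModelLifecycleSketch {
    private final Path modelStore;   // e.g. $NIFI_HOME/model-store
    private final UUID processorId;  // processor identifier
    private byte[] loadedModel;      // null while unloaded

    public ModelLifecycleSketch(Path modelStore, UUID processorId) {
        this.modelStore = modelStore;
        this.processorId = processorId;
    }

    // Step 3: after validation trains the model, persist it as <uuid>.bin
    public Path store(byte[] trainedModel) throws IOException {
        Files.createDirectories(modelStore);
        Path target = modelStore.resolve(processorId + ".bin");
        Files.write(target, trainedModel);
        return target;
    }

    // Step 4: when scheduled, load the model from the model store
    public void load() throws IOException {
        loadedModel = Files.readAllBytes(modelStore.resolve(processorId + ".bin"));
    }

    // Step 5: when triggered, evaluate the flow file content, then unload
    public int evaluate(String flowFileContent) {
        if (loadedModel == null) throw new IllegalStateException("model not loaded");
        // Stand-in for the real OpenNLP evaluation of the content
        int result = flowFileContent.length() + loadedModel.length;
        loadedModel = null; // unload until the next flow file
        return result;
    }

    public boolean isLoaded() {
        return loadedModel != null;
    }

    public static void main(String[] args) throws IOException {
        Path store = Files.createTempDirectory("model-store");
        ModelLifecycleSketch p = new ModelLifecycleSketch(store, UUID.randomUUID());
        p.store(new byte[]{1, 2, 3});
        p.load();
        System.out.println(p.evaluate("hello")); // content length 5 + model size 3 = 8
        System.out.println(p.isLoaded());        // false: model was unloaded
    }
}
```

The point of the sketch is the memory profile: the model only occupies heap between `load()` and the end of `evaluate()`, instead of for the whole lifetime of the processor.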