After investigating the use of NiFi to control the flow of the pipeline, these are the points I found:
It creates a dependency on Java. This will likely have to be the newest stable version available, currently Java 21.
Setting this up requires our code to be running smoothly beforehand. We can't really make use of NiFi until we are somewhat stable.
Changing and testing things when using NiFi seems to be time-intensive, since you need to clear any FlowFiles out of the system before you can modify anything.
Drag-and-drop coding is annoying and gives up control.
Documenting seems time-heavy, especially the configuration of each module, though that is probably the same as for any other kind of documentation. The NiFi GUI does have a note system.
Implementing and making use of the Python wrapper is not that easy. It will require a fair amount of code configuration.
When using the Python wrapper, a Java server is set up to run locally in parallel with the Python server part of NiFi. This means roughly double the resource usage.
We could potentially skip the Python wrapper and just use the module that allows non-Java scripts to run. The limitations of this approach are uncertain.
The Python wrapper does not provide the same degree of configurability as the native Java version. Whether this would affect us, and to what extent, is uncertain.
Monitoring is much easier, and NiFi provides a nice overview of the whole flow.
Bookkeeping of the assets' metadata will be easier to implement.
Transferring responsibility for using and updating the whole pipeline will come with a larger overhead, since you would have to learn NiFi, and it is an extensive framework.
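
To make the point about skipping the Python wrapper more concrete, here is a minimal sketch of what one of our steps could look like as a plain script. It assumes the module in question behaves like NiFi's ExecuteStreamCommand processor, which passes the FlowFile content to the command on stdin and takes the command's stdout as the new content. I have not verified this against our setup. The `transform` function and its cleanup logic are hypothetical placeholders, not actual pipeline code.

```python
import sys


def transform(text: str) -> str:
    """Hypothetical pipeline step: strip trailing whitespace and drop blank lines."""
    kept = [line.rstrip() for line in text.splitlines() if line.strip()]
    return "".join(line + "\n" for line in kept)


def main() -> None:
    # Assumption: the calling processor supplies the FlowFile content on stdin
    # and reads the transformed content back from stdout.
    sys.stdout.write(transform(sys.stdin.read()))


# Only process input when data is piped in, so the file can also be imported
# or run interactively without blocking on an empty terminal.
if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

The upside of this shape is that the step stays testable without NiFi: `transform` can be called directly, and the script can be run from a shell with a file piped in. The downside is that everything NiFi-specific (attributes, routing, properties) would have to be handled in the processor configuration rather than in the script.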
@bhsi-snm