Closed rishabh208gupta closed 1 year ago
Hi @rishabh208gupta - the expression should work if "mscNumber" is defined in the NiFi variable registry. It will not be evaluated against FlowFile attributes, per the documentation (available via "View Usage" on PutMarkLogic):
The reason for this is that the transform object is built once when PutMarkLogic is started, and thus it can only use the variables in the NiFi registry. It's not possible to then modify it for each FlowFile.
Do you have a need for values to be sent to the transform for each FlowFile?
Hi @rjrudin, sorry, my bad, I get it now. My requirement is to send a value from the FlowFile attributes, so a different value for each FlowFile. Is there anything that can be done for this? Thanks.
Could you add those attributes to the body of the FlowFile instead? For example, if the FlowFiles you're sending to PutMarkLogic contain JSON which is then written as a JSON document to MarkLogic, you could use an ExecuteScript processor to add the attribute values as new keys to your JSON document. Your REST transform would then have access to that data and could do whatever it needs to with it.
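As a rough illustration of the suggestion above, the merge step could look like the following. This is only a sketch of the logic in plain Python, not code wired into NiFi's ExecuteScript session API; the function name `add_attributes_to_json` and the sample attribute names are hypothetical.

```python
import json


def add_attributes_to_json(body_text, attributes, keys):
    """Copy selected FlowFile attribute values into a JSON document body.

    body_text  -- the FlowFile content as a JSON string (assumed to be an object)
    attributes -- dict of FlowFile attribute names to values
    keys       -- attribute names to copy into the document
    """
    doc = json.loads(body_text)
    for key in keys:
        # Only copy attributes that are actually present on the FlowFile.
        if key in attributes:
            doc[key] = attributes[key]
    return json.dumps(doc)


# Hypothetical FlowFile content and attributes.
merged = add_attributes_to_json(
    '{"title": "example"}',
    {"mscNumber": "12345", "filename": "example.json"},
    ["mscNumber"],
)
print(merged)
```

The REST transform would then read `mscNumber` from the document body rather than from a transform parameter. This only applies when the FlowFile content is JSON.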
The FlowFile is a binary file; it could be .mp4, .png, .pdf, etc.
Got it - and are you looking to include metadata about the binary file that either gets persisted as document properties or metadata or as a separate document?
The requirement is to pass a parameter, which is different for each file, to the transform module. In the transform module we use that parameter in a separate document that we create and ingest. It's like a log of the metadata of the incoming FlowFile, and this metadata document needs to have the parameter as one of its fields.
Could you use a meta: or property: attribute to stash the document-specific value as a metadata key or a document property to achieve the same effect? It at least allows you to get that value to MarkLogic. I don't think the REST transform gives you access to the metadata for a document, but a technique I've used in the past is a pre-commit trigger, which will have access to all parts of the URI. At that point, you could write a little bit of trigger code to fetch the document metadata keys or properties from the URI (the binary you're writing) and use them to insert another document.
Hi @rjrudin, we wouldn't want the performance implications and the additional complexity of including a pre-commit trigger. Is there any reason why trans: was made only to read variables from the variable registry and not from expression language? Wouldn't it be better if it was made to read from both?
The issue is that PutMarkLogic uses MarkLogic's WriteBatcher component for writing batches of documents in multiple threads, and WriteBatcher requires a ServerTransform to be configured on it before it begins. That transform is then applied to all documents in all batches. It's a good fit for when the same transform (with the same parameters) is intended to be applied against all documents, but that doesn't meet your requirements.
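The constraint described above can be pictured with a toy model. This is not the MarkLogic Java Client API; `ToyBatcher`, the `registry` dict, and the transform name are hypothetical stand-ins meant only to show why a transform fixed at start-up cannot see per-FlowFile attributes.

```python
class ToyBatcher:
    """Toy stand-in for a batch writer; NOT the real WriteBatcher API."""

    def __init__(self, transform_name, transform_params):
        # The transform is fixed once, before any document is added.
        self.transform = (transform_name, dict(transform_params))
        self.written = []

    def add(self, uri, attributes):
        # Per-document attributes are never consulted here:
        # every document gets the same transform and parameters.
        self.written.append((uri, self.transform))


# Stand-in for the NiFi variable registry (assumption for illustration).
registry = {"mscNumber": "from-registry"}

# Parameters are resolved once at start-up, from the registry only.
batcher = ToyBatcher("my-transform", {"name": registry.get("mscNumber", "")})

batcher.add("/a.mp4", {"mscNumber": "111"})
batcher.add("/b.png", {"mscNumber": "222"})

for uri, (name, params) in batcher.written:
    print(uri, name, params)
```

In this model, a variable missing from the registry resolves to an empty string, which matches the behaviour reported in this issue.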
I am wondering if it might be simpler to just use NiFi's InvokeHttp processor to access MarkLogic's /v1/documents endpoint directly. The advantage with that is you can specify a transform and transform parameters specific to the document you're ingesting. The downside of not having the multi-threaded batch support of PutMarkLogic can likely be mitigated by configuring InvokeHttp itself to run with multiple threads.
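To make the per-document request concrete, here is a sketch of how such a URL could be assembled. It relies on the `uri`, `transform`, and `trans:{name}` query parameters of /v1/documents; the host, port, transform name, document URI, and the helper `build_document_url` are all assumptions for illustration.

```python
from urllib.parse import urlencode


def build_document_url(base_url, doc_uri, transform_name, transform_params):
    """Build a /v1/documents URL with a transform and per-document parameters."""
    query = [("uri", doc_uri), ("transform", transform_name)]
    for name, value in transform_params.items():
        # Each transform parameter is sent as trans:{name}={value}.
        query.append(("trans:" + name, value))
    # Keep ':' and '/' unescaped so the URL stays readable.
    return base_url.rstrip("/") + "/v1/documents?" + urlencode(query, safe=":/")


# Hypothetical values; in NiFi these would come from FlowFile attributes.
url = build_document_url(
    "http://localhost:8000",
    "/binaries/video.mp4",
    "my-transform",
    {"mscNumber": "12345"},
)
print(url)
```

In InvokeHttp, the same shape could presumably be written in the URL property using expression language, for example with `${mscNumber}` as the parameter value, so that each FlowFile's own attribute is sent. The binary content would then be sent as the body of a PUT request.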
Another approach you can take here, in addition to using InvokeHttp as mentioned above, is to use either CallRestExtensionMarkLogic or ExecuteScriptMarkLogic to insert a binary document per FlowFile, with FlowFile-specific parameters either being sent to the REST extension or incorporated in the script.
Unfortunately, because PutMarkLogic requires a single transform with a single set of transform parameters to be used, it's not possible to achieve your requirements with PutMarkLogic. As a result, I am going to close this ticket, but please reply back if you run into any issues with one of the 3 recommended approaches here.
trans:name, which can be added as a dynamic parameter to pass to the transform module in the PutMarkLogic processor, is not being evaluated properly: it is sent as an empty string if we specify an EL expression such as ${mscNumber}. If a hardcoded string is used instead, it is sent across properly.