Chaffelson / nipyapi

A convenient Python wrapper for Apache NiFi

Removing parameters from parameter contexts. #216

Closed chris762 closed 3 years ago

chris762 commented 4 years ago

Description

We use the ParameterContextsApi to update our parameter values when deploying from cluster to cluster. We wanted to add a cleanup step to our pipeline that removes old/unused parameters from the parameter contexts attached to the PGs being deployed. The logic just looks for any parameters which have "referencing_components" = []. I was told by Mark Payne today that in order to delete a key/value programmatically from a parameter context, I would need to set the parameter value to an explicit null. That didn't seem like a big deal, because that is exactly how we update the parameters when deploying, but I noticed something which seems to be related to Python's None/null compatibility. I am not entirely sure how the request is submitted or received, so I thought I would ask you first.

What I Did

Consider this function:

import nipyapi
from nipyapi.nifi import ParameterContextsApi

def remove_unused_parameters(pg_id):
    """ REMOVES PARAMETERS WHICH DON'T HAVE ANY REFERENCING COMPONENTS """
    print("Looking for unused parameters to remove")
    deployment_server_pg = nipyapi.canvas.list_all_process_groups(pg_id=pg_id)
    # Narrow the parameter context IDs to a distinct list, being used by the PG being deployed.
    deployed_pg_parameter_contexts = list(dict.fromkeys([i.component.parameter_context.id for i in deployment_server_pg if i.component.parameter_context is not None]))
    param = ParameterContextsApi()
    # Loop over all the potential parameter contexts
    for ids in deployed_pg_parameter_contexts:
        parameter_body = param.get_parameter_context(id=ids)
        for parameters in parameter_body.component.parameters:
            # Look for any parameter which doesn't have a referencing component (Not being used)
            if not parameters.parameter.referencing_components:
                print(f"Removing unused parameter -> {parameters.parameter.name}")
                # Set the component value to None (null)
                parameters.parameter.value = None
        # Submit each parameter body to the update service.
        param.update_parameter_context(id=ids, body=parameter_body)

The function accepts the ID of a given process group being deployed and:

  1. Looks for distinct parameter contexts being used by it or any of its child process groups.
  2. Using the IDs found for each parameter context, calls the "get_parameter_context" method and then does some simple logic on the resulting parameter_body.
  3. In the case above, its primary goal is to look for parameters which have referencing_components = [].
  4. If any are found, it sets parameter.value = None.
  5. It then submits the entire parameter_body back to the update_parameter_context method to update things.
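Steps 1, 3 and 4 above can be checked without a NiFi instance. The following is a minimal sketch on plain dictionaries rather than nipyapi objects; the function names, dictionary keys and sample data are all hypothetical and exist only for illustration.

```python
def distinct_context_ids(process_groups):
    """Return parameter context IDs in first-seen order, skipping PGs without one."""
    ids = [
        pg["parameter_context_id"]
        for pg in process_groups
        if pg.get("parameter_context_id") is not None
    ]
    # dict.fromkeys de-duplicates while preserving insertion order
    return list(dict.fromkeys(ids))


def mark_unused_for_removal(parameters):
    """Set value to None on every parameter with no referencing components."""
    marked = []
    for param in parameters:
        if not param["referencing_components"]:
            param["value"] = None
            marked.append(param["name"])
    return marked


# Hypothetical sample data standing in for the API responses
process_groups = [
    {"name": "root", "parameter_context_id": "ctx-1"},
    {"name": "child-a", "parameter_context_id": None},
    {"name": "child-b", "parameter_context_id": "ctx-1"},
    {"name": "child-c", "parameter_context_id": "ctx-2"},
]
parameters = [
    {"name": "db_url", "value": "jdbc:h2:mem:test", "referencing_components": ["proc-1"]},
    {"name": "old_flag", "value": "true", "referencing_components": []},
]

context_ids = distinct_context_ids(process_groups)
marked = mark_unused_for_removal(parameters)
```

Only the selection logic is exercised here; whether the server then removes the marked parameters is the open question in this issue.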

This is exactly how we modify the sensitive and non-sensitive values when we are deploying things around, and it has thus far worked perfectly. The problem I see here is when we submit the None type. I can see that the parameter_body correctly saves the None type in the dict within the parameter_body.component.parameters array, and it does successfully submit the entire object to update_parameter_context. The issue is that nothing ever gets removed. I get an object back from the service which has the same values that were previously set. If I change that None to anything else that is string based (such as "Chris"), it works just fine. My uneducated guess is that something is getting mixed up between the None and the null value the NiFi API is expecting to see (but that could be totally wrong).
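To make the None/null guess concrete, the sketch below shows the difference between a JSON body that carries an explicit null and one where None-valued fields were dropped before serialization. It is purely illustrative: drop_none_fields is a hypothetical helper, and nothing in this thread confirms which of the two forms nipyapi actually sends.

```python
import json


def drop_none_fields(obj):
    """Hypothetical serializer step: recursively omit dict keys whose value is None."""
    if isinstance(obj, dict):
        return {k: drop_none_fields(v) for k, v in obj.items() if v is not None}
    if isinstance(obj, list):
        return [drop_none_fields(v) for v in obj]
    return obj


# Hypothetical parameter entry with its value set to None
parameter = {"parameter": {"name": "old_param", "sensitive": False, "value": None}}

# json.dumps maps Python None to JSON null
explicit_null = json.dumps(parameter, sort_keys=True)
# After the hypothetical stripping step, the "value" key is absent entirely
omitted = json.dumps(drop_none_fields(parameter), sort_keys=True)

print(explicit_null)
print(omitted)
```

Logging the outgoing request body on the client side would show which of these two shapes reaches the server.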

An easy way to reproduce this would be to:

  1. Create a flow with at least one PG, with a parameter context attached to it.
  2. Create a few dummy parameters within that context, but don't assign them to any processors.
  3. Run the above function against the pg_id you have created. It will find the parameter context IDs attached to your pg_id and try to remove the unused parameters.
  4. You should notice that nothing changes.

Urgency

This is probably more of a "Does this make sense to you?" type of thing. It does seem like something is off, but I don't know that I can pinpoint it.

Thanks and please let me know if something doesn't make sense.

Thanks, Chris Lundeberg

Chaffelson commented 3 years ago

Hi @chris762, sorry I hadn't responded to this earlier; something is up with my notifications on the repo for new issues that I need to figure out. Today I took some ideas from your issue while implementing controls for Parameter Contexts on the next branch. If you look at parameters.py and the tests that go with it, I think you'll see how the delete should work. I included a convenience method which should help with your exact case here. Part of the issue you were having is probably related to using update_parameter_context, which doesn't work intuitively for Parameters that already exist; you instead want to use submit_parameter_context_update, as I have in nipyapi.parameters.update_parameter_context. Hopefully this simplifies your logic a bit. Please let me know if I've missed the point or you could use more convenience functions here :)

https://github.com/Chaffelson/nipyapi/blob/next/nipyapi/parameters.py
https://github.com/Chaffelson/nipyapi/blob/next/tests/test_parameters.py
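For the original cleanup use case, a minimal sketch of how such a convenience method might be used follows. It assumes the helper in nipyapi.parameters is delete_parameter_from_context(context, parameter_name) and that it returns the refreshed context; check both against the linked parameters.py. The delete function is passed in as an argument so the logic can be tried without a NiFi instance, and fake_delete and the SimpleNamespace objects are hypothetical stand-ins.

```python
from types import SimpleNamespace


def remove_unused_parameters(context, delete_parameter):
    """Delete every parameter in `context` that has no referencing components.

    `delete_parameter(context, name)` is expected to return the updated context.
    """
    # Collect the names first: the delete call may modify the context object
    names = [
        entry.parameter.name
        for entry in context.component.parameters
        if not entry.parameter.referencing_components
    ]
    for name in names:
        # Carry the returned context forward so each call sees the latest state
        context = delete_parameter(context, name)
    return context, names


def _entry(name, refs):
    """Build a stand-in for a parameter entity (illustration only)."""
    return SimpleNamespace(
        parameter=SimpleNamespace(name=name, referencing_components=refs)
    )


def fake_delete(context, name):
    """Stand-in for the real helper: drops the named parameter locally."""
    context.component.parameters = [
        e for e in context.component.parameters if e.parameter.name != name
    ]
    return context


demo_context = SimpleNamespace(
    component=SimpleNamespace(
        parameters=[
            _entry("db_url", ["proc-1"]),
            _entry("old_flag", []),
            _entry("old_host", []),
        ]
    )
)

demo_context, removed = remove_unused_parameters(demo_context, fake_delete)
remaining = [e.parameter.name for e in demo_context.component.parameters]
```

Against a live cluster, and under the assumption above, nipyapi.parameters.delete_parameter_from_context would be passed as delete_parameter in place of fake_delete.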

chris762 commented 3 years ago

Hi @Chaffelson - Thanks very much for getting back to me and making this enhancement. This looks very good and I will plan to test it out in the coming weeks. Thanks again and I will let you know if I hit any snags!

Chaffelson commented 3 years ago

Should be fixed in 0.16.0