skupperproject / skupper-router

An application-layer router for Skupper networks
https://skupper.io
Apache License 2.0
14 stars 18 forks source link

Top down management QUERY retrieves invalid data for Router management entity #1205

Closed kgiusti closed 1 month ago

kgiusti commented 11 months ago

The io.skupper.router.router management object used to be implemented in the python code. It has been moved to C (see src/router_core/agent_router.c). However the old python RouterEntity is still registered. Turns out if you issue a management query against the short form "router" object name (rather than the proper 'io.skupper.router.router' name), the query is handled by the old python RouterEntity. The response body is incorrect and reports invalid values. No management error is generated.

Remove RouterEntity from python/skupper_router_internal/management/agent.py and nuke the src/router_agent.c file from orbit.

It's the only way to be sure...

kgiusti commented 11 months ago

I've looked into this a bit and there's a bigger issue at play.

In the current implementation most - but not all - management data is handled in the python library. Some of the management entities, including the io.skupper.router.router entity, are accessed via the C code.

When a management query for type io.skupper.router.router is made the query is handled in the C code and the python library is bypassed. This returns a complete and accurate router record.

However if a QUERY is made for the entire schema it is handled exclusively by the python library. This code does not properly fetch the io.skupper.router.router values.

The reason is due to the implementation: when a QUERY message arrives at the router it is interrogated to determine the target type. If the query is for the type io.skupper.router.router it is handled by the core router code in src/router_core/agent_router.c.

However when a top-level query is made the QUERY message does not contain a type value since the QUERY is for all types. Since the router does not see the io.skupper.router.router type in the QUERY message, the message is passed into the python library for handling. And the python library doesn't handle the router record correctly.

The python code needs to be able to get the correct router data. This data is managed by the core thread so there are synchronization issues that need to be dealt with.

ted-ross commented 11 months ago

Does any of the Skupper tooling use the top-down QUERY? Is this a candidate for removal?

kgiusti commented 11 months ago

This issue is a carry over from the Qpid Dispatch project: https://issues.apache.org/jira/browse/DISPATCH-933

kgiusti commented 11 months ago

@ted-ross From a quick review I don't see a use of the top-down query aside from some CI tests.

skstat does not use it. skmanage can do it if no --type is given (which as you point out we can remove).

If the control plane doesn't do such a query I think it makes sense to consider this lo-priority and remove the top-down query support at some point.

Another approach is given in the above dispatch jira, but that looks like it would break compatibility.