atc0005 / go-lockss

Go-based tooling for monitoring and troubleshooting LOCKSS nodes.
MIT License
2 stars 1 forks source link

XPATH expression used to determine peer nodes doesn't take into account potential use of 'LOCKSS_TEST_GROUP' setting #21

Closed atc0005 closed 4 years ago

atc0005 commented 4 years ago

This affects v0.1.1 and earlier.

When the current XPATH expression is used to determine peer nodes, it matches more sections than intended. This isn't a problem when there is only one id.initialV3PeerList block in the configuration/properties XML file, but for at least one LOCKSS network there are multiple id.initialV3PeerList blocks, each wrapped with tests like this one (headers and "rules" removed):

<lockss-config>
    <property name="org.lockss">
        <if>
            <or>
                <test group="abc" />
                <test group="def" />
                <test group="prod" />
                <test group="hij" />
            </or>
            <then>
                <property name="id.initialV3PeerList">
                    <list>
                        <value>TCP:[1.2.3.4]:9729</value>
                        <value>TCP:[5.6.7.8]:9729</value>
                    </list>
                </property>
            </then>
        </if>
        <if>
            <or>
                <test hostname="test1.example.org"/>
                <test hostname="test2.example.org"/>
                <test hostname="test3.example.org"/>
                <test hostname="test4.example.org"/>
            </or>
            <then>
                <property name="id.initialV3PeerList">
                    <list>
                        <value>TCP:[4.3.2.1]:9729</value>
                        <value>TCP:[8.7.6.5]:9729</value>
                    </list>
                </property>
            </then>
        </if>
    </property>
</lockss-config>

The current logic expects this:

<lockss-config>
    <property name="org.lockss">
                <property name="id.initialV3PeerList">
                    <list>
                        <value>TCP:[1.2.3.4]:9729</value>
                        <value>TCP:[5.6.7.8]:9729</value>
                    </list>
                </property>
    </property>
</lockss-config>

Instead of getting just two peers as intended, the current version of the code in this repo returns all four peers.

atc0005 commented 4 years ago

Glancing over this doc, I can already tell where I went wrong:

https://dh.obdurodon.org/introduction-xpath.xhtml#axes

A slash (/) normally indicates a step in a path expression, telling the system to look for whatever follows with reference to the current context. This means that, for example, paragraph/quote means find all of the elements that are children of the current context and then (slash = new step in the path) all of the elements that are children of each of those elements. A slash at the very beginning of a path expression, though, has a special meaning: it means start at the document node, at the top of the tree. Thus, /paragraph means find all of the elements that are immediate children of the document node, a query that will succeed only if the root element of the document (the one that contains all other elements) happens to be a (and therefore immediately under the document node).

A double slash (//) is shorthand for the descendant axis, so that chapter//quote would first find all of the elements that are children of the current context and then find all of the elements anywhere within them, at any depth (children, children’s children, etc.). When used at the beginning of a path expression, e.g., //paragraph//quote, the double slash means that the path starts from the document node, at the top of the tree, and looks on the descendant axis. The preceding XPath expression therefore means starting from the document node, find all descendant elements (= all elements anywhere in the document), and then find all elements anywhere inside those elements. This is one way to find all elements anywhere inside elements at any depth, while ignoring elements that are not inside elements.

From the second quoted paragraph above:

A double slash (//) is shorthand for the descendant axis, so that chapter//quote would first find all of the elements that are children of the current context and then find all of the elements anywhere within them, at any depth (children, children’s children, etc.).

Current code:

https://github.com/atc0005/go-lockss/blob/39516afe7cd7e911517124df5872b1b6a0206c2c/internal/lockss/peers.go#L74

I evidently realized that having a double-slash between list and value didn't work, but got "working" results from using double-slashes (likely based off of a thin tutorial) and didn't pursue further.

atc0005 commented 4 years ago

Quick test:

index e7335a5..c685cb8 100644
--- a/internal/lockss/peers.go
+++ b/internal/lockss/peers.go
@@ -71,7 +71,7 @@ func (l IDInitialV3Peers) List() ([]V3Peer, error) {
                return nil, fmt.Errorf("error compiling regex: %w", regExCompileErr)
        }

-       xpathExpression := "//property[@name='id.initialV3PeerList']//list/value"
+       xpathExpression := "//property[@name='org.lockss']/property[@name='id.initialV3PeerList']/list/value"

        xmlqueryNodes, err := xmlquery.QueryAll(l.xmlDoc, xpathExpression)
        if err != nil {

This matches the explicit path, requiring that the <property name="org.lockss"> element directly precede <property name="id.initialV3PeerList">. This works as before on two of our LOCKSS network nodes, while failing to retrieve any results from a third node which uses a config file with the format described in the OP. This is mostly as intended for a quick hotfix.

It's probably worth merging that change in now as it "fails fast" for a configuration that the application is not yet updated to handle properly, but yet still works as before for networks that do not use a group-based peers list.

atc0005 commented 4 years ago

All of these seem to work:

/lockss-config/property[@name='org.lockss']/if/then[preceding-sibling::or/test[@group='prod']]/property[@name='id.initialV3PeerList']/list/value

/lockss-config/property[@name='org.lockss']/if/or/test[@group='prod']/../../then/property[@name='id.initialV3PeerList']/list/value

//then[preceding-sibling::or/test[@group='prod']]/property[@name='id.initialV3PeerList']/list/value
atc0005 commented 4 years ago

At this point I think I've figured out the next steps:

There is probably a more natural way to handle this using XPath, but I'm not familiar enough with it at this time to use a different approach.