How can a step implementor identify options that are expressions?

ndw commented 3 months ago

(I'm hoping that this issue is a rubber duck and I'll work it out as I write and then delete it... Alas, nope.)

Suppose that I want to implement an atomic step, ex:wipe, that has a match option. The semantics of the ex:wipe step are that it finds all the elements that are matched and deletes all of the attributes on those elements. How do I declare this step? I think this is the best I can do:

<p:declare-step type="ex:wipe">
  <p:input port="source"/>
  <p:output port="result"/>
  <p:option name="match" as="xs:string"/>
</p:declare-step>

But now consider this step in a pipeline:

...
<ex:wipe match="*[contains-token(@class, $option)]"/>

If we just naively interpret the match option as a string, then we won't know that it contained a variable reference. Consequently, the implementation may not have made the value of the $option variable (which might have been constructed by an expression elsewhere in the pipeline) available to this step.

In the standard step library, we use a special attribute to declare this intent. For example, in p:add-attribute:

<p:declare-step type="p:add-attribute">
  <p:input port="source" content-types="xml html"/>
  <p:output port="result" content-types="xml html"/>
  <p:option name="match" as="xs:string" select="'/*'" e:type="XSLTSelectionPattern"/>
  <p:option name="attribute-name" required="true" as="xs:QName"/>
  <p:option name="attribute-value" required="true" as="xs:string"/>
</p:declare-step>

But we don't make that mechanism available in any standard way to other step authors.

It seems unreasonable to say that all steps must have access to all in-scope variables at all times just in case there might be some aspect of the implementation that wants to use one. I'd certainly be concerned if we said that all steps can see all in-scope variables. Consider this pipeline:

<p:declare-step ...>
<p:option name="opt1"/>

<ex:some-step/>

I think it would be very, very unfortunate if the run-time behavior of ex:some-step could depend on the value of $opt1, given that there's nothing about the step that suggests it reads the option.

There is also this monstrosity to consider:

<ex:wipe match="*[contains-token(@class, ${$interpolated-variable-name})]"/>

In this case, it's impossible to determine statically what variable might be referred to in the expression so I suppose it is necessary to make all in-scope names available to the step. I'm not sure what to do about that. I sure don't like it.

ndw commented 3 months ago

On further analysis, I think it just is what it is. If the expression is known statically, you can see what variables it refers to and only those need to be made available. If the expression is computed at runtime, you just have to make them all available.

That doesn't change the fact that the compiler needs to know which options are going to be evaluated as expressions at runtime.

xml-project commented 3 months ago

Could you elaborate on your point? The spec says that no in-scope variables are available if the XPath expression is evaluated by the step (there is no ruling for XSLTMatchPattern, but I guess it's the same). So my standard way to solve this was to construct the XPath inside the declared step and pass the needed variables as options to the step. What exactly are you proposing to change?
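
A minimal sketch of that workaround, assuming a hypothetical wrapper step ex:wipe-by-class and an example namespace URI (neither comes from the spec or this thread). The value arrives as an ordinary option, and the pattern is assembled inside the declared step, where $class is in scope, so the attribute value template is expanded before p:delete receives the pattern:

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:ex="http://example.com/ns/steps"
                type="ex:wipe-by-class" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Hypothetical option: the caller passes the value explicitly,
       so nothing is read from the caller's in-scope variables. -->
  <p:option name="class" as="xs:string" required="true"/>

  <!-- {$class} is expanded here, inside the declared step. p:delete only
       ever sees a plain pattern string with no variable reference in it.
       The pattern selects every attribute of the matching elements. -->
  <p:delete match="*[contains-token(@class, '{$class}')]/@*"/>
</p:declare-step>
```

A caller would then write something like <ex:wipe-by-class class="{$option}"/>. The sketch breaks if the value contains an apostrophe, because it is spliced into a string literal.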

ndw commented 3 months ago

Maybe I'm just confused. I wrote a toy example:

<p:add-attribute match="*[local-name(.) = $name]" .../>

And I thought it was a bug in my implementation that $name wasn't available when add-attribute was trying to evaluate the match pattern. But I guess that's just wrong and what I have to write is

<p:add-attribute match="*[local-name(.) = '{$name}']" .../>

So never mind.

(If I'd been right, then the compiler would have to know what options on each step need access to the in-scope variables. But since I was wrong and they don't, ignore me.)
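
For illustration, here is a complete toy pipeline using the attribute value template form. The variable value 'para' and the attribute name and value are made up for the example. Because the braces are expanded by the pipeline before the step runs, p:add-attribute receives the pattern *[local-name(.) = 'para'] and never needs to see $name:

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Illustrative value; in a real pipeline this could be computed. -->
  <p:variable name="name" select="'para'"/>

  <!-- The AVT {$name} is evaluated in the pipeline's scope, so the step
       itself gets a literal string with no variable reference. -->
  <p:add-attribute match="*[local-name(.) = '{$name}']"
                   attribute-name="role"
                   attribute-value="marked"/>
</p:declare-step>
```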

xproc / 3.0-specification

How can a step implementor identify options that are expressions? #1103