Step for turning variables into documents

xatapult commented 3 weeks ago

I've run into the situation where I had to turn the value(s) of variables/options into documents.

The other way around (documents into variable values) is simple, since you can refer to the context item when assigning a value to a variable. Going from variable to document is clumsy: you have to resort to tricks such as identity steps with inline documents that use text value templates (TVTs).
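
For illustration, one such trick might look like the sketch below. It is not taken from a real pipeline: the option name `value` and its default `'a'` are assumptions, and it only covers a single string value, which is part of why the approach feels clumsy.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                version="3.0">

  <p:output port="result" content-types="any"/>

  <!-- Hypothetical option; stands in for whatever variable/option needs converting -->
  <p:option name="value" as="xs:string" select="'a'"/>

  <!-- The text value template {$value} is expanded inside the inline,
       so the step emits a text document containing the option's value -->
  <p:identity>
    <p:with-input>
      <p:inline content-type="text/plain">{$value}</p:inline>
    </p:with-input>
  </p:identity>

</p:declare-step>
```

Maps, arrays, and sequences of several items would each need a different inline (or a different trick), which is what a dedicated step would avoid.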

I would advocate for adding a p:value-document step (or whatever name). Something like:

<p:declare-step type="p:value-document">
  <p:output port="result" content-types="any" sequence="true"/>
  <p:option name="value" as="item()*" required="true"/>
</p:declare-step>

The output port emits the value of $value as document(s). What happens depends on the type of $value:

Element and document nodes: XML document
Attributes: error, or maybe a text document (attribute value stringified)
Comments, PIs: not sure
Text nodes, atomic values: text document (atomic values are stringified)
Maps, arrays: JSON document

When $value is a sequence with more than one entry, multiple documents are produced. When it's the empty sequence, no documents are produced.

ndw commented 3 weeks ago

Isn't that just

<p:identity>
  <p:with-input select="$value"/>
</p:identity>

xatapult commented 3 weeks ago

Nope. I don't remember exactly, but that doesn't always work. And when there is no document on the source port, nothing will happen.

ndw commented 3 weeks ago

Well, I'd like an example of where it doesn't work because I think it should.

With respect to "if there's no document on the source port, nothing happens", I don't believe that's a conformant result.

xatapult commented 3 weeks ago

Running this in Morgana:

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xmlns:xs="http://www.w3.org/2001/XMLSchema" version="3.0" exclude-inline-prefixes="#all">

  <p:output port="result" primary="true" sequence="true" content-types="any"/>

  <p:identity>
    <p:with-input select="'a'">
      <p:empty/>
    </p:with-input>
  </p:identity>

  <p:variable name="doccount" as="xs:integer" select="count(collection())" collection="true"/>
  <p:identity message="* Documents: {$doccount}"/>
  <p:if test="$doccount eq 1">
    <p:identity message="* content-type: {p:document-property(., 'content-type')}"/>
  </p:if>

</p:declare-step>

Returns zero documents.

When I put some dummy document on the p:identity source port, it does return a document: a JSON document when the select attribute yields an atomic value. Yes, I know you can work around that using p:inline. So maybe this idea is superfluous? Let's discuss (whenever).

I will open a new VNext issue for the "empty document" problem. I've had that before. See #53

ndw commented 3 weeks ago

I don't see any reason why it should matter if you provide a dummy input document to the p:identity step. @xml-project, am I missing something?

xatapult commented 3 weeks ago

"I don't see any reason why it should matter if you provide a dummy input document to the p:identity step. @xml-project, am I missing something?"

It's not intuitive, I agree. But it's the way Morgana implements it now, and I had a discussion about this with @xml-project because I had the problem in some other context.

ndw commented 3 weeks ago

Oh, I see now. The description of select (on p:with-input) says:

If a select expression is specified, it is effectively a filter on the input. The expression will be evaluated once for each document that appears on the port, using that document as the context item. The result of evaluating the expression (on each document that appears, in the order they arrive) will be the sequence of items that the step receives on the port.

That's...tricky. It's obviously a reasonable semantic, but it's equally obvious that it leads to confusing behavior when there are no documents on the input port.
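
To make the "once for each document" reading concrete, here is a small sketch. The element names `one` and `two` are placeholders; the behavior described in the comments follows from the quoted text rather than from a run on a particular processor.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:output port="result" sequence="true" content-types="any"/>

  <p:identity>
    <!-- Two documents on the port: per the quoted text, 'a' is evaluated
         twice, so the step should receive two items.
         With <p:empty/> instead, it is evaluated zero times
         and the step receives nothing. -->
    <p:with-input select="'a'">
      <p:inline><one/></p:inline>
      <p:inline><two/></p:inline>
    </p:with-input>
  </p:identity>

</p:declare-step>
```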

You'd like to say something like "if there are no documents on the input port, the expression will be evaluated without a context item" with appropriate consideration of what to do when that causes an error.

But that would be hugely backwards-incompatible.

I expect adding <p:inline><dummy/></p:inline> to provide a context is the most straightforward workaround.
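
A minimal sketch of that workaround, based on the pipeline from the Morgana example earlier in the thread. The `dummy` element exists only to give the select expression a context item; its content is never used.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <p:output port="result" sequence="true" content-types="any"/>

  <p:identity>
    <!-- One document on the port, so the expression is evaluated once -->
    <p:with-input select="'a'">
      <p:inline><dummy/></p:inline>
    </p:with-input>
  </p:identity>

</p:declare-step>
```

As reported above for Morgana, an atomic value selected this way comes out as a JSON document rather than a text document.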

ndw commented 3 weeks ago

The backwards incompatibility problem even prevents us from finessing the issue by changing the behavior when the expression doesn't refer to the context item. If adding a dummy input is just too ugly, then I think we'd need to add a new attribute to enable the "evaluate without a context item" behavior. ☹️

xproc / Vnext

Step for turning variables into documents #50