Closed: ndw closed this issue 4 years ago
I do not think this is a step issue; it must be stated in the core specs. If we want this behaviour, then it must be the same for EVERY atomic step a processor knows, i.e. not only those specified by the step specs, but also processor-defined steps and steps defined in third-party packages.
My main argument against @ndw's proposal: It is too late now. It is described neither in the core specs nor in the step specs. I think we should try to come to an end and not invent new features all the time.
As a user I think (hmm, not sure) I would expect that a document generated by some step where there is no clear ancestor before the step (like p:count etc.) has no base-uri, because it appeared out of "nothing".
If we must give it a base-uri for some reason, then it should be the base-uri of the step. But again, I think it should have none. This preference is not very strong and I have a feeling I don't have a full overview of all the consequences, pros and cons. So, I would not stand in the way of a solution where the base-uri of the step was used.
About in the core spec or not: Why should it be in the core spec? We could record it as a preferable behavior, but I don't see any problem in custom steps taking different directions here.
Let’s define what documents created by steps are, whether there is a difference between documents modified by steps and documents springing into existence from a step.
Technically, there is no document identity, so even p:add-attribute creates new documents.
My expectation is that the result document of p:add-attribute has the same base URI as the source document. I think this has to be the case because we say that all document properties are preserved and the base-uri property is the same as / synchronized with the base URI.
Then there is p:xslt, for which I just discovered that no document properties are preserved. This is different from XProc 1.0, where the non-sequence result document inherited the base URI from the first source.
So I think the question is limited to the steps that don’t preserve document properties, in particular the base-uri property.
(I’m not arguing for or against anything up to here; I’m just trying to further contain the problem space.)
Then I wonder what you mean by “a step’s base URI”. I think you mean the base URI of the pipeline document that uses the step in question, as opposed to a document that the step was declared in. The latter is unavailable for processor-implemented steps, or there will only be p:library documents with placeholder declarations. Therefore I think you are talking of the pipeline document that happens to use p:count, p:archive, etc. This will most likely be this pipeline document’s static base URI.
As in the case of whether manipulating the base-uri property may be used to manipulate a document’s base URI: In my view it was not a new feature but a clarification of something that the spec left a bit unclear.
With the current question, we should at least provide a bit more explicit clarity. At least we should say that in cases where the document properties are preserved, so is the base URI. (I think you can only reasonably speak of property preservation if a step has a single input port that is primary and a single output port that is primary.) And in cases where the base-uri property isn’t preserved, we should at least say that it is implementation-defined whether the document has a base URI and which it is.
This is of course inconvenient for pipeline authors as it may limit their pipeline’s portability.
In practice, it probably won’t be much hassle since it will rarely be an issue that certain documents don’t have a base URI. You wouldn’t do anything with a p:count result’s base URI, and you would supply an explicit storage location URI if you want to store or unarchive an archive. If the archive had as base URI the static base URI of the pipeline that created it, you wouldn’t be able to do anything useful with that URI.
By my reading, that's two votes for "shouldn't have a base URI" and one observation that not having a base URI would rarely be a problem. I think the fact that we've come this far without noticing that our implementations differ on this point supports the assertion that it'll rarely be a problem.
I propose that my implementation is in error and steps that say "no properties are preserved" should have no base URI.
I'm not even sure that any spec changes are necessary.
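To make the proposed rule concrete, here is a minimal toy model in Python. It is not any processor's API: the `Document` class, the `step_result` function, and the example URIs are all hypothetical names invented for illustration. It only sketches the idea that a step which preserves properties carries the base-uri over, while a step that says "no properties are preserved" yields a document with no base URI.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Document:
    """Hypothetical stand-in for an XProc document with its properties."""
    content: str
    properties: dict = field(default_factory=dict)

    @property
    def base_uri(self) -> Optional[str]:
        # The base URI is modelled as the value of the base-uri property.
        return self.properties.get("base-uri")


def step_result(source: Optional[Document], content: str,
                preserves_properties: bool) -> Document:
    """Build a step's result document under the rule proposed above."""
    if preserves_properties and source is not None:
        # Like p:add-attribute: all properties, including base-uri, carry over.
        return Document(content, dict(source.properties))
    # Like p:count: no properties are preserved, so there is no base URI.
    return Document(content, {})


src = Document("<doc/>", {"base-uri": "file:///in/doc.xml"})
modified = step_result(src, '<doc a="1"/>', preserves_properties=True)
counted = step_result(src, "<c:result>1</c:result>", preserves_properties=False)
print(modified.base_uri, counted.base_uri)
```

Under the alternative behaviour described in the original report, the second branch would instead set base-uri to the base URI of the step.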
Closed by #314
In the course of examining the consequences of an archive (passed to p:archive) having no base URI, Achim and I have discovered an incompatibility in our implementations.
My implementation makes the base URI of documents created by steps the same as the step unless there's some overriding value. For example, the documents created by p:count, p:compare, and p:archive (in the case of creating a new archive) all use the base URI of the step as their base URI.
I think there are several reasons why this is a good idea:
Achim, quite reasonably I think, took the position that the documents produced by those steps have no base URI.
We must clarify this. I assert that this is a step spec issue (not a language spec issue) because I think it's a question about the behavior of the standard steps. Someone writing their own steps might choose to take a different approach.
We should try to resolve this quickly as it's a lot of work for one of us to change our implementation.