Closed iherman closed 6 years ago
One argument may be simplicity... Using ERS may be complex. But there are indeed two ways to express similar things...
If scope covers all the use cases we had in mind when inventing Embedded Resource Selector, then I'd rather not add ERS - both because scope seems intuitively less complex and to stay closer to Web Annotations.
I am afraid it is the other way round: ERS covers our use cases for scope, and more.
The example with scope in the document can be reproduced, using ERS, via:
{
"source": "https://dauwhe.github.io/html-first/MobyDick.wpub",
"selector": {
"type": "EmbeddedResourceSelector",
"value": "https://dauwhe.github.io/html-first/MobyDickNav/html/c001.html",
"refinedBy": {
"type": "CssSelector",
"value": "#elemid > .elemclass + p"
}
}
}
though it is of course more verbose than the original example:
{
"scope": "https://dauwhe.github.io/html-first/MobyDick.wpub",
"source": "https://dauwhe.github.io/html-first/MobyDickNav/html/c001.html",
"selector": {
"type": "CssSelector",
"value": "#elemid > .elemclass + p"
}
}
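The correspondence between the two forms is mechanical, which the following Python sketch tries to show. It assumes only the JSON shapes used in the two examples of this comment; the helper name `scope_to_ers` is hypothetical and not part of any draft.

```python
import json


def scope_to_ers(locator: dict) -> dict:
    """Rewrite a scope-style Locator into the EmbeddedResourceSelector style.

    Hypothetical helper: assumes the input has "scope", "source" and
    "selector" keys, as in the scope example of this thread.
    """
    return {
        # The publication (the former scope) becomes the top-level source.
        "source": locator["scope"],
        "selector": {
            "type": "EmbeddedResourceSelector",
            # The former source is the resource embedded in the publication.
            "value": locator["source"],
            # The original selector refines the embedded resource.
            "refinedBy": locator["selector"],
        },
    }


scoped = {
    "scope": "https://dauwhe.github.io/html-first/MobyDick.wpub",
    "source": "https://dauwhe.github.io/html-first/MobyDickNav/html/c001.html",
    "selector": {
        "type": "CssSelector",
        "value": "#elemid > .elemclass + p",
    },
}

ers = scope_to_ers(scoped)
print(json.dumps(ers, indent=2))
```

The reverse direction only works when the ERS is refined by a single-resource selector, which is the asymmetry discussed in this thread.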
However, what I have a problem with is the longer example for range+ERS: I do not see a way of reproducing it with scope alone.
I think the considerations for keeping scope or not include:
- scope may be more intuitive and less verbose; see the example above.
An additional minor "con" is the specification of scope. The current text is simply a verbatim copy of the WA definition:
The relationship between a Locator and the resource that provides the scope or context in this selection.
which is fairly general. We would have to define, in a normative sense, what it means for specifically Web Publications, which means we go (slightly) beyond what WA defined.
/cc @azaroth42
The example for range+ERS (really ERS+Refinement) seems to me better handled with the Multi Resource Selector, since it is a selection that spans two resources. So, if we keep the Multi Resource Selector, then I'm not sure that ERS would be required for anything not handled by scope. Here's how I might do the range+ERS example as a Multi Resource Selector. There remains a question in my mind whether the *.wpub should be a scope rather than a source, but in WA source is required for all selectors, and since an extension/refinement we already have in this draft is that the source is implicitly also a scope if no scope is explicitly specified, perhaps we're okay. (Alternatively, this might suggest that MRS is something different than a selector?) Aside: the selectors key may need to be renamed, since technically it is an array of SpecificResources rather than selectors - maybe resourceSelections?
{ "source": "https://dauwhe.github.io/html-first/MobyDick.wpub",
"selector": {
"type": "MultiResourceSelector",
"selectors": [{
"source": "https://dauwhe.github.io/html-first/MobyDickNav/html/c001.html",
"selector": {
"type": "TextQuoteSelector",
"exact": "Call me Ishmael."
}
},{
"source": "https://dauwhe.github.io/html-first/MobyDickNav/html/c002.html",
"selector": {
"type": "TextQuoteSelector",
"exact": "A hundred black faces turned around"
}
}]
}
}
However, this means that MRS must be defined in terms of range. (Which may mean that MRS is a misnomer, it should be something like MultiResourceRange or something...)
If this is the only use case we have, then you may be right.
@tcole3 I just realized, that we fell into the same trap, and that the definition of MRS is actually wrong.
The whole Selector Model is based on having a top object (the one we now call Locator) that identifies the source. The source attribute can appear only there and not in a Selector. I.e., the example above is incorrect (which means that the MRS definition in the document is incorrect, too). We already had this discussion in an earlier round with @azaroth42. This is the reason we need the ERS, i.e., to select within the top-level source.
I have the impression that we need ERS and, possibly, MRS (though properly defined). The only issue that we may have is whether we need scope or not...
I would reverse your logic and define a Multi-Resource Selector that handles all Resource Selection (Locator) use cases that involve more than one constituent Web Publication Resource within a Web Publication - this definition should say that the scope of the MRS is (for practical purposes) always the same as the MRS source and therefore is implicit. (We should not attempt to define a means to express a Locator that spans multiple Web Publications.)
Then I do not think we need an Embedded Resource Selector at all since the context for use cases involving only a single Web Publication Resource can be handled by scope and the single-resource selectors already defined by Web Anno.
In other words drop ERS and further flesh out the definition of MRS to make sure it meets use cases.
Or is there another use case requiring ERS that I'm forgetting?
@tcole3 I just realized, that we fell into the same trap, and that the definition of MRS is actually wrong.
Withdrawn. It is not wrong; I was misled by the selectors property name, which is a misnomer. It should be locators (just as @tcole3 said).
(Never make serious comments later in the evening. I am not an evening person, I am not an evening person, I am not an…)
@tcole3,
I have gone through three obvious examples/use cases to see how they would/could be encoded using scope, ERS, and MRS. I did not want to put the examples into this comment, it would have been way too long, but they are on a separate gist. My own remarks on these examples:
- scope is clear for this use case. We may have to give a more exact semantic meaning to scope with regard to Web Publications (which is fine, we can do that).
- scope has a limited, ehem, scope…
A comment on the range vs. MRS issue. If our primary usage for MRS is such special ranges, we should define MRS more tightly, modeled after the definition of the Range Selector (i.e., something saying that the range includes the portion of the selection in the first Locator, includes all intermediate ones in order, and finishes with the selection in the last Locator). If there are other use cases for MRS, we may need a sibling selector type that does not convey the notion of order in the resources.
I continue to feel a bit uneasy about the semantics of MRS, though. We are mixing a bit the roles of a Locator and a Selector, insofar as we rely on embedded Locator(s) within a Selector. It, sort of, works, but I have some sort of an uneasiness about it that I cannot properly express. @azaroth42 may help out here…
My (current) conclusions:
- scope, but firm up its definition for WP-s.
(Sorry, pushed the wrong button and closed the issue; reopening...)
Looking at the latest example, provided by @RachelComerford, for MRS raises some further issues.
The example uses MRS in such a way that it does not mean some sort of a continuous selection (this is emphasized by @RachelComerford). But this is in contradiction with the definition of that selection which I have not yet changed in the corresponding PR (#20):
A Multi Resource Selection can be used to identify this span by creating an ordered list of Locators. The selection consists of everything from the beginning of the starting selector in the first Locator, all selections identified by the intermediate Locators in the list (if any), through to the beginning of the ending selector, but not including it.
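To make the ordered reading of this definition concrete, here is a small Python sketch. It is only an illustration under stated assumptions: each Locator is a dict with a `source` key, and the function name `span_roles` and the role labels are hypothetical, not taken from the draft.

```python
def span_roles(locators):
    """Label each Locator of an ordered list with its role in the span.

    Follows the quoted definition: the span starts at the beginning of the
    selection in the first Locator, includes every intermediate Locator, and
    runs up to (but does not include) the selection in the last Locator.
    """
    if len(locators) < 2:
        raise ValueError("a span needs at least a start and an end Locator")
    last = len(locators) - 1
    roles = []
    for index, locator in enumerate(locators):
        if index == 0:
            role = "start"          # included from its beginning
        elif index == last:
            role = "end"            # span stops before this selection
        else:
            role = "intermediate"   # included as a whole
        roles.append((locator["source"], role))
    return roles


# Illustrative input: three chapters of the example publication.
base = "https://dauwhe.github.io/html-first/MobyDickNav/html/"
roles = span_roles([
    {"source": base + "c001.html"},
    {"source": base + "c002.html"},
    {"source": base + "c003.html"},
])
for source, role in roles:
    print(role, source)
```

Under this reading the order of the list carries meaning, which is exactly what an unordered collection of selections would not need.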
It strikes me that there are two ways of looking at an MRS:
We have a perfect example use case for the first option. But we have not covered the 2nd one.
I wonder if:
(This does not decide whether we need scope and ERS, but we would certainly need MRS or, say, MS.)
Cc: @azaroth42 @RachelComerford @tcole3 @BigBlueHat
I have a very nontechnical understanding of this and I confess, I'm not entirely sure I'm tracking the conversation above so please tell me if these comments aren't relevant or if I'm missing something.
Some assumptions I'm making:
My understanding is that we need 4 types of selector (at least based on my experience and reading above):
So, the definition would be something like: A multi resource selection identifies a collection of discrete (containing clearly defined starting and ending points) section(s) contained either within a single resource or across multiple resources.
@RachelComerford
Some assumptions I'm making:
- Resource, based on the conversation above is that this is, for example, an xhtml file - is this correct?
XHTML/HTML/SVG, etc. So basically yes.
- Selector covers a range - something with a beginning and an end
- Versus locator, which covers a single location... a starting point with no defined end point
Not really. A selector is an abstract thing, which depends very much on the specific selector type. A CSS selection may actually select a number of elements in the HTML file (depending on the selector used), whereas a fragment selector typically selects one element only. The term locator is just a generic term that says: "this is the resource I use for a selection and here is the specific selector type I use".
My understanding is that we need 4 types of selector (at least based on my experience and reading above):
- Identify the start and end of a selection of content within a single "resource."
That is what the RangeSelector does.
- Identify the start and end of a selection that begins in one resource but ends in another and includes all of the content in between.
Yes. At the moment, ie, in the current draft, it is possible to do that using the RangeSelector and the EmbeddedResourceSelector, see example 14.
But it does so only partially: it defines a start and an end, but it is unclear what is in between, so to say. I.e., if you select something in chapter 1 for the beginning and in chapter 3 for the end, does it mean that we select chapter 2 as a whole in between? Based on what? There have been quite a few discussions on whether the default order makes sense or not. If not, then… Hence my proposal to extend the range selector to be able to explicitly list chapter 2 (in this example) as an 'intermediate' resource.
- Identify the start and end of a selection within a resource, the start and end of a selection within a second resource, the start and end of a selection within a third resource, etc. without including all of the content between those sections.
Yes. If we adopt a weaker version of the Multiple Resource Selector, which is used in your example for #20, this is something that can be done.
- Identify the start and end of a selection within a resource and the start and end of a selection within that same resource without including all of the content between those selections.
Indeed: if the Multiple Resource Selector becomes a Multiple Selector, ie, not necessarily based on several resources, then this becomes a special case of your (3) above.
So, the definition would be something like: A multi resource selection identifies a collection of discrete (containing clearly defined starting and ending points) section(s) contained either within a single resource or across multiple resources.
Heh. You claim you are not a technical person? Wrong!! :-) The definition is exactly what a Multiple Selector would be…
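A rough Python sketch of why case (4) is a special case of case (3): if a Multiple Selector is just a collection of discrete selections, grouping them by resource shows that nothing changes when several selections share one resource. The function name `group_by_resource` and the list-of-Locators shape are assumptions made for illustration only.

```python
from collections import defaultdict


def group_by_resource(locators):
    """Group discrete selections by the resource they apply to.

    Hypothetical helper: each Locator is assumed to be a dict with a
    "source" and a "selector" key. No order between selections is implied.
    """
    grouped = defaultdict(list)
    for locator in locators:
        grouped[locator["source"]].append(locator["selector"])
    return dict(grouped)


base = "https://dauwhe.github.io/html-first/MobyDickNav/html/"
selections = [
    # Two separate selections inside the same resource (case 4) ...
    {"source": base + "c001.html",
     "selector": {"type": "TextQuoteSelector", "exact": "Call me Ishmael."}},
    {"source": base + "c001.html",
     "selector": {"type": "CssSelector", "value": "#elemid > .elemclass + p"}},
    # ... and one selection in a different resource (case 3).
    {"source": base + "c002.html",
     "selector": {"type": "TextQuoteSelector",
                  "exact": "A hundred black faces turned around"}},
]

grouped = group_by_resource(selections)
for source, selectors in grouped.items():
    print(source, len(selectors))
```

The content between the selections is never part of the result, in contrast to a range.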
@tcole3 @BigBlueHat I was wondering a bit about the current design for the extensions, and I am not really sure we are heading in the right direction. Again, to avoid polluting the issue comments with long text, I put down my thoughts into a separate gist.
I am happy to amend the current, not-yet-merged PR #20 branch to reflect this if you guys agree with the direction.
@iherman added some thoughts https://gist.github.com/iherman/65254a9e914de0af319a6800936af39e#gistcomment-2243202
I'd also love @tilgovi and @treora to give the gist a quick skim if they can spare some cycles! 😃
Closing by virtue of merging #23. More specific issues have also been added for further discussion: #24, #25, and #26.
Sure, happy to share my unrefined thoughts (sorry if I have missed pieces of the discussion!):
As for MultiSelector: Seems useful, if we wish to enable selectors pointing at multiple things. But it seems such a basic form of composition that I would hope some existing primitive can be used. Using multiple values for a selector field would be the first option that comes to mind (in JSON this would be expressed with an array: { "source": "...", "selector": [ { ...selector1 }, { ...selector2 } ] }). However, the annotation spec already says that "Multiple selectors should select the same content", so that plan fails. Using the same primitive as for multi-target/body annotations would make a lot of sense. Perhaps we could follow appendix D of the WA spec, or replace appendix D from that spec with whatever you come up with (if it is general enough). If a generic 'combiner' approach is not desired, inventing a specific MultiSelector seems acceptable to me too.
As for EmbeddedResourceSelector, scope, etc: scope feels simply like the inverse relation of refinedBy, and I would try to use refinedBy instead whenever possible (perhaps always?). refinedBy is a great primitive. In fact, it is too late now, but we could perhaps even have dropped the whole distinction between SpecificResource and Selector, replacing a SpecificResource with a ResourceSelector: {"type": "ResourceSelector", "source": "...", "refinedBy": { ...some selector... }}. This ResourceSelector would then also be usable instead of the here proposed EmbeddedResourceSelector. Anyhow, it's too late for that, but the ability to select a resource embedded within a resource seems important.
As for extending RangeSelector, I don't see the necessity of adding a field intermediateSelectors, but maybe I have not understood the problem. I would expect that the resources embedded in the publication, between the resources of the start and end selector, would all be selected as a whole, no?
@Treora
As for MultiSelector: Seems useful, if we wish to enable selectors pointing at multiple things. But it seems such a basic form of composition that I would hope some existing primitive can be used. […] Perhaps we could follow appendix D of the WA spec, or replace appendix D from that spec with whatever you come up with (if it is general enough). If a generic 'combiner' approach is not desired, inventing a specific MultiSelector seems acceptable to me too.
Actually, this is now a separate issue (#26); the proposal discussed there is indeed to use an array of Locators (like Appendix D). And yes, that may be a viable alternative; my problem is (to quote my own comment):
That being said, it becomes also a convenience question. The Locator's usage/implementation may be simpler if everything is one locator; otherwise we are forced to define higher level notions on how locators are used (this is not a problem for WA, but we would have to do something extra here, if only by taking over more from WA). Ie, I prefer keeping the local structure, but, at the end of the day, end users as well as implementers may have to decide...
Bottom line: this is still an open issue:-) (if possible, we should continue the discussion there, just for admin reasons)
As for EmbeddedResourceSelector, scope, etc: scope feels simply like the inverse relation of refinedBy, and I would try to use refinedBy instead whenever possible (perhaps always?). refinedBy is a great primitive. In fact, it is too late now, but we could perhaps even have dropped the whole distinction between SpecificResource and Selector, replacing a SpecificResource with a ResourceSelector: {"type": "ResourceSelector", "source": "...", "refinedBy": { ...some selector... }}. This ResourceSelector would then also be usable instead of the here proposed EmbeddedResourceSelector. Anyhow, it's too late for that, but the ability to select a resource embedded within a resource seems important.
Yes, it is too late:-(
As for extending RangeSelector, I don't see the necessity of adding a field intermediateSelectors, but maybe I have not understood the problem. I would expect that the resources embedded in the publication, between the resources of the start and end selector, would all be selected as a whole, no?
You would expect this, wouldn't you? However, that presupposes that a Web Publication always has a default reading order. But there has been quite some discussion on whether that is true or not, hence the design that does not rely on any implicit order.
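The dependency on a reading order can be sketched in a few lines of Python. This is only an illustration: the `reading_order` list and the function name `resources_in_span` are hypothetical, and the point is that the intermediate resources can be derived implicitly only if such an ordered list exists.

```python
def resources_in_span(reading_order, start, end):
    """Return the resources from start to end, inclusive, in reading order.

    Hypothetical helper: reading_order is assumed to be an ordered list of
    resource URLs. list.index raises ValueError if start or end is missing.
    """
    first = reading_order.index(start)
    last = reading_order.index(end)
    if first > last:
        raise ValueError("start comes after end in the reading order")
    return reading_order[first:last + 1]


base = "https://dauwhe.github.io/html-first/MobyDickNav/html/"
# Assumed default reading order of the example publication.
reading_order = [base + "c001.html", base + "c002.html", base + "c003.html"]

span = resources_in_span(reading_order, base + "c001.html", base + "c003.html")
print(span)
```

Without a guaranteed `reading_order` there is nothing to slice, which is why the design lists the intermediate resources explicitly.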
That being said: yes, this is an approach to be discussed. I added a separate issue for this (#28), let us track it there!
(a side issue: the latest draft uses selectors and not intermediateSelectors)
The WA scope facility has been added; do we need it when we also have the Embedded Resource Selector?