IIIF / api

Source for API and model specifications documents (api and model)
http://iiif.io/api

Collection / Range membership #716

Closed azaroth42 closed 8 years ago

azaroth42 commented 8 years ago

There are requirements (see #697 and #646) to order collections and manifests within a collection, and ranges and canvases within a range. Currently they exist only in two separate, ordered lists, making interleaving impossible.

The proposed solution is to introduce a members property that accepts either type, with the additional restriction that @id, @type, and label MUST be present and viewingHint SHOULD be present for collections and manifests.
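To make the MUST/SHOULD split concrete, here is a minimal Python sketch of how a validator might check one entry of a members list. The function name `check_member` and the message strings are hypothetical; only the property names and the `sc:Collection` / `sc:Manifest` types come from this thread.

```python
REQUIRED_KEYS = ("@id", "@type", "label")          # MUST be present
HINTED_TYPES = ("sc:Collection", "sc:Manifest")    # viewingHint SHOULD be present


def check_member(entry: dict) -> tuple:
    """Return (errors, warnings) for one entry of a members list."""
    # A missing MUST property is an error.
    errors = ["missing required property " + key
              for key in REQUIRED_KEYS if key not in entry]
    # A missing SHOULD property is only a warning.
    warnings = []
    if entry.get("@type") in HINTED_TYPES and "viewingHint" not in entry:
        warnings.append("viewingHint SHOULD be present")
    return errors, warnings
```

An entry with all three required properties but no viewingHint would pass with a warning rather than fail.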

If a client sees a members property, then it should use it even if the regular properties exist. Publishing systems should be aware that if they use members and do not provide a fallback for the split properties, then 2.0 clients will not produce the expected results.

The split properties would be deprecated in 3.0.
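A minimal sketch of the client-side rule described above, in Python. The helper name `ordered_children` is hypothetical, and listing sub-collections before manifests in the fallback case is an arbitrary choice for this sketch, since the split lists cannot express interleaving.

```python
def ordered_children(collection: dict) -> list:
    """Return the child resources of a collection in display order."""
    # Prefer "members" whenever it is present, even if the split lists exist.
    if "members" in collection:
        return list(collection["members"])
    # Fallback for 2.0-style documents: the two lists are each ordered,
    # but their relative order is unknown, so they are simply concatenated.
    return list(collection.get("collections", [])) + list(collection.get("manifests", []))
```

The same shape would apply to a range, with "ranges" and "canvases" as the split lists.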

jpstroop commented 8 years ago

> The split properties would be deprecated in 3.0

Do you mean that the split properties would be deprecated in 2.1 and removed in 3.0?

sdellis commented 8 years ago

@azaroth42 I disagree with your summary of the two issues: There are NOT requirements to order collections and manifests within a collection, or to order ranges and canvases within a range.

There are use cases which require 1) a way to model multi-volume works and 2) a way to model the order of logical content among content that fits into no logical grouping.

There are at present two different solutions proposed (at least for the first use case). In the issues cited above, I have supplied several arguments why it would be wise to carry on with the model we use in 2.0. I would be happy to restate them here if necessary. I have heard no valid counter arguments from the alternate ("members") solution, and I can only presume that introducing the members property is the preferred solution among the editors because implementation work has already started down a path that is still in draft and not yet ratified.

Is this a correct assessment? If not, are there counter arguments that have not been stated in the issue dialogue that can be shared? I also wonder if it is wise to conflate these two use cases since there are some key differences, among the acknowledged similarities.

azaroth42 commented 8 years ago

@sdellis We have an accepted solution for multi-volume works already, and if the content fits into no logical grouping, then a grouping construct such as ranges or collections is not going to help.

Your proposal doesn't address #646, and adding collections or manifests to manifests blurs any reasonable distinction between Collection, Manifest and Range. By doing that, we end up with Object hasMember Object ; Object hasView Canvas and that's all. Given experience with ORE, PCDM, Hydra, and similar, we know this to very quickly need further specification into sub classes. At which point we end up back at the same state as we are now.

So no, I don't think it's a correct assessment. The members proposal is technically more appropriate and easier to implement more consistently.

sdellis commented 8 years ago

Apologies for continuing to pursue this, as I was not aware that there was already an accepted solution. The only acceptance process/criteria I'm aware of is what's stated in the Editorial Process document.

My intent has been to provide feedback from the viewpoint of a client-side implementer of the Presentation API who has cursory familiarity with PCDM. As there is no one currently representing that perspective on the editorial board (yet client-side implementers seem to be the intended audience), I hope my feedback has been somewhat valuable.

I'd also like to note that this is the second instance in which I have heard PCDM specification choices and terminology used as supporting justification for IIIF specification choices. I don't think there's any information on the IIIF site stating this relationship, and to invoke the PCDM model as an appeal to authority in discussions without explicitly stating a relationship does not encourage community participation from outside the PCDM/Hydra communities.

azaroth42 commented 8 years ago

Right, and that's the process that we went through for multi-part collections :) In 2.1, there's already the viewing hint:

> “multi-part”: Valid only for collections. Collections with this hint consist of multiple manifests that each form part of a logical whole. Clients might render the collection as a table of contents, rather than with thumbnails. Examples include multi-volume books or a set of journal issues or other serials.

Understanding PCDM (or ORE or anything else) isn't a requirement, but of course past experiences of what has worked and what hasn't informs decision making. I would say that we're lucky to have @tomcrane on the editorial board who can clearly represent the Universal Viewer requirements from an interoperability perspective. Interoperability between systems is the key aspect of these discussions and likely leads to more server side and architectural folk being involved than client side folk, as one server needs to cater to multiple clients.

Hope that helps

sdellis commented 8 years ago

There are many ways to achieve interoperability. I would also say that adoption should be a goal, as interoperability is nothing without widespread adoption.

I would also say that multiple servers catering to multiple clients would be the most interoperable goal, not one server (that is assumed to be implementing PCDM) catering to multiple clients. I respect your experience, but past experience can also create implicit bias. If PCDM solutions were directly transferable, there would be no need to create a Presentation API as you could simply serialize PCDM. The Presentation API requires a different level of abstraction, and abstractions that work for PCDM are not necessarily appropriate in this context.

To further my point, I had another proposal (on the list) that was rejected on the basis that there is an unstated data model (PCDM) that drives this API. My argument was against Polymorphism (see chapter in Your API Is Bad for description), where sometimes the value is a string and sometimes it's an object or an array. Again, no counter argument was made other than "the underlying data model dictates," even though it creates a burden for implementers.

This is all to say that I see this approach to resolving debates as flawed and dangerous to adoption and optimal user/developer experience.

zimeon commented 8 years ago

The Shared Canvas data model is the only model underlying the IIIF Presentation API, as stated in http://iiif.io/api/presentation/2.0/#introduction. The PCDM is not and should not be a driver.

sdellis commented 8 years ago

Thanks, @zimeon, for that clarification.

So, how does the board choose when or when not to diverge from Shared Canvas as a data model? For example, Ranges in SC don't contain other Ranges as they do in IIIF. From an outside perspective it seems arbitrary to shoot down one idea because it goes against the data model when there are several instances in which the IIIF Prezi API has already diverged from the data model. Furthermore, SC says nothing about Collections, and it's hard to draw inspiration from the ORE model it points to, which has a rather complicated model for building ordered aggregations (like members). Finally, is the Shared Canvas data model implemented anywhere else besides IIIF, and if so, does the evolution of both specs independently create complications for their ongoing relationship?

At any rate, I believe that diversity of perspectives leads to better, mutually beneficial outcomes. Experience in other areas beyond architecture, particularly in front-end API implementation, should have value as well (@tomcrane's contributions are enormous, but one perspective is not "diverse"). With no term limits, and a policy of self-selection, the editorial board has the advantage of paying lip-service to "community", without having to represent community needs at all (or being selective about them). An approach like this can paint an unwanted, "ivory tower" picture that a more democratic process to editorial representation would help to avoid. This seems to have become more of a list issue, not necessarily part of this GitHub Issue, so I am fine with moving it over there if it warrants further discussion.

tpendragon commented 8 years ago

I'm :+1: with this solution, especially so long as the point here:

> If a client sees a members property, then it should use it even if the regular properties exist. Publishing systems should be aware that if they use members and do not provide a fallback for the split properties, then 2.0 clients will not produce the expected results.

is published. That way I can, as an authoring client, duplicate manifests between the "members" and "manifests" property and be assured they'll work with conforming clients.
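On the authoring side, the duplication described here could be generated rather than maintained by hand. The following Python sketch assumes each member is an object with an "@type" of `sc:Collection` or `sc:Manifest`; the function name `with_split_fallback` is hypothetical.

```python
def with_split_fallback(collection: dict) -> dict:
    """Return a copy of the collection whose split lists mirror "members"."""
    result = dict(collection)
    if "members" not in collection:
        return result  # nothing to derive; leave the document unchanged
    members = collection["members"]
    # Each split list keeps the relative order its entries have in "members".
    result["collections"] = [m for m in members if m.get("@type") == "sc:Collection"]
    result["manifests"] = [m for m in members if m.get("@type") == "sc:Manifest"]
    return result
```

A client that only understands the split properties then still sees every child, just without the interleaved order.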

@sdellis I'm pretty new to this community, so I can't really speak to the history of decisions here, but it seems like those points would be better made on the list to be discussed. I'm going to try to summarize the two options and the points as I've read them, make an inline comment or two with my opinion, and please feel free to fill me in on what I've missed (it's been a long couple of threads, and it's Monday.)

Proposed Solution (in this thread):

// Collection with a MVW and a Book
{
  "@type": "sc:Collection",
  "members": [
    { "@id": "1",
      "@type": "sc:Collection",
      "label": "MVW"
    },
    { "@id": "2",
      "@type": "sc:Manifest",
      "label": "Book"
    }
  ]
}

So now collections can order manifests and collections with regard to one another, and manifests don't change. Manifests can still only have sequences.

@sdellis solution:

// Collection with a MVW and a Book
{
  "@type": "sc:Collection",
  "manifests": [
    { "@id": "1",
      "@type": "sc:Manifest",
      "label": "MVW"
    },
    { "@id": "2",
      "@type": "sc:Manifest",
      "label": "Book"
    }
  ]
}

// MVW
{
  "@type": "sc:Manifest",
  "@id": "1",
  "manifests": [
  ]
}

Collections don't change (in structure at least - in meaning they do. They're now not the only place to look for manifests.) Manifests change in that they can have child manifests. This'd probably be more work to implement for clients, but I suppose it depends on how their code is structured (they'd have to dig recursively into manifests as well as collections now - effectively a manifest IS a collection, except it can have sequences and ranges too!)
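To illustrate the extra recursion this would ask of clients, here is a rough Python sketch of a traversal under Option 2. It assumes, purely for brevity, that children are embedded inline as objects; a real client would more likely have to dereference each "@id". The function name `walk` is hypothetical.

```python
def walk(resource: dict, depth: int = 0):
    """Yield (depth, label) for a resource and everything nested beneath it."""
    yield depth, resource.get("label")
    # Under Option 2 a manifest may itself carry "manifests", so the client
    # has to descend through both keys regardless of the resource type.
    for key in ("collections", "manifests"):
        for child in resource.get(key, []):
            yield from walk(child, depth + 1)
```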

Option 1 seems like a nicer solution because it keeps the meaning of both collections and manifests distinctly separate and easier to think through - do I have children? Yes? Then it's a collection. With Option 2, why would I ever pick a Collection when everything fits into a Manifest? They'd probably render differently - so I'd have to care about that distinction (as an author building my manifests), whereas now I don't - I just put stuff where it fits.

sdellis commented 8 years ago

Actually in "Option 1" Collections do change both in structure and meaning (if we are talking about 2.0 as a baseline): they impose a logical order on their children rather than just aggregate. And this seems simple:

> do I have children? Yes? Then it's a collection.

... until a viewing hint tells you to render it like a manifest, not a collection. That's a code smell and an abstraction fail. "Viewing hints" should be used for rendering edge cases that break the convention, but I don't see MVWs as edge cases and it makes more sense to treat them as "works" that up until now have been rendered as Manifests. (If you want to treat a MVW like a manifest, why not just make it a manifest?) And to say that Manifests are Collections just because they can contain other manifests is simply not true no matter how many times it gets repeated. Manifests would still not have children collections (in addition to the other differences you pointed out), and just because two resources use recursion doesn't mean they are the same thing.

Furthermore, to maintain compliance with older versions (the most interoperable approach), you are correct in that you have to repeat every collection and manifest in a collection in both members and the split lists, doubling the size of the payload, which I don't see as an acceptable consequence (among many) of a breaking change like Option 1.

As for your question re: "Option 2", you would pick a Collection like you do now -- whenever you have a group of things of arbitrary order and/or a group of groups with an arbitrary order. The only time you would have a manifest with child manifests is for multi-volume works. If you see children manifests, you know right away that it's a multi-volume work, and you only have to "look for manifests" here when you need to render that work.

tpendragon commented 8 years ago

> (if we are talking about 2.0 as a baseline): they impose a logical order on their children rather than just aggregate

Sorry, I don't think this is right. Looking at the context document for 2.0, http://iiif.io/api/presentation/2/context.json, "manifests" is an RDF:List, meaning their order is asserted and important.
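In JSON-LD, a term whose definition carries `"@container": "@list"` is serialized as an ordered list, which is what surfaces as an rdf:List. A small Python sketch of that check follows; the context excerpt is a hypothetical stand-in with assumed term IRIs, not a copy of the published context document.

```python
# Hypothetical excerpt in the shape of a JSON-LD context. The IRIs and the
# "exampleTerm" entry are assumptions made for illustration only.
CONTEXT_EXCERPT = {
    "@context": {
        "manifests": {"@id": "sc:hasManifests", "@type": "@id", "@container": "@list"},
        "exampleTerm": {"@id": "ex:exampleTerm"},
    }
}


def is_ordered(context_doc: dict, term: str) -> bool:
    """True if the term is declared as an ordered list in the context."""
    definition = context_doc.get("@context", {}).get(term)
    return isinstance(definition, dict) and definition.get("@container") == "@list"
```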

> until a viewing hint tells you to render it like a manifest, not a collection. That's a code smell and an abstraction fail.

"Abstraction fail" feels like rough language, but I realize you're quoting an earlier source. This is a good point, and I'm not sure how to resolve it.

> Manifests would still not have children collections

Also a good point.

> As for your question re: "Option 2", you would pick a Collection like you do now -- whenever you have a group of things of arbitrary order and/or a group of groups with an arbitrary order.

I don't think we can use order as the delineating factor, because Manifests are ordered within collections right now.

So it seems like I either have to think about the fact that it's multi-part (and add the viewing hint), or think about the fact that it's not a physical medium (although, is a manifest which just has manifests a physical medium? I don't think so - does it matter?) and make it a Collection or Manifest appropriately. So, I see the following:

Option 1:

Pros:

Cons:

Question: Is there a path problem with either of these? Does IIIF explicitly recommend /manifest and /collection as the URL for instance? If so, then a pro would be the URI wouldn't change whether it was a MVW or not, if you switched between the two. Then again, maybe it SHOULD change.

Option 2:

Pros:

Cons:

Is that right?

tpendragon commented 8 years ago

Separate smaller issue for this proposal:

@azaroth42 @jpstroop What do you do to resolve these conflicting statements in the case of merging manifests and collections?

> Collection objects may be embedded inline within other collection objects

> manifests must not be embedded within collections

jpstroop commented 8 years ago

> "Viewing hints" should be used for rendering edge cases that break the convention

Where do we say that? As a Presentation API, I think viewing hints are fundamental to the spec and extend well beyond edge cases.

Stepping back, adding yet another recursive property to the mix opens too much potential to do the same thing in multiple ways, I think. To me there's a nice implication that there are essentially three tiers of structure an application can expect to encounter:

Of course this is not written into the standard anywhere, but I believe it reflects how most implementations have used the API to date. We have recursion where it is needed--at the two logical tiers, and do not have it where it isn't.

Regardless, right now there is no way to show the order of things in a collection that mixes Manifests and Collections (whether it's a less-controversial sub-collection or an MVW) and needs to retain order. To me that's the level at which we should be discussing this problem, and we don't solve it with recursive Manifests. Having two different solutions to these very similar problems would complicate things more than necessary.

jpstroop commented 8 years ago

@tpendragon

> manifests must not be embedded within collections

Just means that they can't be serialized within the collection; their URIs should just be referenced. Is that what you mean?

tpendragon commented 8 years ago

> Just means that they can't be serialized within the collection; their URIs should just be referenced. Is that what you mean?

Yeah, but collections can. So do you keep the statement and make it based on the type rather than the key they're put in? Maybe it was always meant to be that way.
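One possible reading of a type-based rule, sketched in Python: inside a mixed list, collections may carry their own content, while a manifest entry counts as "embedded" if it carries "sequences" or "structures". That test for embedding is an assumption of this sketch, not something the thread or the spec text quoted here defines, and the function name is hypothetical.

```python
EMBEDDED_CONTENT_KEYS = ("sequences", "structures")  # assumed markers of embedding


def embedded_manifests(entries: list) -> list:
    """Return the @id of each manifest entry that carries its own content."""
    return [
        entry.get("@id")
        for entry in entries
        # The check keys off "@type", not off the property the entry sits in.
        if entry.get("@type") == "sc:Manifest"
        and any(key in entry for key in EMBEDDED_CONTENT_KEYS)
    ]
```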

tpendragon commented 8 years ago

> Regardless, right now there is no way to show the order of things in a collection that mixes Manifests and Collections (whether it's a less-controversial sub-collection or an MVW) and needs to retain order.

Ah, yes, this is the other pro for option 1. I realize IIIF doesn't deal in IFs, and rather in solid use cases, but this leaves the door open for ordering manifests/sub-collections together in cases where it's not MVW. If you implement Option 2, and a use case comes up for that, then you'll end up implementing option 1 on top of it.

azaroth42 commented 8 years ago

And also has the same consistent model for Ranges with included Ranges and Canvases, as per the issue. Again, that would need to be solved in a different way, compared to one consistent and coherent approach.

The duplicate data is the same in Option 2 -- just that there's also no fallback solution for 2.0. You would only use members in 2.X iff you need to order both collections and manifests together (or ranges and canvases). So duplication really is a pro ... there's a fallback, rather than having to process a new structure without any possibility of a 2.0 client knowing what to do.

And viewingHints are not a hacky workaround; they're quite intentional and working as expected. Creating a new one to govern how to view a particular collection is exactly what they're for. So the "Con" of creating a new one, at the expense of requiring more structure and processing of that structure, is inconsequential. Think of them as subClasses, without requiring inferencing to fall back to the parent class.
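The subclass analogy can be sketched in Python as a lookup with a default: a client that recognizes a hint uses the specialized rendering, and one that does not simply treats the resource as a plain collection. The rendering names and the function `choose_rendering` are hypothetical; only the "multi-part" hint and its table-of-contents suggestion come from the text quoted earlier in this thread.

```python
KNOWN_HINTS = {"multi-part": "table-of-contents"}  # hint -> hypothetical renderer name


def choose_rendering(collection: dict, known: dict = KNOWN_HINTS) -> str:
    """Pick a renderer for a collection, falling back to the default view."""
    hints = collection.get("viewingHint", [])
    if isinstance(hints, str):
        hints = [hints]  # accept a single hint or a list of hints
    for hint in hints:
        if hint in known:
            return known[hint]
    # Unknown or absent hint: behave as for any ordinary collection.
    return "thumbnails"
```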

azaroth42 commented 8 years ago

The three tiers were more obvious in 1.0, and we abstracted away from them slightly as they're only relevant to born-physical-then-digitized content. But if a digital-only comic book or a slideshow could be considered somehow "physical", then yes. Something like:

The bound multi-thing "thing" is the only outlier here, where the consistency with arrangement is more valuable than the consistency with the abstract categorization. A client will naively do a better job considering a multi-thing-thing as a collection (and then allowing the user to drill down into the individual things) than as a thing itself (and not knowing what to do with the subcomponents).

azaroth42 commented 8 years ago

@sdellis This isn't a democracy, but if it were, you would still be outvoted on the order of 7 to 1. Unless you want to start a SuperPAC to raise funds to campaign for your solution... ? :)

sdellis commented 8 years ago

Thank you all for clarifying a few things for me, including "thing". :) Most pain points come down to linguistic ambiguity, and getting to the unstated definitions of what we are talking about makes a big difference in how one interprets the spec. Keep in mind that many developers will not be as persistent in trying to understand these distinctions, which is what I'm talking about when I ask that developer experience and adoption be considered.

I now know:

... that a Collection is an intellectual arrangement of "things". If I want an alternate arrangement of the same things, I need to create a new, different Collection.

... that a Manifest is a "thing" that may or may not be physical, and that "thing-ness" is generally determined by whether or not it has a binding (except for non-physical things). I was erroneously thinking of Manifests in the FRBR manifestation sense: "the physical embodiment of an expression of a work" (hence Multi-Volume Works naturally seemed to be manifests to me).

... that viewingHints are essentially subClasses.

... that there are no unordered lists in the spec!

Language is important and many of these terms are overloaded, so I would suggest putting all the above in the spec (viewingHints was my bad, but the analogy is helpful). Defining them clearly is only going to ease implementation pain points going forward. For example, web developers who do not need to know RDF should not have to parse the context document (and beyond) to learn that there is no such thing as an unordered list here.

@azaroth42, I am well aware this is not a democracy, and I'm certainly not interested in running for office! :) My concern is not about whether Option 1 or Option 2 "wins", but the time it took to reconcile our mental models because I could not find answers that should be in the spec (perhaps they will be now?). We should not have to work this hard to communicate. I am just a web developer (with 20 years' experience) trying to implement this in my free time without any a priori knowledge of where these concepts are coming from. My goal is to work towards interoperability AND adoption. I will offer praise when these are both accounted for and dissent when they are not.

Carry on. :+1:

sdellis commented 8 years ago

I guess my one final concern with mixed resources is that it complicates what was once an obvious path to a RESTful interface (i.e., http://some-domain/some-collection/manifests) for IIIF Presentations. REST emphasizes a uniform interface between components, whereas members seems to break that, forcing you to deal with different types of objects when you want to do anything with http://some-domain/some-collection/members. I haven't heard much about REST and this spec, but I assume the ultimate intent is to be RESTful (correct me if I'm wrong).

jpstroop commented 8 years ago

We do explicitly call out REST as the intended service pattern/protocol.

If you're trying to guess URIs, then yes, the recommended (non-normative, not RECOMMENDED) URI patterns that appear at the start of each section (e.g. Collections) could trip you up, but URIs are opaque, even in a RESTful paradigm. If you follow the one you have and treat the payload you receive, you'll be just fine. A predictable URI is nice, but it's also a dangerous pattern. RFC-6919 :smile: .
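As a rough illustration of treating URIs as opaque, a client can decide what it is holding from the "@type" of the payload rather than from the path it followed. This Python sketch uses a hypothetical function name and return values; the type strings are the ones used in the JSON examples earlier in this thread, plus `sc:Range` and `sc:Canvas` for the range case.

```python
KINDS = {
    "sc:Collection": "collection",
    "sc:Manifest": "manifest",
    "sc:Range": "range",
    "sc:Canvas": "canvas",
}


def resource_kind(payload: dict) -> str:
    """Classify a retrieved document by its declared type, ignoring its URI."""
    return KINDS.get(payload.get("@type"), "unknown")
```

With this approach, a mixed members list is handled by classifying each entry in turn, whatever its "@id" looks like.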

zimeon commented 8 years ago

Further to @jpstroop's comment, we should be careful not to conflate REST and HATEOAS with nice URI patterns. I think so far the Presentation API follows both of these paradigms very well; unfortunately, the need to pass many parameters in the Image API excludes a truly HATEOAS approach. We have open issues (#21 and #40) for discussion of the use of other HTTP verbs to flesh out RESTful CRUD functionality in the APIs.