samvera / hydra-works

A ruby gem implementation of the PCDM Works domain model based on the Samvera software stack

Provide a definition of scope of "Work"? #8

Closed azaroth42 closed 9 years ago

azaroth42 commented 9 years ago

My understanding is that it is a compound or complex object (eg a resource that has parts, which may themselves have parts). It is not the bibliographic notion of an abstract Work (as opposed to a physical Item that embodies the Work).

It would be good to come to a common understanding of the definition and thus scope of the effort before starting in on modeling.

jcoyne commented 9 years ago

:+1: to Generic* following @escowles diagram.

dchandekstark commented 9 years ago

I'm willing to :+1: @escowles's Generic* terminology. Haven't heard anything better in this forum.

azaroth42 commented 9 years ago

:+1: and Stanford are in the same boat -- we want to converge with the rest of the community, but not at the expense of what we consider significant functionality.

mjgiarlo commented 9 years ago

:+1: to the model proposed in @escowles diagram, and :+1: to Generic* terminology (though I'm not sure whether it should be GenericWork or GenericObject -- I'm happy to leave that be since Worthwhile already uses the term Work and the Sufia discussion revolved around the same term).

mjgiarlo commented 9 years ago

(And yes, we will use Sufia/Worthwhile at Penn State and map our simple self-deposit content into this model!)

awead commented 9 years ago

Starting to see lots of :+1: after the epic Twitter-fest earlier yesterday.

"Generic is the new Active" -- @jpstroop

jpstroop commented 9 years ago

For the record (such as it is), said Tweet-fest is here.

mcritchlow commented 9 years ago

is that documentation too? ;)

mjgiarlo commented 9 years ago

@mcritchlow Did you submit your iCLAs to Twitter yet? Just DM them to @ev.

escowles commented 9 years ago

CLA or no, I've updated the diagram over in #11.

mjgiarlo commented 9 years ago

:clap: @escowles

blimey74 commented 9 years ago

How about 'artefact' or 'digital artefact'? Definition: "an object made by a human being, typically one of cultural or historical interest."

ronan-mch commented 9 years ago

Hi, apologies I don't have time to go into this further, but I thought I should mention that at KB we are working with a distinction between Works and Instances as per Bibframe, whereby a Work is a resource reflecting a conceptual essence of the cataloging resource while an Instance is a resource reflecting an individual, material embodiment of the Work. There are a lot of interesting discussions about when something constitutes a new Work versus a new Instance of the same Work, but in general we find the distinction useful. More importantly, we find the level of documentation extremely useful as it saves us having to figure things out ourselves.

escowles commented 9 years ago

@ronan-mch I think that the Work/Instance distinction fits well into the model we're coming up with. Your Works map to GenericWork and your Instances map to GenericComponents.

ronan-mch commented 9 years ago

@escowles Yes, perhaps. On the other hand, as we see it Works cannot have files attached to them, as Works are purely conceptual entities. Files will invariably belong to an Instance.

escowles commented 9 years ago

@ronan-mch Right -- your local application would have to prevent attaching Files to Works, or you could extend Component to create your own Work class that didn't allow it.

It's probably helpful to step back to the primary goal, which is to have a flexible framework that can handle everyone's use cases, without getting too cumbersome. Once that framework is in place, we can update our tools to use it, and the hope is that each site can extend the basic models in a compatible way that makes them portable between tools.

Since there are so many different conceptions of how to model these things, many people will have situations where the model doesn't fit their local use cases perfectly. For example at UCSD, we allow Works to have either Files or Components, but not both. So if we want to keep that restriction, we'll need to enforce it in our app. That's an easy tradeoff to make to work towards more flexibility and portability.
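The kind of app-level enforcement described above (a Work may have Files or Components, but not both) could look roughly like the following. This is a minimal plain-Ruby sketch, not the hydra-works API: `LocalWork`, `attach_file`, and `add_component` are hypothetical names invented for illustration.

```ruby
# Hypothetical plain-Ruby stand-in for a local work model; not the hydra-works API.
class LocalWork
  attr_reader :files, :components

  def initialize
    @files = []
    @components = []
  end

  # Assumed local policy: a work may hold files or components, never both.
  def attach_file(file)
    raise ArgumentError, "work already has components" unless @components.empty?
    @files << file
    self
  end

  def add_component(component)
    raise ArgumentError, "work already has files" unless @files.empty?
    @components << component
    self
  end
end

work = LocalWork.new
work.attach_file("page-1.tiff")

begin
  # Rejected by the local policy because a file is already attached.
  work.add_component("chapter-1")
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

The shared model stays permissive here; only the local class carries the restriction, which is the tradeoff the comment describes.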

ronan-mch commented 9 years ago

Hi @escowles - thanks for the clarification.

grosscol commented 9 years ago

It was pointed out by @jeremyf and @escowles that programmers will be the primary users of these names. However, it is worth noting that some terminology used in the code usually ends up percolating through to the non-programmer discussions where the concept of namespaces might not be as commonly used.

In short, it might be prudent to avoid overloading terms that are already being used by the non-programmers who have interest in developing projects using the hydra-works model.

jwestgard commented 9 years ago

I agree with what @declanfleming said above, and find 'digital object' to be the most useful term for communicating with non-programmers about these issues. There's no reason that cannot continue to be the term of art in those discussions.

At the same time, I think 'object' has too many other uses/connotations to be really useful in describing the innards of Fedora/Hydra, so I think the made up words or the compounds with 'generic' are probably best, though I have to say 'generic' seems pretty heavily laden with various meanings too, especially here in library land. That said, if there's already a gathering consensus around 'generic', so be it.

jeremyf commented 9 years ago

@grosscol A glossary of the class names we use will help solve the communication barrier.

DigitalObject does nothing for me. It is generic and redundant. I'm writing software, ergo Digital is implied. Which leaves Object. And what do we mean by that.

However, if we apply a namespace Hydra::Works::DigitalObject I have greater context. I still believe the DigitalObject may be too nebulous. Though @escowles's diagram may indicate that DigitalObject is the correct thing.

I also believe, just as important to naming the thing is to talk about what we will be doing with the thing (below is a hasty straw dog for example purposes only):

A Digital Object is:

A Digital File is:

mjgiarlo commented 9 years ago

If folks like, or at least can live with, the Generic* terminology and there's some hesitation about the notion of works, what about s/GenericWork/GenericObject/ ?

I would also encourage everyone who's interested in the current scoping and naming discussion to contribute a use case! Those will ultimately have more impact, and be more important, in shaping the work we do on this than what we name our underlying model classes. (To be clear: this is not a passive-aggressive attempt to shut down a good discussion, but rather a "yes, AND" statement. It's good to debate naming and it's better to do that and contribute a use case or two.)

escowles commented 9 years ago

Despite using "Object" locally, I think the term "Work" is far more clear. Object is just too overloaded between the general programming sense, the modeling sense, and the closely-related Fedora sense. I think as we head into wordsmithing the definitions, it makes sense to move that over to the wiki, so I've started with @jeremyf's straw dog and expanded it a bit:

https://wiki.duraspace.org/display/hydra/Hydra%3A%3AWorks+Shared+Modeling

azaroth42 commented 9 years ago

:+1: to the expanded dog.

jeremyf commented 9 years ago

@escowles Yes. That!

I also want to make sure that we are talking about their interfaces, and creating interface-style classes for those concepts. My goal would be to avoid module mixins and create SimpleDelegator objects instead.
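For readers unfamiliar with the pattern, here is a minimal sketch of delegation with Ruby's standard-library `SimpleDelegator`, as opposed to mixing a module into the model. `Record` and `WorkPresenter` are hypothetical names; nothing here is taken from the hydra-works code.

```ruby
require "delegate"

# Hypothetical persisted record; stands in for whatever object the repository returns.
Record = Struct.new(:title, :files)

# Interface-style wrapper: adds work behaviour by delegation,
# without mixing a module into Record itself.
class WorkPresenter < SimpleDelegator
  # Defined on the wrapper; `files` is forwarded to the wrapped record.
  def file_count
    files.size
  end
end

record = Record.new("Letters", ["a.tiff", "b.tiff"])
work = WorkPresenter.new(record)

puts work.title       # forwarded to the wrapped Record
puts work.file_count  # defined on the wrapper only
```

The wrapped object is left untouched, so the same record can be presented through different interfaces without accumulating mixed-in methods.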

mjgiarlo commented 9 years ago

@jeremyf I like the idea of applying some of the lessons you have been learning via e.g. Hydramata, and talking about at Hydra Connect, in this work. I'm not sure many others of us have internalized these lessons, though, so while I'm all fired up about it, this push may have to come from pull requests with your name on them until we've had a chance to come up to speed. Heck, even if you comment on the occasional pull request about how we might apply some of these patterns, that'd be a good start. Thanks again, Jeremy.

rickjohnson commented 9 years ago

As a former programmer straddling the two worlds, I agree that DigitalObject is too generic. @jeremyf talks all the time about code being self-documenting, so :-1: to Hydra::Wortem. It is more useful to pick a real word that someone can attach real-world context to (even if there is more than one possible definition). In my mind, our conceptualization of a work in Curate matches up most with @ronan-mch's definition above, where a work does not actually need any files attached and could just be metadata with links to the resources elsewhere.

mjgiarlo commented 9 years ago

:+1: to the models and attributes as expressed on the wiki page @escowles created: https://wiki.duraspace.org/display/hydra/Hydra%3A%3AWorks+Shared+Modeling

mjgiarlo commented 9 years ago

@rickjohnson Excellent. Thanks!

escowles commented 9 years ago

I think there are two concepts in the straw dog that should be collapsed: AttachmentReady (can have files) and Subdivisible (can have components). GenericWork and GenericComponent have both attributes, so I think it basically boils down to "can have files and/or components attached".

On the other side, GenericComponent and GenericFile would have the corresponding Attachable property (must be attached to a GenericWork or GenericComponent).

azaroth42 commented 9 years ago

:-1: to collapsing them as they're valuable separately in higher level models. For example ThumbnailedCollection might be AttachmentReady and not Subdivisible.
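To make the distinction concrete, here is a minimal sketch that keeps the two capabilities separate. The names `AttachmentReady`, `Subdivisible`, `GenericWork`, and `ThumbnailedCollection` come from the discussion; expressing them as plain Ruby modules with `files` and `components` accessors is an assumption made for illustration, not the agreed design.

```ruby
# Capability: can have files attached (accessor name is assumed).
module AttachmentReady
  def files
    @files ||= []
  end
end

# Capability: can have components (accessor name is assumed).
module Subdivisible
  def components
    @components ||= []
  end
end

# Has both capabilities, as described for GenericWork.
class GenericWork
  include AttachmentReady
  include Subdivisible
end

# Has files (a thumbnail) but cannot be subdivided into components.
class ThumbnailedCollection
  include AttachmentReady
end

collection = ThumbnailedCollection.new
collection.files << "thumbnail.png"

puts collection.is_a?(AttachmentReady)      # true
puts collection.respond_to?(:components)    # false
```

If the two concepts were collapsed into one, `ThumbnailedCollection` could not opt into files without also accepting components.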

ronan-mch commented 9 years ago

In case anyone is interested, this is how we are actually implementing the distinction between Works and Instances. We keep the Bibframe specific logic separated as namespaced modules, which are then mixed in by the models. There are a few details missing, but this is broadly accurate. I guess if there were Hydra models that were broadly compatible with what we wanted, we could use these and mix in our Bibframe modules. It becomes interesting when we can share these modules with others in the community who also want to use the Bibframe data model.

[Image: hel_class_overview]

awead commented 9 years ago

:+1: to @escowles's wiki page descriptions. I'm reading that GenericWork, for example, can be subdivisible and can have attachments, but doesn't have to. Wouldn't that allow for @ronan-mch's needs and @azaroth42's case of thumbnailed collections?

mjgiarlo commented 9 years ago

@ronan-mch Interesting -- and based on a summary glance, I don't think your model deviates from the @escowles model.

azaroth42 commented 9 years ago

@awead It doesn't account for mine, as Work != Collection.

cjcolvar commented 9 years ago

@escowles @jeremyf I'm looking at #27 with respect to the model on the wiki. Would this work in the case of one audio file for a CD and "tracks"/components as just regions of that file? In other words, can a GenericFile be attached to two AttachmentReady objects (Work or Component)?

escowles commented 9 years ago

@cjcolvar The model doesn't really say right now -- both Work and Component can link to Files, but it doesn't say whether they could link to the same File instance or not. The one audio file with multiple track Components does seem like a good use case for allowing them to do so.

I'm curious about how the tracks are specified -- is there a standard way of expressing what portion of the File belongs to a track?
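As an illustration of the use case only, the sketch below has several track components pointing at one shared file object. The thread does not settle how a track's region is expressed, so the `start_sec` and `end_sec` offsets, the `AudioFile` and `Track` names, and the sample values are all assumptions.

```ruby
# Hypothetical stand-ins; not hydra-works classes.
AudioFile = Struct.new(:name)

# A track component refers to a region of a file by assumed second offsets.
Track = Struct.new(:title, :file, :start_sec, :end_sec) do
  def duration
    end_sec - start_sec
  end
end

cd_audio = AudioFile.new("disc.wav")

# Both tracks link to the same file instance rather than to copies of it.
tracks = [
  Track.new("Track 1", cd_audio, 0, 185),
  Track.new("Track 2", cd_audio, 185, 412)
]

tracks.each do |track|
  puts "#{track.title}: #{track.file.name} (#{track.duration}s)"
end
```

Whether the shared model should permit this kind of shared attachment is exactly the open question raised above.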

ronan-mch commented 9 years ago

Just to note that @cjcolvar's use case is similar to one that we have: We have a book called Correspondence with Tove Ditlevsen, which consists of reprints of personal letters sent between this famous Danish author and her lover. So we have a book Work which has an Instance, in which several other Works are found, i.e. each letter is a Work in its own right. This letter Work thus has one or more Instances, one of which is a specific section within the greater book Instance. In terms of content we can represent this via an anchor to the corresponding div in the same TEI file, but so far I'm not quite sure how to represent it as a relationship. Is the letter's Instance the same as the book Instance, or should there be some specialised relationship hasInstanceInInstance or something?

jeremyf commented 9 years ago

I believe we can call this closed. Let it be henceforth known as PCDM