hybox / models

Data Modeling repository for HyBox (ontologies, vocabularies, best practices, requirements, etc)
Apache License 2.0

Distinguish Physical from Digital in Model? #12

Closed azaroth42 closed 8 years ago

azaroth42 commented 8 years ago

@hannahfrost @no-reply @mjgiarlo @ggeisler (cc @jpstroop)

Is it important to distinguish the physical object that has been digitized, such as the canonical postcard, from the abstract digital representation of it? In other words, does the pcdm:Object that represents "the postcard" represent a physical object that can be touched, or the collection of digital content that is a surrogate for it, or both at the same time? This applies to any content that is not born-digital, and potentially even to born-digital content: if a thesis is born-digital and also printed, it may be important to distinguish the submission of the electronic copy from the submission of a physical copy.

Some further questions to try and help answer this:

Basically ... are there situations that it is important to accurately describe where the same property would be ambiguous if the physical and digital copies were not distinguished?

azaroth42 commented 8 years ago

For example, in Scenario 1a (https://docs.google.com/document/d/1nWlUriv0tp3t8asmuoC_VKph5Rwa-shwnntExRlArZo/edit#heading=h.4hsb89ljc0ye) all of the features that Roberta is describing are of the physical object.

To me, identifier is the critical one: The identifier for the digital object is a URI. The identifier for the physical object will be some local cataloging number. However the description of the physical object will have a URI.

Thus:

{
  "id": "http://hybox.repo.org/objects/1",
  "type": "bf:Work",
  "identifiedBy": {
    "type": "LocalIdentifier",
    "value": "postcard-5"
  },
  "hasInstance": {
    "id": "http://hybox.repo.org/objects/2",
    "type": ["pcdm:Object", "bf:Instance"],
    "identifiedBy": {
      "type": "LocalIdentifier2",
      "value": "mysql://postcards/1"
    },
    "files": ["/files/1.jpg", "/files/2.jpg"]
  }
}

The entity that has the identifier "postcard-5" does not have files. The entity that has files could also have another identity, which might not be a URI, if it was migrated from a previous system.
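A minimal Python sketch of that split, for illustration only. The class names `PhysicalObject` and `DigitalObject`, the field names, and the helper `find_by_identifier` are hypothetical and not part of any HyBox or PCDM API; the values are taken from the JSON above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class DigitalObject:
    """Stands in for the pcdm:Object / bf:Instance: the entity that has files."""
    uri: str
    files: List[str] = field(default_factory=list)
    # Optional non-URI identity, e.g. one carried over from a previous system.
    legacy_id: Optional[str] = None


@dataclass
class PhysicalObject:
    """Stands in for the bf:Work: described at a URI, identified by a catalog number."""
    uri: str
    local_id: str
    instances: List[DigitalObject] = field(default_factory=list)


def find_by_identifier(
    work: PhysicalObject, value: str
) -> Union[PhysicalObject, DigitalObject, None]:
    """Return whichever entity carries the given non-URI identifier, if any."""
    if work.local_id == value:
        return work
    for instance in work.instances:
        if instance.legacy_id == value:
            return instance
    return None


scan = DigitalObject(
    uri="http://hybox.repo.org/objects/2",
    files=["/files/1.jpg", "/files/2.jpg"],
    legacy_id="mysql://postcards/1",
)
postcard = PhysicalObject(
    uri="http://hybox.repo.org/objects/1",
    local_id="postcard-5",
    instances=[scan],
)
```

With this shape, looking up "postcard-5" resolves to the physical description, which has no files, while the migrated identifier resolves to the digital object that does.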

azaroth42 commented 8 years ago

The other factor that is convincing for me (at least) is the potential for multiple digital copies that are explicitly derived from a single physical object. If you want to maintain both digital copies, and they should be tied to the same physical object, then the physical object needs its own entry in the system, separate from each of the two digitized versions.

{
  "id": "http://hybox.repo.org/objects/1/",
  "type": "bf:Work",
  "created": "1990-11-20",
  "hasInstance": [
    {
      "id": "http://hybox.repo.org/objects/2",
      "type": "pcdm:Object",
      "created": "2015-03-10"
    },
    {
      "id": "http://hybox.repo.org/objects/201231",
      "type": "pcdm:Object",
      "created": "2019-06-20"
    }
  ]
}

ggeisler commented 8 years ago

I'm not sure I'm the one to provide a firm answer, but I will point out that more than one of our interviewees mentioned wanting a system that could manage both physical and digital objects. So that, and the examples @azaroth42 provided above, lead me to think we do want to distinguish between physical and digital objects.

hannahfrost commented 8 years ago

+1 for the ability to distinguish physical and digital objects.

WRT the questions above, I say:

If that's not enough to close the issue, then I'll need clarification on these questions:

cc @guegueng

azaroth42 commented 8 years ago

I think it's enough to close :) Another potential requirement that came to me -- the physical object might be in a physical collection (with a digital description), and the digital object not in that physical collection but in others (either physical or digital). Without distinguishing the two, that would be impossible.

mjgiarlo commented 8 years ago

I'm wondering if CONTENTdm users, for example, are used to making this distinction between physical and digital objects. If not, I might be concerned that we're coming up with a model that confuses our targeted users, though it's worth noting that there's the model and there's the UI/UX around the model, where the latter can do wonders to translate a potentially confusing or complex model into something that users can find enjoyable.

no-reply commented 8 years ago

I'm :+1: on closing this & opening a new ticket to tackle this set of requirements.

I'd like to call out (echoing @mjgiarlo's comment, which came in while I was writing this) that such a model would ideally be flexible enough that the presence of "real world"/"physical" objects isn't a burden to users interested in simple descriptions of digital resources.

Supporting "real world" objects seems more attractive as a well supported extension to a model than as a requirement.

mjgiarlo commented 8 years ago

I'm down with that, @no-reply.

hannahfrost commented 8 years ago

:+1: @no-reply . I didn't mean to imply that surfacing all of this in HyBox was a requirement. Just that the model should support it.

mjgiarlo commented 8 years ago

:clap:

no-reply commented 8 years ago

Answering one of @hannahfrost's Q's mainly as an academic exercise:

I don't think that the multiple physical / multiple digital copies situation alters this requirement, but I'm a little fuzzy on that scenario.

Mainly, the cardinality of the relationship between them affects the model. If you can assume a one-to-one relationship, collapsing the physical and digital objects is more viable. It may still lead to semantic overload and confusing predicates, but it's serviceable.
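The cardinality point can be sketched as a small check, again only as an illustration. The function name `can_collapse` and the dictionary layout (physical object URI mapped to a list of digital object URIs) are assumptions made for this example, not part of any existing model or tool.

```python
from typing import Dict, List


def can_collapse(instances_by_work: Dict[str, List[str]]) -> bool:
    """Return True only when the physical-to-digital relationship is one-to-one.

    Every physical object must have exactly one digital copy, and no digital
    copy may be shared between two physical objects.
    """
    seen = set()
    for digital_uris in instances_by_work.values():
        if len(digital_uris) != 1:
            return False
        if digital_uris[0] in seen:
            return False
        seen.add(digital_uris[0])
    return True


# One physical postcard with a single digitized copy.
one_to_one = {
    "http://hybox.repo.org/objects/1": ["http://hybox.repo.org/objects/2"],
}

# The same postcard digitized twice, as in the earlier example in this thread.
one_to_many = {
    "http://hybox.repo.org/objects/1": [
        "http://hybox.repo.org/objects/2",
        "http://hybox.repo.org/objects/201231",
    ],
}
```

Under these assumptions, `can_collapse(one_to_one)` is true and `can_collapse(one_to_many)` is false, which is the case where a merged record can no longer say which copy a property such as "created" belongs to.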

no-reply commented 8 years ago

Closing per https://github.com/hybox/models/issues/12#issuecomment-195098747.

Opened #13.

jpstroop commented 8 years ago

Since I was cc'd, and to perhaps state the obvious....

The end-user generally cares about the creator, date, etc. (descriptive metadata) of the physical thing and not, e.g., the date it was digitized. It's a surrogate for the Real World Object.

The repo manager probably does care about the distinction in the form of technical and provenance (preservation) metadata. So yes, you need some distinction, but having the object model distinguish between the physical and digital could easily get out of hand in some httpRange-14 or FRBR (choose your plane of madness) kind of way.

I think PCDM gets you enough to make the distinctions you need to make: desc metadata about the Object as though it were the real thing, and tech and prov metadata about the files. I'd find it unnecessarily difficult to make any more distinction than that.

-Js

Sent via mobile. Please excuse typos, brevity, etc.

On Mar 10, 2016 6:27 PM, "Michael J. Giarlo" notifications@github.com wrote:

I'm wondering if CONTENTdm users, for example, are used to making this distinction between physical and digital objects. If not, I might be concerned that we're coming up with a model that confuses our targeted users, though it's worth noting that there's the model and there's the UI/UX around the model, where the latter can do wonders to translate a potentially confusing or complex model into something that users can find enjoyable.


jpstroop commented 8 years ago

... and the world moved on while I typed away with my thumbs.

-Js

Sent via mobile. Please excuse typos, brevity, etc.

azaroth42 commented 8 years ago

though it's worth noting that there's the model and there's the UI/UX around the model,

This. There can still be a single web form with text boxes and dropdowns; they'll just populate triples with multiple subjects :)
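A rough sketch of what that could look like in Python, using plain tuples rather than an RDF library. The form field names, the `FIELD_MAP` routing table, the function `form_to_triples`, the title value, and the choice of `dcterms:` predicates are all assumptions made for illustration; only the URIs and dates come from the examples earlier in the thread.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical routing table: form field -> (which subject it describes, predicate).
FIELD_MAP = {
    "title": ("physical", "dcterms:title"),
    "date_created": ("physical", "dcterms:created"),
    "date_digitized": ("digital", "dcterms:created"),
}


def form_to_triples(
    form: Dict[str, str], physical_uri: str, digital_uri: str
) -> List[Triple]:
    """Turn one flat form submission into triples about two different subjects."""
    subjects = {"physical": physical_uri, "digital": digital_uri}
    # Link the two entities so the digital copy stays tied to the physical object.
    triples: List[Triple] = [(physical_uri, "bf:hasInstance", digital_uri)]
    for name, value in form.items():
        if name not in FIELD_MAP or not value:
            continue  # ignore unknown or empty fields
        target, predicate = FIELD_MAP[name]
        triples.append((subjects[target], predicate, value))
    return triples


triples = form_to_triples(
    {
        "title": "Postcard",
        "date_created": "1990-11-20",
        "date_digitized": "2015-03-10",
    },
    "http://hybox.repo.org/objects/1",
    "http://hybox.repo.org/objects/2",
)
```

The person filling in the form sees three ordinary fields; the two "created" dates end up on different subjects without the user ever having to think about the physical/digital split.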