dcmi / usage

DCMI Usage Board - meeting record and decisions

DCMI class addition Proposal: "Digital3DResource" - Justification & Rationale #105

Closed AdamRountrey closed 1 year ago

AdamRountrey commented 2 years ago

I am posting this on behalf of the Audubon Core 3D Imagery and Data Task Group:

We propose a term addition to the DCMI type vocabulary, to be used as a value for dc:type/dcterms:type. Specifically, the term “Digital3DResource” is needed as 3D resources are a targeted search class that does not clearly fit under a single existing category definition (e.g., “dataset”, “image”, or “interactiveResource”). This leads to uncertainty when assigning a term and difficulty when trying to locate 3D resources.

Term Name: Digital3DResource
URI: http://purl.org/dc/dcmitype/Digital3DResource
Label: Digital 3D Resource
Definition: A binary file or collection of files primarily intended to hold information about the three-dimensional geometry (surface or volume) of a real or non-real object, set of objects, or scene.
Comment: Such files can be used by software to digitally render views of the subject, make measurements, conduct analyses, and create physical 3D replicas. Files or collections of files intended to be used to compute a three-dimensional geometry (e.g., X-ray projections for computed tomography scans or photograph sets for photogrammetry) are also included. For the avoidance of doubt, 2D renderings (views) produced from a Digital3DResource should not be included in this class, but stereo image pairs, anaglyphs, and other formats that hold information about 3D geometry may be included.
Type of Term: Class
Member Of: http://purl.org/dc/terms/DCMIType
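To make the term description concrete, here is a minimal sketch of how it might be written as RDF triples, plus one resource typed with it through dcterms:type. It assumes the term would be modelled like existing DCMI Type terms (rdfs:Class, rdfs:label, dcam:memberOf); the resource URI and the function names are hypothetical, and the term itself is only proposed in this issue, not approved.

```python
# Sketch only: Digital3DResource is the term *proposed* in this issue,
# not an approved member of the DCMI Type Vocabulary.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
DCAM_MEMBER_OF = "http://purl.org/dc/dcam/memberOf"
DCTERMS = "http://purl.org/dc/terms/"
PROPOSED_TERM = "http://purl.org/dc/dcmitype/Digital3DResource"


def term_declaration():
    """Triples restating the proposal's term description (assumed modelling)."""
    return [
        (PROPOSED_TERM, RDF_TYPE, RDFS + "Class"),
        (PROPOSED_TERM, RDFS + "label", '"Digital 3D Resource"'),
        (PROPOSED_TERM, DCAM_MEMBER_OF, DCTERMS + "DCMIType"),
    ]


def typed_as_3d(resource_uri):
    """Triple assigning the proposed term to a resource via dcterms:type."""
    return (resource_uri, DCTERMS + "type", PROPOSED_TERM)


def to_ntriples(triples):
    """Serialize as N-Triples; objects that start with a quote are literals."""
    lines = []
    for s, p, o in triples:
        obj = o if o.startswith('"') else f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical resource URI, used only for illustration.
    scan = "https://example.org/specimens/1234/mesh"
    print(to_ntriples(term_declaration() + [typed_as_3d(scan)]))
```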

DISCUSSION:

Why not use an existing DCMI type vocabulary term or informally use a non-DCMI term?

Existing values from the DCMI type vocabulary for dc:type do not allow adequate categorization and description of digital 3D resources. The proposed terms are also contrasted with the definition of dc:format, which is similarly limited. Our justification for adding new fields to more easily distinguish among the variety of 3D resource types is as follows:

  1. Two of the DCMI Type Vocabulary terms used in dc:type, “Image” and “Dataset”, fit some, but not all, 3D resource types.
     a. The definition of “Image” (http://purl.org/dc/dcmitype/Image) allows for a broad range of “visual representations” currently, and this could be interpreted to include 3D resources, such as meshes and volume data. A creator may understand a 3D model of a real physical object as a kind of “image”, but this interpretation could vary widely in the community. In addition, although interaction with 3D resources is often visual, they also contain geometric and structural data that is used in non-visual ways.
     b. The definition of “Dataset” (http://purl.org/dc/dcmitype/Dataset) may be an appropriate category for certain scientific and technical 3D resources (e.g., CT scans of museum specimens or LIDAR scans of historical buildings), but many types of 3D resources are more expressive, artistic, or interpretative in nature and would not rightly be called “datasets” or expressions of facts. “Dataset” is also too broad a term to allow users searching specifically for 3D resources to effectively discover them.
  2. Informally using “Digital3DResource” as an unsanctioned term in dc:type within our limited community would not be compatible with general-purpose repositories that limit dc:type entries to the controlled vocabulary. In practice, this means that using the term informally will not increase interoperability or discoverability through aggregation or shared indexing. Since increasing interoperability and discoverability is the primary goal of adding the term, its inability to enrich aggregator databases argues against this solution.
  3. The definition for dc:format (http://purl.org/dc/elements/1.1/format) and controlled vocabulary within Audubon Core (https://ac.tdwg.org/format/) capture the file extension, which does not always reflect the encoding of a file’s contents in a technical or more qualitative sense. For example, a ZIP file may contain a CT dataset or Photogrammetry image file set. The issue is common to video and audio file formats as well – e.g., the video content in an MP4 file needs to be encoded/decoded using one of a variety of codecs: h.264, MPEG-4, Apple ProRes 422, etc.
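The container problem in point 3 can be illustrated with a small sketch. The records below are hypothetical, and the type values follow the proposal as originally worded (CT and photogrammetry sets count as Digital3DResource); the helper name is also made up for the example.

```python
# Hypothetical records: all three are delivered as ZIP archives, so the
# format value is identical even though the contents differ.
records = [
    {"id": "ct-scan-001", "format": "application/zip", "type": "Digital3DResource"},
    {"id": "photo-set-002", "format": "application/zip", "type": "Digital3DResource"},
    {"id": "field-notes-003", "format": "application/zip", "type": "Text"},
]


def find(records, field, value):
    """Return the ids of records whose `field` equals `value`."""
    return [r["id"] for r in records if r[field] == value]


if __name__ == "__main__":
    # Filtering on format cannot separate 3D material from anything else zipped.
    print(find(records, "format", "application/zip"))
    # Filtering on a type value can, provided the value exists in the vocabulary.
    print(find(records, "type", "Digital3DResource"))
```

The sketch says nothing about whether the value should live in the DCMI Type Vocabulary; it only shows why the container format alone does not answer a search for 3D resources.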

@magpiedin, @baskaufs

kcoyle commented 2 years ago

It looks to me that your Digital3dResource is a narrower type of dcmitype:Dataset. Adding a separate Digital3dResource type would lose that semantic relationship. Creating a class of Digital3dResource that is subclassed to dcmitype:Dataset would retain that.

This doesn't mean that I am advocating for a dcmitype:Digital3dResource that is subclassed to dcmitype:Dataset. I would actually prefer that the Digital3dResource class be defined outside of dcmitype because I think it opens up a can of worms:

  1. I don't think that we can address all of the sub-types that could be needed
  2. I don't think that there is any one taxonomy of sub-types that serves everyone's needs.
  3. I also think this is complicated by the fact that dataset type is not always the same as dataset usage - so you may have a dataset that is a list of auto parts and a dataset that is a list of citations and a dataset that generates a Digital3dResource. All of these have different potential uses and I'm not sure that "type" covers this.

sruehle commented 2 years ago

I'm sorry, but I would rather vote against this proposal:

  1. The term is very specific and, in my opinion, doesn't quite fit into the generic approach of the DCMI type vocabulary.
  2. The way it is defined, it's primarily about "files". These can only be interpreted as 3D objects through various applications. Therefore, the object type Digital3DResource somehow doesn't seem right to me. Like Karen, I'm not sure this is even an object type.
  3. In my opinion, such specific proposals from the community only raise expectations that we cannot fulfil. Rather, we should see if there are other vocabularies that address this issue and, if necessary, support the community in providing persistent identifiers for these vocabularies.

tombaker commented 2 years ago

@sruehle @kcoyle Many thanks for the thoughtful reviews. We are of course not yet voting on this; rather, we are deciding whether to take it on, and so far we have come up with good reasons not to take it up for more formal review.

Personally, specific merits of the proposal aside, I share the reluctance to go down this path, in part for the precedent it would set, as we would surely get other proposals for extending the Type vocabulary, and it would take considerable effort for us to do this in a coherent way.

I will leave this issue open for comments and discussion until Thursday, July 28. Unless anyone wants to offer strong arguments in favor of putting the proposal on our agenda, I will by default close the issue at that time.

AdamRountrey commented 2 years ago

I am again posting on behalf of the Audubon Core 3D Imagery and Data Task Group:

Thank you for your thoughtful replies. We consider Digital3DResource to be a fundamental type of the same rank as Image, Sound, or Software. The fact that instances of this type are digital should not require them to be characterized as datasets, just as digital images, digital sound recordings, and software are not considered datasets. Calling an artist’s digital 3D sculpture a dataset conflates the concept of format with that of type, and it would be an inaccurate representation of the resource.

The criticism that a dataset used to generate a 3D resource might not appropriately fit in this type is valid. We propose removing that particular subcategory and using the following definition and comment:

Definition: A binary file or collection of files primarily intended to hold information about the three-dimensional geometry (surface or volume) of a real or non-real object, set of objects, or scene.
Comment: Such files can be used by software to digitally render views of the subject, make measurements, conduct analyses, and create physical 3D replicas. For the avoidance of doubt, 2D renderings (views) produced from a Digital3DResource should not be included in this class.

Again, we do not feel that the proposed term is any more specific than image or sound. The creation and curation of 3D resources is occurring across nearly all academic disciplines, industries, and interests. 3D resources are ubiquitous and require proper identification in general-purpose repositories, which often restrict characterization to the DCMI Type Vocabulary. Furthermore, the creation of many terms similar to Digital3DResource in many different, discipline-specific metadata schemata unnecessarily complicates the landscape, decreases interdisciplinary accessibility, and impedes long-term preservation. The broad usage of this type of resource is, in fact, the reason our group approached Dublin Core with the initial proposal. Thanks again for your consideration.

tombaker commented 2 years ago

@AdamRountrey Thank you for the clarifications to the definition.

Definition aside, the key issue here is whether the Usage Board wants to expand the Type Vocabulary and, if so, whether it wants to review individual proposals submitted from the community. The Usage Board has approved one-off proposals in the past only to find that this attracted yet more individual proposals - sometimes with the unrealistic expectation that a proposal should be accepted absent good reasons NOT to accept.

Reactions so far in this case, on-list and off, have been along the lines of @kcoyle and @sruehle - i.e., for various reasons, to avoid further expansion of the Type Vocabulary, independently of the merits of the proposal at hand. (I could add that the Usage Board has a backlog of other pressing issues to triage, so we must think carefully before asking for everyone's attention for a formal review.)

While it is settled that we will not formally review the proposal for inclusion in the Type Vocabulary, I will leave the issue open in case anyone has further comments or feedback on the substance of the revised proposal.

HughP commented 2 years ago

@AdamRountrey Would the use of <format> and a MIME type clarify a sub-class of dataset for your applications?
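One possible reading of this suggestion, sketched under assumptions: keep the type as Dataset and rely on the format value carrying a media type from the IANA "model" top-level tree (for example model/gltf-binary or model/stl). The records and the helper name are hypothetical.

```python
# Hypothetical records, all typed as Dataset; only the format value differs.
records = [
    {"id": "mesh-001", "type": "Dataset", "format": "model/gltf-binary"},
    {"id": "table-002", "type": "Dataset", "format": "text/csv"},
    {"id": "print-003", "type": "Dataset", "format": "model/stl"},
    {"id": "bundle-004", "type": "Dataset", "format": "application/zip"},
]


def is_3d_by_media_type(record):
    """True when a Dataset's media type sits under the 'model' top-level type."""
    top_level = record["format"].split("/", 1)[0]
    return record["type"] == "Dataset" and top_level == "model"


if __name__ == "__main__":
    print([r["id"] for r in records if is_3d_by_media_type(r)])
```

Under this reading, bundle-004 would not be found even if the archive held a mesh, since a container media type hides what is inside.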

tombaker commented 1 year ago

This issue has been resolved.