data2health / contributor-attribution-model

A simple data model to represent contributions made by agents to research artifacts

'Artifact' scope questions #12

Open mbrush opened 5 years ago

mbrush commented 5 years ago

In the Draft Spec document, Anne raised some interesting questions about how the model might be used to describe things like archeological artifacts, fossils, and biological specimens. Many such entities are naturally occurring, but altered by agents as they become research specimens. As such, attributes such as dateCreated may need additional qualification if they are to apply to these things.

We should consider if these types of entities are in scope, and if so how to adjust the model/documentation accordingly.

And on a related note, should we remove any constraints that the model is meant to cover only research/scholarly artifacts? I think this was the original scope, but it may be unnecessarily limiting.

diatomsRcool commented 5 years ago

I definitely think we need to constrain ourselves to objects (physical or digital) that are involved in research or scholarly activity. Can we remove the "intent" from the creation of the object? Objects can be created for a variety of reasons outside of research and then become part of the research process. We don't care why someone created it. It just has to be used in research or in a scholarly activity. Then it's in scope. Does that make sense?

mbrush commented 5 years ago

I agree here and would propose removing the part of the definition suggesting 'creation for a particular use', such that the definition of Artifact simply reads: "a physical or digital entity created by an agent".

But this still leaves the question of things like biological specimens - these were not created originally by an Agent, but their collection and tracking as a specimen was done by a human.

If we are to include such objects in scope, we should provide language in the definition and/or documentation to be clear here. e.g. that collection of a specimen such as a leaf constitutes creation of a new artifact - in that it becomes stored and described as part of some collection, and gains new provenance and uses in research activities.

ahwagner commented 5 years ago

Maybe "a physical or digital entity created or catalogued by an agent"?

mbrush commented 5 years ago

Update: I modified the definition and description text based on feedback above.

The def now reads:

"A physical or digital entity created, collected, modified, or cataloged by an agent."

The description includes the following text:

"Artifacts are the products of agent-driven activities, and represent things to which Contributions are made. Here we are primarily concerned with artifacts created or used in research and scholarly activities. This may include ‘natural’ specimens (e.g. a dinosaur fossil, an arctic ice core sample) or man-made archaeological artifacts (e.g. prehistoric human tool fragments), that are modified and/or cataloged for research purposes."

Finally, I added text about how to handle creation and modification dates for natural and archaeological artifacts in an "Implementation Notes" subsection below the Artifact IM table and examples. Copied below, but please review and comment in the gdoc.

Many natural or archaeological artifacts originate outside of a research setting, and are only collected and documented as specimens much later (e.g. a dinosaur tooth fossil, or prehistoric tool fragments). Here, dateCreated SHOULD be used to record the date such specimens were collected, not the date they originally came into existence (which may have been thousands or millions of years ago). Similarly, dateModified SHOULD record when modifications were last made to the specimen in a research/academic context (e.g. samples extracted for analysis). A related issue to consider is the fact that the CDM can describe contributions made to physical specimens and to catalog records describing such specimens. We leave it to implementations to decide what type of artifact they want to track, and to do this in a sensible and consistent way. An exception is natural specimens that are observed and cataloged but not physically collected or modified in the process - here implementations SHOULD describe contributions to the catalog record.
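To make the date guidance above concrete, here is a rough sketch of how an implementation might apply it. This is only an illustration, not part of the spec: the `Artifact` class, its field names, and the `record_specimen` / `record_modification` helpers are all hypothetical, and the collection and sampling dates are invented for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Artifact:
    """Hypothetical minimal artifact record (not the spec's actual schema)."""
    name: str
    # Mirrors dateCreated: when the specimen was collected for research,
    # not when the object originally came into existence.
    date_created: date
    # Mirrors dateModified: last modification made in a research context.
    date_modified: Optional[date] = None


def record_specimen(name: str, collected_on: date) -> Artifact:
    """Create an artifact record for a natural or archaeological specimen."""
    return Artifact(name=name, date_created=collected_on)


def record_modification(artifact: Artifact, modified_on: date) -> Artifact:
    """Keep date_modified at the most recent research-context modification."""
    if modified_on < artifact.date_created:
        raise ValueError("a research modification cannot precede collection")
    if artifact.date_modified is None or modified_on > artifact.date_modified:
        artifact.date_modified = modified_on
    return artifact


# Invented example: a fossil collected in 1998 and sampled for analysis in 2015.
tooth = record_specimen("dinosaur tooth fossil", date(1998, 7, 14))
record_modification(tooth, date(2015, 3, 2))
```

The fossil's age (millions of years) does not appear anywhere in the record. Only the collection date and the latest research modification are tracked, which is the behavior the note above recommends.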

Is this an improvement that addresses some of the concerns above and commented on in the spec doc? Additional suggested changes?

mbrush commented 5 years ago

On a related note - should we bother providing dateCreated and dateModified attributes at all, given that we aim here to provide only a minimal, generic artifact model? I think it is useful because these attributes are foundational and universal, but are subject to nuanced considerations (see above). Including these in our spec (with the option of ignoring or extending the model in these areas) gives us the chance to highlight these nuances and promote consistent specification of these attributes.

@mellybelly looking for your take in particular.

diatomsRcool commented 5 years ago

This addresses my concerns.