dcmi / dc-srap

Scholarly Resources Application Profile working group

Publication status #11

Closed juhahakala closed 2 years ago

juhahakala commented 3 years ago

Proposed DCMI Metadata Terms: http://purl.org/dc/terms/status

Label: Status

The condition or state of the described resource.

Note: Typically used with a controlled vocabulary of statuses.

SRAP: The publication state of the described scholarly resource.

In SRAP, the recommended best practice is to use a status from the OpenAIRE vocabulary for publication versions (https://guidelines.openaire.eu/en/latest/literature/field_publicationversion.html):

• info:eu-repo/semantics/draft
• info:eu-repo/semantics/submittedVersion
• info:eu-repo/semantics/acceptedVersion
• info:eu-repo/semantics/publishedVersion
• info:eu-repo/semantics/updatedVersion

Subproperty of: Type (http://purl.org/dc/elements/1.1/type) (http://purl.org/dc/terms/type)

Note: OpenAIRE uses the Type property for these codes. Since they all relate to the publication status of a publication, we considered that there is a need for a Type subproperty, Status.

kcoyle commented 3 years ago

FaBiO vocab has a number of status properties: date received, date accepted, date preprint disseminated, date retracted (very important!). But this brings up another issue about status: There is the status of the document being described, but there are disciplines that record in the metadata for a published document the dates of the various "steps", such as submission and acceptance. These are used to establish who published research first. So there's "status of this" and "date of status markers". I think these would need to be separate properties.

See: https://sparontologies.github.io/fabio/current/fabio.html#dataproperties

juhahakala commented 3 years ago

FaBiO conflates "status of this" and "date of status marker". If a resource has a submission date, the existence of that date in the metadata record indicates the status of the document at the specified time. This is an attractive solution, because most publication statuses are transitory - the code "submitted" would normally be superseded by something else. But "date submitted" will remain valid even after the resource has been approved for publication.
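To make the "a date implies a status" reading concrete, here is a minimal sketch in Python. The record layout, the key names (`dateSubmitted`, `dateAccepted`, `datePublished`), the status labels and the function `status_at` are all hypothetical; they are not FaBiO or SRAP terms and only illustrate how a status at a given time could be derived from stored dates.

```python
from datetime import date

# Hypothetical mapping from a date key to the status it implies.
# These names are illustrative only, not FaBiO or SRAP terms.
STATUS_FOR_DATE = {
    "dateSubmitted": "submitted",
    "dateAccepted": "accepted",
    "datePublished": "published",
}


def status_at(record, when):
    """Return the status implied by the latest date marker on or before `when`.

    Returns None if no known date marker had been reached by `when`.
    """
    reached = [
        (marker_date, key)
        for key, marker_date in record.items()
        if key in STATUS_FOR_DATE and marker_date <= when
    ]
    if not reached:
        return None
    _, latest_key = max(reached)  # latest date wins
    return STATUS_FOR_DATE[latest_key]


record = {
    "dateSubmitted": date(2021, 1, 10),
    "dateAccepted": date(2021, 3, 2),
    "datePublished": date(2021, 5, 20),
}

print(status_at(record, date(2021, 2, 1)))
print(status_at(record, date(2021, 6, 1)))
```

In this sketch the dates never expire: `dateSubmitted` stays in the record after publication, while the derived status moves on as later dates are added.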

We might consider analyzing the status-related dates from FaBiO, and adopting them if deemed relevant. Currently we have taken on board just date retracted as a proposed element.

juhahakala commented 3 years ago

In a WG meeting on 2021-09-21 we decided to make the scope of this property significantly narrower. New proposal:

Proposed DCMI Metadata Terms: http://purl.org/dc/terms/publicationStatus

Label: publicationStatus

State of the publication in its publication process.

Note: Typically used with a controlled vocabulary of statuses.

SRAP: The publication status of the described scholarly resource.

There are at least two statuses missing from the OpenAIRE vocabulary: retracted and lost. The WG will also consider whether the COAR Version Types vocabulary (https://vocabularies.coar-repositories.org/version_types/) would be a better choice.

juhahakala commented 3 years ago

OpenAIRE no longer maintains its vocabulary of publication versions. Instead, it recommends using the COAR Version Types vocabulary (https://vocabularies.coar-repositories.org/version_types/).

From the SRAP point of view, this vocabulary covers the statuses of a living document well, but if a publication no longer exists and the metadata record acts as a tombstone, there is no way to indicate why the document is not available. The following three new statuses have been proposed to the COAR Editorial Board: replaced, retracted and lost.

kcoyle commented 3 years ago

Strawman for DCT property

Term Name: publicationStatus

URI: http://purl.org/dc/terms/publicationStatus
Label: Publication Status
Definition: The stage of the resource in the publishing workflow
Comment: Recommended practice is to use a value from a standard list.
Type of Term: Property

Things I am not sure of:

Also, note that schema.org has "creativeWorkStatus" - "The status of a creative work in terms of its stage in a lifecycle. Example terms include Incomplete, Draft, ..." I borrowed "stage" from here because I couldn't think of a good term.

juhahakala commented 3 years ago

Both COAR Version Types vocabulary and schema.org creativeWorkStatus are limited in the sense that they do not cover what happens to a scholarly work after it is no longer available. Since metadata may (and sometimes must, as in the DOI system) persist after the identified resource is no longer available, we need a set of "post mortem" statuses. As a COAR Editorial Board member, I have sent the following list of them to the chairs of the board:

and suggested that the board should discuss this in the meeting. Adding them could be tricky, not least because these are not version types, unlike the other values in the vocabulary. We regard them as publication statuses, and as such these post mortem values would also be OK.

Note: The list was updated 2021-10-21, based on the discussion in the SRAP WG meeting. "Revised" became "replaced", and the status "missing" was added.

tombaker commented 2 years ago

@osma You are right (below) - discussion moved to https://github.com/dcmi/dc-srap/issues/20#issuecomment-1075289142

osma commented 2 years ago

@tombaker I think the date properties are more relevant to issue #20 (unless we want to merge these two issues)

juhahakala commented 2 years ago

Publication statuses proposed in the DC-SRAP meeting on 2022-03-22 are the following:

• draft
• preprint
• postprint
• versionOfRecord
• updated

If publication status is provided, one and only one value shall be given (since a resource is not Schrödinger's cat; it cannot be both preprint and postprint at the same time). However, the entire publication history may be stored in the metadata record as dates; even when the publication status is versionOfRecord, metadata can still provide the date when the preprint of the resource was submitted. This matters because in science it is sometimes necessary to prove who was the first to make some results or ideas available to the scientific community.

versionOfRecord is the final, published version of a resource. It may be published on the Web in advance of its formal publication in the printed / electronic serial. Two date subproperties, datePublished and dateAheadOfPrint, are required to describe these dates. An updated version of a resource may require these two date subproperties as well.

Date subproperties dateMissing and dateLost may be relevant to all publication statuses. dateRetracted shall be used only if the publication status is versionOfRecord or updated.
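The two rules above (at most one status value, and dateRetracted only with versionOfRecord or updated) could be checked mechanically. The following is a minimal sketch in Python: the status and date names are the ones proposed in this comment, but the record layout (a dict whose `publicationStatus` key holds a list of values) and the function `validate` are assumptions made for illustration, not part of SRAP.

```python
# Status values proposed in the 2022-03-22 meeting (from the comment above).
STATUSES = {"draft", "preprint", "postprint", "versionOfRecord", "updated"}

# Statuses with which dateRetracted may be used (from the comment above).
RETRACTABLE = {"versionOfRecord", "updated"}


def validate(record):
    """Return a list of rule violations for a metadata record.

    Assumed layout: a dict in which "publicationStatus" (optional) holds
    a list of status values and date subproperties are plain keys.
    """
    errors = []
    statuses = record.get("publicationStatus", [])

    # Rule 1: if a status is provided, one and only one value is allowed.
    if len(statuses) > 1:
        errors.append("more than one publicationStatus value")

    for status in statuses:
        if status not in STATUSES:
            errors.append(f"unknown publicationStatus: {status}")

    # Rule 2: dateRetracted requires status versionOfRecord or updated.
    if "dateRetracted" in record and not any(s in RETRACTABLE for s in statuses):
        errors.append("dateRetracted used without versionOfRecord or updated")

    return errors


ok = {
    "publicationStatus": ["versionOfRecord"],
    "dateAheadOfPrint": "2021-04-01",
    "datePublished": "2021-05-20",
    "dateRetracted": "2022-02-01",
}
two_statuses = {"publicationStatus": ["preprint", "postprint"]}
bad_retraction = {"publicationStatus": ["preprint"], "dateRetracted": "2022-02-01"}

print(validate(ok))
print(validate(two_statuses))
print(validate(bad_retraction))
```

Keeping the status single-valued while leaving the dates unrestricted reflects the split described above: the status says where the resource is now, and the dates carry its history.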

kcoyle commented 2 years ago

If versionOfRecord is always the published version, why not call it published? This is the publication status, and that you consider this the version of record is separate from that. I also think that calling it "published" will be immediately understandable, while "version of record" doesn't say "publication status" to me, so it might confuse people. I could imagine someone defining an arXiv-deposited research write-up as the version of record for that work. It may never be published, yet it could be considered "official" enough to be read and cited.

osma commented 2 years ago

I have to agree with @kcoyle - the other status names are immediately obvious, but versionOfRecord is not.

juhahakala commented 2 years ago

I used versionOfRecord because the term is precise. An updated version of a resource is also published, and people may think that postprints and preprints are also published once they have been made available on the Web. I agree that versionOfRecord is not immediately obvious, but it is nevertheless widely used in publishing, and in the application profile we can explain what it refers to. Translations of the term may or may not be immediately obvious, but at least the Finnish term (julkaisuversio) will be easy for Finnish speakers to understand. But if the WG decides that the name must be less cryptic, I prefer publication to published (and update to updated).

juhahakala commented 2 years ago

There has been no further discussion about this issue, so I have decided to follow Karen and Osma's recommendation and propose the following publication statuses and related guidelines:

• draft
• preprint
• postprint
• publication
• update

Publication refers to the fully copyedited, typeset and formatted copy of a manuscript as published; also known as the version of record (see https://en.wikipedia.org/wiki/Version_of_record).

Update refers to an updated version of a publication. There is no limit on how many times a publication may be updated.

There are no additional publication statuses for publications which are missing or lost. In metadata, such a situation is indicated with dateMissing and dateLost.

juhahakala commented 2 years ago

Publication statuses as of 2022-04-05, after active discussion in the WG meeting:

• public draft
• submitted manuscript
• preprint
• postprint
• publication
• update

Date subproperties related to these statuses:

• dateAvailableAsPublicDraft
• dateReceivedAsManuscript
• dateSubmittedAsPreprint
• dateSubmittedAsPostprint
• dateAccepted
• dateAheadOfPrint
• datePublished
• dateUpdated
• dateRetracted
• dateMissing
• dateLost

See https://github.com/dcmi/dc-srap/blob/main/terms/date_subproperties.md for definitions and usage guidelines.

juhahakala commented 2 years ago

Publication statuses agreed on in the 14th SRAP WG meeting, May 3rd:

• public draft
• submitted manuscript
• preprint
• postprint
• publication
• updated publication

These publication statuses will be submitted to the Usage Board once the WG has completed the profile.