ualbertalib / HydraNorth

This repo is deprecated. Succeeded by https://github.com/ualbertalib/jupiter. This codebase was an IR built on Samvera/Sufia.

Review the specs from Datacite for DOI, make decisions about metadata requirements #1271

Closed weiweishi closed 7 years ago

weiweishi commented 7 years ago

Metadata requirements:

We will make #592 a separate discussion to work through the implications of implementing date_published as a required field. We will use a missing value for year_created in the initial implementation. More thorough thinking is required for Sufia 7 (#1272).

We will also

johnhuck commented 7 years ago

The missing value code that we recommend using is :unav (see the Processing Instructions column on this sheet in the Data Dictionary). A detailed analysis of the available options leading to this choice is found here.

Another detail in the Processing Instructions column that would be good to double check is that multiple values for the author field are being formatted properly in the API call. I think they must be separated by semi-colons like this: datacite.creator: creator1; creator2; creator3. See the first table under the Metadata Profiles section of the API documentation. It's an instruction for the ERC profile but I think it applies to all profiles that are submitted as inline commands and not XML attachments.
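A minimal sketch of the two formatting rules above, assuming the metadata is sent as inline `name: value` pairs. Only `datacite.creator`, the semicolon separator, and `:unav` come from this discussion; the function names, the dictionary layout, and the use of `datacite.publicationyear` for year_created are illustrative assumptions, not HydraNorth code.

```python
UNAVAILABLE = ":unav"  # missing-value code recommended in the thread


def format_creators(creators):
    """Join creator names with '; ', or return :unav when none are usable."""
    cleaned = [name.strip() for name in creators if name and name.strip()]
    return "; ".join(cleaned) if cleaned else UNAVAILABLE


def inline_metadata(creators, year_created=None):
    """Build 'name: value' lines for the creator and year elements."""
    fields = {
        "datacite.creator": format_creators(creators),
        # Assumed element name for the year; falls back to :unav when absent.
        "datacite.publicationyear": str(year_created) if year_created else UNAVAILABLE,
    }
    return [f"{name}: {value}" for name, value in fields.items()]
```

A real request would also need whatever character escaping the EZID API documentation requires; that step is left out of this sketch.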

johnhuck commented 7 years ago

Related to the first bullet point is whether the value chosen for publisher should be stored as part of the metadata proper or only supplied in the API calls. If the element is added to the metadata proper (I think it's there in our model but unused and therefore not displayed) and were editable there, I think we would have to accept whatever value was supplied by the user, with a default value of our choosing for when users don't supply a value.

If this would not be a desirable outcome (i.e., if the stakeholders think that ERA is performing the function of publishing the specific content that it holds), it may be better not to store a publisher value with the metadata proper, but only to supply it for the EZID API calls and related things like the OAI. Or at least not make it an editable property.

sfarnel commented 7 years ago

This is certainly worth talking through from the admin side and end user side, where it could well be confusing.

Sharon Farnel Metadata Coordinator University of Alberta Libraries sharon.farnel@ualberta.ca 780-492-3685

leahvanderjagt commented 7 years ago

We definitely do not want to be publisher for any of the ERA IR materials. That would be something we do not want to assert and I expect user confusion. We aren't capturing anything like publisher that I can think of for IR. Even the journal articles only have citation, not original publisher. Not sure about other domains like data or digitization, but I assume more complexity there - but if the scope of the current question is only ERA IR, I feel like I can confidently say we shouldn't be calling ourselves publisher in any way in any manifestation w/in ERA-in-HydraDAMS. I've copied Geoff, he can provide thoughts.

Leah

Leah Vanderjagt, Digital Repository Services Coordinator, University of Alberta, t. 780.492.3851, leahv@ualberta.ca

johnhuck commented 7 years ago

@leahvanderjagt et al.: I could have been more clear in emphasizing that all the questions around how to handle the new requirement to supply a value for publisher should rightly be decided by the key stakeholders, and this may take some time for discussion to arrive at a considered solution, because publisher is not a metadata property that we have used up to this point, as you observe, and so the simplest solutions have some drawbacks to be weighed.

DataCite's metadata is, I believe, harvested by aggregators and so it is worth considering that the metadata we register with them will likely be propagated to other places.

For context in these discussions, I recommend consulting DataCite's own documentation, as this is the agency that ultimately issues the DOIs via EZID. In particular: p.10 for an overview of the properties, including mandatory properties; and p.14 for 'guidance on handling missing mandatory property values'.

https://schema.labs.datacite.org/meta/kernel-4.0/doc/DataCite-MetadataKernel_v4.0.pdf

sfarnel commented 7 years ago

Publisher in DataCite is defined somewhat more broadly than we may think of it when we think of 'publishing' in the traditional sense.

"The name of the entity that holds, archives, publishes, prints, distributes, releases, issues, or produces the resource. This property will be used to formulate the citation, so consider the prominence of the role"

Some good examples in DataCite from IRs use a format that may make it clearer that the role of the repository is more archiving or distributing than 'publishing' per se. For example: "Digital Repository at the University of Maryland".

And there is the option, as John notes, of indicating that the value is not included for one reason or another.

Certainly worth discussion to think through pros and cons.

leahvanderjagt commented 7 years ago

Thanks for clarifying - looking forward to our mtg on Wednesday to discuss.

leahvanderjagt commented 7 years ago

We've decided to take a blanket approach and populate the DOI publisher field with "University of Alberta Libraries" for everything that's currently in the repository.

As I understand it though, we still want to have a look at this when we enter Peel territory where publisher may be important. Do I have that right?

Support for this reasoning from Joan Starr: That said, I would probably lean toward option #1, although I agree it’s not ideal for those items that may already have a publisher. If you have a way to harvest that information somehow, of course that would be a neat solution. DataCite defines “publisher” in this way: “The name of the entity that holds, archives, publishes, prints, distributes, releases, issues, or produces the resource.” The University of Alberta’s institutional repository certainly meets that definition.
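The blanket approach could be sketched as follows, assuming the publisher is supplied only when the DOI metadata is assembled and is not stored on the item itself. The function name and the `datacite.publisher` key are illustrative assumptions rather than the project's implementation.

```python
DEFAULT_PUBLISHER = "University of Alberta Libraries"


def with_publisher(doi_metadata, publisher=None):
    """Return a copy of the DOI metadata with a publisher value filled in.

    The item's own metadata is left untouched; the default is only applied
    to the payload that would be sent for DOI registration.
    """
    payload = dict(doi_metadata)
    payload["datacite.publisher"] = publisher or DEFAULT_PUBLISHER
    return payload
```

The optional `publisher` argument leaves room for collections such as Peel, where a distinct publisher may turn out to matter.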

sfarnel commented 7 years ago

Yes, agreed; as we are thinking about moving non-IR materials into ERA we will need to examine publisher in the descriptive metadata and how it relates to 'publisher' for purposes of DOI registration.

johnhuck commented 7 years ago

@leahvanderjagt @sfbetz A question of clarification for you.

In our meeting a couple of weeks ago with @sfarnel, we discussed the need to ensure that the DOIs we assign through ERA are clearly distinguished from DOIs for previously published versions of the resource.

Right now we use dcterms:isVersionOf to let users provide a "Citation for previous publication", which, naturally, may include a DOI.

Would the stakeholders like to see a separate field for the previous-publication DOI, or is the current arrangement where it can be provided with the citation preferred?

Obviously, the simplest option would be to continue with the current arrangement, but I wanted to make sure I understood the directions you gave us coming out of that meeting.

Thank you!

sfbetz commented 7 years ago

@johnhuck @leahvanderjagt My recommendation would be to continue using the "citation for previous publication" for the previously published DOI. I think this is the best way to make clear the distinction between the DOI for the object in ERA, and the DOI for the published version. Seeing the previously published DOI in context of a citation should help.

sfarnel commented 7 years ago

This seems reasonable to me.

leahvanderjagt commented 7 years ago

I agree. An outstanding question for me though is: to what extent have we placed a DOI in the 'link to related item' field? I can't seem to find an example but we record and collect these in ERA mediated deposit. Metadata folks, is there information available from any previous audit that could tell us how many items we have that have a DOI link in the 'link to related item' field?

sfarnel commented 7 years ago

@anayram do you have any data on this from past audits? (this is a great example of something that the triplestore will be very helpful for :)

anayram commented 7 years ago

@sfarnel @leahvanderjagt We can pull info from Solr on how many times we have had DOI links in isVersionOf fields, and that will include data up to the present. Would that help, or do we only need DOIs captured in old ERA? If so, I can pull that info from the migration packages.

sfarnel commented 7 years ago

Thanks @anayram Can you pull current data from Solr? Thanks!

sfarnel commented 7 years ago

Wait; sorry. What we need is instances of a DOI in dcterms:relation

anayram commented 7 years ago

Right, thanks! I'll use relation then

anayram commented 7 years ago

@sfarnel

Sorry! Just checked the data dictionary and it indicates that the field "Citation for previous publication" is mapped to dcterms:isVersionOf, while "Link to related item" is mapped to dcterms:relation. Should I pull DOI occurrences from both fields?

leahvanderjagt commented 7 years ago

I am not worried about the citation for previous publication field, just the link to related item field - if that helps! :)

anayram commented 7 years ago

It does, thanks @leahvanderjagt !

johnhuck commented 7 years ago

@leahvanderjagt @sfbetz , Thanks for your direction on this question. I think it is a good option. Thanks @anayram for checking dct:relation. It will be good to get the full picture. If there are DOIs in that field, though, it's still an appropriate use of the field.

Sharon and I are going to recommend that we use a custom local predicate for the ERA-minted DOIs, like we have done for other identifiers like the ARKs, so whatever the case may be, it shouldn't interfere with our implementation.

johnhuck commented 7 years ago

I am going to edit @Weiwei's first comment to add the information about the decisions that we have made. I think we have covered everything, so the dev team should be able to proceed.

@sfbetz and @leahvanderjagt: We've decided to use "University of Alberta Libraries" as the value for publisher. I am going to assume that the spelled-out form I've used here is the form we would like to use, but if it isn't, please add a comment with the preferred form.
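For the dev team, here is a minimal sketch of how the decisions recorded in this thread could be applied when assembling the inline metadata for a DOI request. It is not HydraNorth code: the function names `datacite_fields` and `to_anvl` are hypothetical, and the percent-encoding is a simplified assumption about the line-based `name: value` format used for inline submissions. What it takes from this thread is the publisher value, the `:unav` missing-value code, and the semicolon-separated creator list.

```python
# Hypothetical helpers for illustration only; not part of HydraNorth.
PUBLISHER = "University of Alberta Libraries"  # publisher value decided in this thread
MISSING = ":unav"  # missing-value code recommended earlier in this thread


def datacite_fields(title, creators, year=None):
    """Build the inline datacite.* fields for one item."""
    return {
        "datacite.title": title or MISSING,
        # Multiple creators are joined with semicolons: creator1; creator2; creator3
        "datacite.creator": "; ".join(creators) if creators else MISSING,
        "datacite.publisher": PUBLISHER,
        # Fall back to the missing-value code when no year is recorded.
        "datacite.publicationyear": str(year) if year else MISSING,
    }


def to_anvl(fields):
    """Serialize the fields as 'name: value' lines, one per field."""
    def escape(value):
        # Assumed simplification: percent-encode '%' and line breaks so that
        # each value stays on a single line.
        return value.replace("%", "%25").replace("\n", "%0A").replace("\r", "%0D")

    return "\n".join(f"{name}: {escape(value)}" for name, value in fields.items())


if __name__ == "__main__":
    print(to_anvl(datacite_fields("Sample thesis", ["Smith, A.", "Jones, B."])))
```

Keeping the publisher as a constant in the request-building code, rather than reading it from the item's metadata, matches the earlier suggestion that the value might only be supplied in the API calls and not stored as an editable property.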

sfarnel commented 7 years ago

Thanks all!

anayram commented 7 years ago

228 DOIs are captured in dcterms:relation ("Link to related item" field) in current ERA.

Just for reference in case it is also useful, there are 331 dcterms:isVersionOf fields that contain one or more DOIs ("Citation for previous publication" field).

johnhuck commented 7 years ago

Thanks, @anayram ! Very interesting. Good to know that managing external DOIs in the metadata is a real issue to think about.

Do the 331 dct:isVersionOf elements contain only the DOI or a citation as well (or a mix of both)?

It's perhaps a little beyond the scope of the current task, but we could think about ways to guide users when they choose which field to enter an external DOI in, if we want to encourage a more consistent approach.

leahvanderjagt commented 7 years ago

Thank you all very much for such quick work on this.

This raises a usability question for us. I think there is potential for confusion for people trying to cite ERA objects, since we will be presenting two DOIs on individual item pages. However, Sonya and I expect that our instructions for DOI placement on the item page, and the accompanying text, will help reduce that possibility. So I am going to give the instruction that no special handling is needed on the metadata side, other than to confirm my earlier wish to keep DOIs in the citation field (do not extract them).

I am torn about what to do with new articles we add in mediated deposit that have existing DOIs (nearly all our articles are deposited by HelpDesk staff). Many publishers require a link to the original article as part of their compliance requirements for deposit. I like to provide a DOI since it is persistent, and I feel it is therefore cleaner metadata. But this conflicts with my instructions to you, so I need to think over future practice and the instructions we give to those who self-deposit, as you point out, @johnhuck. I agree those decisions are out of scope for the current question.

anayram commented 7 years ago

@johnhuck all isVersionOf fields contain one or more citations that include one or more DOIs. Example:

Lattice Boltzmann simulations of a single n-butanol drop rising in water Komrakova, A. E. and Eskin, D. and Derksen, J. J., Physics of Fluids (1994-present), 25, 042102 (2013), DOI:http://dx.doi.org/10.1063/1.4800230

A.E. Komrakova, Orest Shardt, D. Eskin, J.J. Derksen, Lattice Boltzmann simulations of drop deformation and breakup in shear flow, International Journal of Multiphase Flow, Volume 59, February 2014, Pages 24-43, ISSN 0301-9322, http://dx.doi.org/10.1016/j.ijmultiphaseflow.2013.10.009. (http://www.sciencedirect.com/science/article/pii/S0301932213001547) Keywords: Drop deformation and breakup; Lattice Boltzmann method; Binary liquid model; Peclet and Cahn numbers
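Counts like the ones above can be reproduced by scanning each field value for DOI-like strings. The sketch below is one way to do that and is not the query that was actually run. The function name `extract_dois` is hypothetical, and the regular expression is a simplified assumption (the `10.` prefix, a 4 to 9 digit registrant code, a slash, then any run of non-space characters) that will not cover every valid DOI.

```python
import re

# Simplified, assumed pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"<>]+")


def extract_dois(text):
    """Return the distinct DOI-like strings found in text, in order of appearance."""
    found = []
    for match in DOI_PATTERN.finditer(text):
        # Drop punctuation that usually belongs to the sentence, not the DOI.
        doi = match.group(0).rstrip(".,;)")
        if doi not in found:
            found.append(doi)
    return found


if __name__ == "__main__":
    citation = (
        "Physics of Fluids (1994-present), 25, 042102 (2013), "
        "DOI:http://dx.doi.org/10.1063/1.4800230"
    )
    print(extract_dois(citation))  # ['10.1063/1.4800230']
```

Because the suffix is matched up to the next whitespace, a DOI that is immediately followed by more text with no space between them will absorb the extra characters. Results from free-text citation fields would need a manual check.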

leahvanderjagt commented 7 years ago

Just a note that this is very normal/expected for theses, many of which have chapters that were published as article series before they were collated into a thesis/dissertation.

leahvanderjagt commented 7 years ago

Sorry, didn't reply-all. Just a note that this is very normal/expected for theses, many of which have chapters that were published as article series before they were collated into a thesis/dissertation.

sfarnel commented 7 years ago

Certainly worth thinking about how to clarify this for users. We had talked in the past about parsing the components of the citation text for a variety of reasons, so this could be incorporated into that discussion. I'm sure there are ways of making the "Link to related item" aspect clearer for users. Another topic for later discussion.

johnhuck commented 7 years ago

@leahvanderjagt , I would be happy to assist as you think about how to handle and display the existing DOIs. It is out of scope, but closely connected to the in-scope task, so it makes sense to at least be thinking it through now.

The earlier discussion on this GitHub issue (https://github.com/ualbertalib/HydraNorth/issues/1037) about making dct:isVersionOf (Citation for previous publication) repeatable is relevant here too, as it would give more flexibility to handle different scenarios.

I can think of a couple of ways to re-use the existing dct:relation and/or dct:isVersionOf fields, possibly with slightly modified labels to achieve the goal of making it easier for users to supply a citation, or citations, or a DOI or all of the above. So we could explore some ideas in that area. Flexibility and clarity for the user seem to me the key factors here.

leahvanderjagt commented 7 years ago

Yes, I agree re: flexibility and clarity for the user. I've asked Sonya a question about this from a UX perspective, and we will get back to this conversation once we've had a chance to discuss. Cheers, Leah

pbinkley commented 7 years ago

@weiweishi I think we should close this, and open an issue for the remaining cleanup work you're doing on items that didn't receive DOIs. Does that make sense? @leahvanderjagt @johnhuck do your comments on what to do with e.g. thesis chapters with pre-existing DOIs need to be captured into a new issue?

johnhuck commented 7 years ago

@pbinkley I had to reread this thread to remind myself what the discussion was about.

To summarize, the main issue was how to manage DOIs for previously published versions: what predicates to use and how to label them appropriately so users wouldn't be confused. We use dct:isVersionOf for "Citation for previous publication" and dct:relation for "Link to related item." Mariana found that 228 items had a DOI in a dct:relation field, so we know some external DOIs are recorded there and others as part of the citation for previous publication. I believe Leah agreed with continuing this practice, as long as the UAL DOI was labeled clearly.

I don't know what the situation in HydraNorth2 looks like. I guess I would leave it to Leah to create a new issue if there are concerns about the display.

So I don't think there is anything to carry over from that. The metadata team would probably still like the reports that Weiwei references in the first comment. I don't know if that may already be captured as part of the new cleanup issue. Cheers!