pkp / pkp-lib

The library used by PKP's applications OJS, OMP and OPS, open source software for scholarly publishing.
https://pkp.sfu.ca
GNU General Public License v3.0

Improve Publication Date #1369

Closed bozana closed 4 years ago

bozana commented 8 years ago

The publication date should be improved in OJS 3.0:

ghost commented 8 years ago

It should be possible to distinguish between publication dates of different versions of the same paper, if needed. One example is a paper that was published in print in the past and republished online later (often under different terms, such as CC licenses). For citation/indexing purposes (e.g. Google Scholar), however, the original year should be used. Compare this post: http://forum.pkp.sfu.ca/t/distinguish-online-publication-date/15741/2

bozana commented 8 years ago

@asmecher, would we use the same ONIX list in OJS too? I believe it is not needed, right? If not, what extra dates should we consider for users to be able to choose from?

asmecher commented 8 years ago

I agree that ONIX isn't a good match -- anywhere we don't have to introduce it, we shouldn't!

I wonder if JATS has a good set of standard dates that we could borrow from. At a glance it has a date element with a free-form date-type field that's less than helpful -- examples include received, etc., but this isn't a controlled vocabulary; there is also a pub-date element, which is more specific. @axfelix, any suggestions here? Is there a de-facto controlled vocab in the JATS tag set for dates?

axfelix commented 8 years ago

Both date and pub-date allow day/month/year/season as you've noticed. Beyond that the most specificity you'll get is the date-type attribute of date and the publication-format attribute of pub-date. Both of the attributes have "suggested usage" which isn't a controlled vocabulary per se but does prescribe some values; this is the typical JATS approach to avoid blocking validation on free-form strings:

http://jats.nlm.nih.gov/archiving/tag-library/1.1/attribute/date-type.html http://jats.nlm.nih.gov/archiving/tag-library/1.1/attribute/publication-format.html

ghost commented 7 years ago

Maybe the publication date of the printed version should be optional, as there is not always a need to distinguish between these two types of dates. Both dates should also be included in the Crossref export metadata (if both are present). As for DOAJ, I am not sure if they support these two types of dates. I asked them, but I didn't get an answer.

ghost commented 7 years ago

A few more thoughts: how is this going to be implemented for an upgrade? Which publication date type is going to be the default? Will there be an option, at least at the issue level, to switch between publication date types?

bozana commented 7 years ago

From the JATS approach:

> ...there are two attributes that can be used to describe history dates and publication dates. The @date-type attribute records the type of event in the lifecycle, such as “accepted”, “published”, or “revised” dates. In addition, the @publication-format attribute records the type of format or media, for example, “print” or “electronic”, to which the event happened.

For @publication-format:

> Values may include such formats as “print”, “electronic”, “video”, “audio”, “ebook”, “online-only”.

The @date-type values (accepted, corrected, pub, preprint, retracted, received, rev-recd, rev-request) we already save in a different way during the workflow, so here we are only interested in @date-type = pub, i.e. the published date. We can therefore ignore this attribute.

I am not sure we need all these examples for the @publication-format attribute. I would say that we only take "print" and "electronic" for now.
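
For illustration, the two dates discussed here ("print" and "electronic", both with date-type "pub") could be serialized roughly as follows. This is only a sketch: the helper name `pub_date_element` is hypothetical and not part of OJS, and nothing in this thread settles how OJS would actually emit JATS.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def pub_date_element(publication_format: str, year: int,
                     month: Optional[int] = None,
                     day: Optional[int] = None) -> ET.Element:
    """Build a JATS <pub-date> element (hypothetical helper, not OJS code)."""
    if day is not None and month is None:
        raise ValueError("a day without a month is not a meaningful date")
    el = ET.Element("pub-date", {
        "date-type": "pub",                        # only published dates here
        "publication-format": publication_format,  # "print" or "electronic"
    })
    # Children are emitted in the order day, month, year; day and month
    # are optional, so a year-only print date is possible.
    if day is not None:
        ET.SubElement(el, "day").text = f"{day:02d}"
    if month is not None:
        ET.SubElement(el, "month").text = f"{month:02d}"
    ET.SubElement(el, "year").text = str(year)
    return el


electronic = pub_date_element("electronic", 2017, 3, 14)
print_only_year = pub_date_element("print", 1923)

print(ET.tostring(electronic, encoding="unicode"))
print(ET.tostring(print_only_year, encoding="unicode"))
```

The sample dates (2017-03-14 and 1923) are made up for the example.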

Similar to OMP, we could have the "Publication Dates" grid: [image: publicationdatesgrid]

And then a select and an input element, something like: [image: publicationdateadd]

Then the only question is how to fill it, i.e. should the electronic pub date initially be set to the scheduled date?

I think that the electronic date could be the default and required one -- OJS is software for electronic publishing :-) Thus, when migrating from OJS 2.4.x, the current date published would be the electronic one. For now I would not consider any setting option until we see that it is really needed. Would this be OK in your scenario @piotreba?

@asmecher, OK so?

THANKS all!!!

ghost commented 7 years ago

Each article should have at least one publication date type, e.g. "online" publication date type as a default, with an option to set a second publication date ("print"). This is particularly important for printed back issues, whose original "print" publication year should also be present in OJS (along with a second "online" publication date). OJS is often used for publishing back issues that were originally published as printed editions. In my case I have issues back to 1923 published in OJS, and I wonder what workflow would be required to set two publication date types for the back issues. At the least, some simple SQL to run directly against the database would be desirable. And would it be possible to use only the YEAR instead of a full date? It is not always possible to identify the exact date of publication. The year could be taken from the issue year (which is not used now for the publication date, but just for describing the issue, is that right?)

bozana commented 7 years ago

> Each article should have at least one publication date type, e.g. "online" publication date type as a default, with an option to set a second publication date ("print").

Yes, this would be the case with the solution above...

> In my case I have issues back to 1923 published in OJS, and I wonder what workflow would be required to set two publication date types for the back issues. At the least, some simple SQL to run directly against the database would be desirable.

How are you dealing with that now? -- OJS 2.4.x does not have an option to enter those "print" publication dates. Thus, for the migration from 2.4.x we cannot do anything, because OJS does not have any knowledge of those dates, I think. We will consider both dates in the native import/export XML for 3.x. A manual SQL statement would work if you had, for example, the same "print" date for all articles of an issue (so you do not have to insert a different value for each article) or any kind of logic (so that it could be done programmatically). Here too, the question is how and from where those dates would be known.
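
As a rough sketch of the "same print date for all articles of an issue" case: the schema below (`articles`, `article_pub_dates`) is invented for the example and is not the OJS schema, and SQLite is used only so the snippet is self-contained.

```python
import sqlite3

# Hypothetical, simplified schema; real OJS tables differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (article_id INTEGER PRIMARY KEY, issue_id INTEGER);
    CREATE TABLE article_pub_dates (
        article_id INTEGER,
        format     TEXT,   -- 'electronic' or 'print'
        pub_date   TEXT,   -- ISO 8601 date string
        PRIMARY KEY (article_id, format)
    );
    INSERT INTO articles VALUES (1, 10), (2, 10), (3, 11);
""")


def set_print_date_for_issue(conn, issue_id, print_date):
    """Give every article of one issue the same 'print' publication date."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            """
            INSERT OR REPLACE INTO article_pub_dates (article_id, format, pub_date)
            SELECT article_id, 'print', ? FROM articles WHERE issue_id = ?
            """,
            (print_date, issue_id),
        )


set_print_date_for_issue(conn, 10, "1923-01-01")
rows = conn.execute(
    "SELECT article_id, format, pub_date FROM article_pub_dates ORDER BY article_id"
).fetchall()
print(rows)
```

The single `INSERT ... SELECT` is what makes the per-issue case cheap: one statement covers all articles of the issue, and the article in issue 11 is left untouched.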

> And would it be possible to use only the YEAR instead of a full date? It is not always possible to identify the exact date of publication. The year could be taken from the issue year (which is not used now for the publication date, but just for describing the issue, is that right?)

OK, then we would probably need a date format field, and thus probably the same form as in OMP, i.e. something like:

[image: publicationdateadd-omp]

This would allow any format to be used, including just a year. I will have to check the consequences of allowing this option in other parts of the system... but I hope it would be manageable. @asmecher, what do you think?

Yes, the issue has a "year" metadata field that a user can enter, and date_published (the date when it is published in OJS), which is assigned automatically.

Eventually, this solution and the option to allow different "pub" dates could be considered for issues as well...

OK?

stranack commented 7 years ago

This would help me better display the expected publication date (based on print) with this collection of newly digitized but older books (some from the late 70s): http://monographs.lib.sfu.ca/index.php/sfulibrary/catalog/series/sfu-archaeology-press

I'd also agree that just displaying the year would be the best option.

ghost commented 7 years ago

For all back issues that were print, I completely removed any dates via direct SQL (no issue or article published date). For Crossref, before I removed all dates in the database, I also changed them via direct SQL so that the year was the year of the volume (with 01-01 standing in for month and day). Then, in the exported XML metadata for Crossref, I removed all "01-01" so that only the year was left, and I changed the publication type from online to print. Luckily, Crossref accepts year-only dates, in contrast to DOAJ, which unfortunately wants full dates (at least this was true some time ago when I was testing it). After these manipulations I removed all dates assigned to issues/articles (via direct database editing) that were published before OJS was implemented.
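
The manual clean-up of the Crossref export described above could be scripted along these lines. This is a sketch under assumptions: the sample XML is a stripped-down stand-in for a real export (real Crossref deposit files carry XML namespaces and far more metadata), and the function name `to_year_only_print` is made up.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for an exported record; no namespaces for brevity.
SAMPLE = """<journal_article>
  <publication_date media_type="online">
    <month>01</month>
    <day>01</day>
    <year>1923</year>
  </publication_date>
</journal_article>"""


def to_year_only_print(root: ET.Element) -> int:
    """Strip placeholder 01-01 month/day and mark the date as print.

    Returns the number of <publication_date> elements changed.
    Caveat: a genuine 1 January publication date would be stripped too.
    """
    changed = 0
    # list(...) so the tree is not modified while it is being traversed
    for pub_date in list(root.iter("publication_date")):
        month = pub_date.find("month")
        day = pub_date.find("day")
        is_placeholder = (
            month is not None and day is not None
            and month.text == "01" and day.text == "01"
        )
        if not is_placeholder:
            continue
        pub_date.remove(month)
        pub_date.remove(day)
        pub_date.set("media_type", "print")
        changed += 1
    return changed


root = ET.fromstring(SAMPLE)
changed = to_year_only_print(root)
print(changed, ET.tostring(root, encoding="unicode"))
```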

So, one case is being able to set a valid date type for current items (issues, articles...). Another is how to deal with back issues/articles. Obviously, relying on article-by-article editing is pointless when there are thousands of them, so it would be nice to provide at least some SQL that could be applied to edit dates and date types (I imagine that investing your time in developing dedicated interfaces for that may not be an optimal solution).

asmecher commented 7 years ago

Overall, since we're the tool generating e.g. JATS data, not the tool consuming it, we can implement as arbitrary a subset of the possibilities as we like. So far I agree that the publication dates are most important, and that OJS should automatically generate a digital publication date, with the option for a print publication date to be entered manually.

@stranack and @piotreba, how big of an issue has it been in practice if we only have e.g. the year of publication for older content? Is data poisoning a practical risk when entering e.g. 1970-01-01 instead of 1970?

I'm asking this because so far we only have a requirement for two dates, and I think the digital publication date can be recorded to the second in 99% of cases. Supporting an ONIX-like infrastructure with all the date format possibilities it offers seems like overkill when a second datepicker field for print publication date might be enough.

ghost commented 7 years ago

Alec, I would prefer to rely only on a year for print versions of back issues ("year-01-01" may be confusing, especially when there is more than one issue per year). Can a "date-time"-type field in SQL accept only a year? Is it possible to have more flexibility when entering dates in OJS? E.g. I like that the published date field is currently divided into 3 separate parts: month, day, year. Would it be a problem to allow selecting just a year (perhaps for any date type, to simplify development of the forms)? Then just a year would be present in the database (not "year-01-01"). Is that technically problematic (or against metadata standards)? Personally, I will certainly first try to edit dates where required through the database, not article-by-article via OJS forms, because of the number of articles. But I imagine that the database field types themselves should match what the OJS dashboard allows, for best compatibility: if I enter just a year for a print date type of an article published in, e.g., issue 3 of vol 40 directly via the database, and another editor then wants to edit the article via the OJS dashboard, s/he should be able to do that without needing to add a month and day to the publication date.

asmecher commented 7 years ago

SQL dates don't support varying levels of granularity, though the concept could be supported fairly easily with two fields: a timestamp/date, plus a mask field (e.g. yyyy, yyyymm, quarter, etc). This pattern might be useful enough to apply elsewhere. It's not so different from what ONIX supports but I think the ONIX pattern goes way too far for us (e.g. date spreads, Hijri calendar). I like OAI's granularityType a little better, though we'd need to add some new types.

This would end up with us storing the timestamps in second granularity in the database, presumably with some extra (incorrect) specificity, but chopping the extra off before displaying the data or providing it to anyone else.
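
A minimal sketch of that two-field pattern, assuming a small made-up set of masks (the "quarter" mask mentioned above is left out, and `MaskedDate` is a hypothetical name, not an existing PKP class):

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical mask vocabulary mapped to strftime patterns.
MASKS = {
    "yyyy": "%Y",
    "yyyymm": "%Y-%m",
    "yyyymmdd": "%Y-%m-%d",
}


@dataclass(frozen=True)
class MaskedDate:
    timestamp: datetime  # stored with full (partly meaningless) precision
    mask: str            # which part of the timestamp is actually known

    def display(self) -> str:
        """Chop off the extra specificity before showing or exporting."""
        return self.timestamp.strftime(MASKS[self.mask])


print_date = MaskedDate(datetime(1970, 1, 1, 0, 0, 0), "yyyy")
online_date = MaskedDate(datetime(2017, 3, 14, 9, 30, 0), "yyyymmdd")

print(print_date.display())   # year only
print(online_date.display())  # full date
```

The stored `1970-01-01 00:00:00` never leaves the system as such; only the masked form does, which is the point of keeping the mask next to the timestamp.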

ghost commented 7 years ago

Would the mask be flexible enough to accept both full dates (yyyy-mm-dd) and just a year? And would OJS forms allow just a year to be entered in a publication-related form? Would it all be compatible with metadata export plugins?

asmecher commented 7 years ago

(Consider surveying JATS4R to see whether it has a good list of dates.)

nils-stefan-weiher commented 6 years ago

I also think an ISO8601 date with a precision mask for year, month, day would be a good fit. The XML Schema specification has some bits about partial dates: https://www.w3.org/TR/xmlschema-2/#gYear
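
To make the idea concrete, here is a sketch of parsing year-only, year-month, and full ISO 8601 strings into a date plus a precision label. The function name `parse_partial_date` and the labels "year", "month", "day" are assumptions for this example; missing parts are filled with 1 purely as a storage convention.

```python
import re
from datetime import date
from typing import Tuple

# YYYY, YYYY-MM or YYYY-MM-DD; a day is only allowed together with a month.
_PARTIAL = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")


def parse_partial_date(text: str) -> Tuple[date, str]:
    """Return (date with missing parts set to 1, precision label)."""
    match = _PARTIAL.match(text)
    if match is None:
        raise ValueError(f"not a (partial) ISO 8601 date: {text!r}")
    year, month, day = match.groups()
    precision = "day" if day else "month" if month else "year"
    # date() itself rejects impossible values such as month 13.
    return date(int(year), int(month or 1), int(day or 1)), precision


for sample in ("1923", "1923-05", "1923-05-17"):
    print(sample, parse_partial_date(sample))
```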

Wikidata goes much beyond this: https://www.wikidata.org/wiki/Help:Dates#Inexact_dates But this is also overkill.

I would be very happy if this were implemented for OJS 3.2. :smile:

asmecher commented 6 years ago

@isgrim, the big problem with inexact dates is their representation in the database -- they'll need to be represented in a form that'll work with MySQL and PostgreSQL (and soon SQL Server), that'll still support sorting and comparisons. It'll take a little research. Have you worked with representing inexact ISO8601 in the DB?
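
One representation that keeps sorting and comparisons working with plain SQL is to store the earliest possible date in an ordinary date column and the precision in a second column. The sketch below only demonstrates this against SQLite with ISO 8601 strings; the table and column names are hypothetical, and behaviour on MySQL, PostgreSQL or SQL Server is not checked here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pub_dates (label TEXT, pub_date TEXT, date_precision TEXT)"
)
conn.executemany(
    "INSERT INTO pub_dates VALUES (?, ?, ?)",
    [
        ("vol 40 (print)", "1970-01-01", "year"),   # only the year is known
        ("online reissue", "2017-03-14", "day"),
        ("vol 12 (print)", "1942-06-01", "month"),  # year and month known
    ],
)

# Ordinary ORDER BY and range comparisons work on the stored date column.
ordered = [row[0] for row in conn.execute(
    "SELECT label FROM pub_dates ORDER BY pub_date"
)]
before_1950 = [row[0] for row in conn.execute(
    "SELECT label FROM pub_dates WHERE pub_date < ?", ("1950-01-01",)
)]
print(ordered)
print(before_1950)
```

The trade-off is that a year-only date sorts as if it were 1 January of that year, which may or may not be the desired ordering within a year.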

lilients commented 6 years ago

Hi everybody. Sorry I wasn't following this discussion earlier. I have been working on versioning of published articles in OJS for a while now, and for this I moved the publication date from published_submissions to submission_settings (see also the technical documentation in the PKP wiki). I have now created pull requests to add the changes to the master branch, so one of the wishes should be fulfilled soon:

> move them into the submission_settings table

NateWr commented 4 years ago

Closing this issue because it has been superseded by the versioning introduced in 3.2. For a discussion about how to identify different types of version dates, see https://github.com/pkp/pkp-lib/issues/4860.