subugoe / metacheck

Automatically check metadata compliance for hybrid open access (OA).
http://subugoe.github.io/metacheck/
GNU Affero General Public License v3.0

Feature request: Article types #70

Closed zuphilip closed 3 years ago

zuphilip commented 3 years ago

It would be helpful to also have fine-grained article type information, e.g. research article vs. book review. This is important because the latter would not be eligible under DEAL. However, it might be that this information is not part of the Crossref data and can only be found on the publisher's side, so I don't know if you can extract it from there. Covering just some large publishers, like the ones in DEAL, would already be helpful.

Here is an example search on Wiley showing different article types:

[Image: screenshot of the Wiley search results]

maxheld83 commented 3 years ago

@zuphilip here are some developments tangentially relevant to this:

For now, as per #180, we only accept type="journal-article" for metacheck. This limitation was always tacitly in place, and will become documented once #168 goes live.

This means that going forward, you will receive a report telling you which DOIs were not articles and were therefore omitted. Something like this (see the last line; the report is currently in German only and is translated here):

  • Of these, 290 (100%) meet the criterion not_missing (0 excluded)
  • Of these, 288 (99%) meet the criterion unique (2 excluded)
  • Of these, 288 (100%) meet the criterion within_limits (0 excluded)
  • Of these, 285 (99%) meet the criterion doi_org_found (3 excluded)
  • Of these, 285 (100%) meet the criterion resolvable (0 excluded)
  • Of these, 285 (100%) meet the criterion from_cr (0 excluded)
  • Of these, 285 (100%) meet the criterion from_cr_cr (0 excluded)
  • Of these, 285 (100%) meet the criterion cr_md (0 excluded)
  • Of these, 284 (100%) meet the criterion article (1 excluded)
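To make the last criterion concrete, here is a minimal sketch of how a type filter of this kind could work. It is not metacheck's actual implementation: `split_by_article_type()` is a hypothetical helper, the input is assumed to be a data frame with `doi` and `type` columns (as in the `data` element returned by `rcrossref::cr_works()`), and the second input row is a made-up placeholder.

```r
# Hypothetical helper; not metacheck's actual implementation.
# `works` is assumed to be a data frame with `doi` and `type` columns,
# shaped like the `data` element returned by rcrossref::cr_works().
split_by_article_type <- function(works, accepted = "journal-article") {
  stopifnot(is.data.frame(works), all(c("doi", "type") %in% names(works)))
  # A missing type counts as "not an article".
  is_article <- !is.na(works$type) & works$type %in% accepted
  list(
    kept = works$doi[is_article],
    omitted = works$doi[!is_article]
  )
}

# Illustrative input: the first DOI and its type come from this thread,
# the second row is a made-up placeholder.
works <- data.frame(
  doi = c("10.1111/spol.12593", "10.9999/placeholder"),
  type = c("journal-article", "book-chapter"),
  stringsAsFactors = FALSE
)
res <- split_by_article_type(works)
```

Here `res$omitted` holds the DOIs that a report like the one above would list as excluded by the article criterion.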

This doesn't yet, per se, address the core of your request, namely sub-type information. More on that in the next comment (trying to keep things organised ☺️).

maxheld83 commented 3 years ago

As you perhaps suspected, the Crossref type field, and therefore our report above, does not represent your DOIs in sufficient granularity:

x <- c("10.1111/soc4.12784", "10.1111/spol.12496", "10.1111/spol.12593")
rcrossref::cr_works(x)[["data"]][["type"]]
#> [1] "journal-article" "journal-article" "journal-article"

My hunch would be this is because the proper crossref peer review submission process wasn't used for the DOI in question. If it had been used, the type should have been "peer-review" indeed.

So there is, in fact, an interesting inconsistency here.

Our problem for metacheck is that we'd have to have some structured data source that tells us that x[3] is, in fact, a "Book Review". AFAIK, there is no such data source in crossref:

x <- c("10.1111/soc4.12784", "10.1111/spol.12496", "10.1111/spol.12593")
res <- rcrossref::cr_works(x[3])[["data"]]
str(res)
#> tibble [1 × 33] (S3: tbl_df/tbl/data.frame)
#>  $ alternative.id        : chr "10.1111/spol.12593"
#>  $ archive               : chr "Portico"
#>  $ container.title       : chr "Social Policy & Administration"
#>  $ created               : chr "2020-03-18"
#>  $ deposited             : chr "2020-05-10"
#>  $ published.print       : chr "2020-05"
#>  $ published.online      : chr "2020-03-17"
#>  $ doi                   : chr "10.1111/spol.12593"
#>  $ indexed               : chr "2020-05-11"
#>  $ issn                  : chr "0144-5596,1467-9515"
#>  $ issue                 : chr "3"
#>  $ issued                : chr "2020-03-17"
#>  $ member                : chr "311"
#>  $ page                  : chr "526-527"
#>  $ prefix                : chr "10.1111"
#>  $ publisher             : chr "Wiley"
#>  $ score                 : chr "1"
#>  $ source                : chr "Crossref"
#>  $ reference.count       : chr "0"
#>  $ references.count      : chr "0"
#>  $ is.referenced.by.count: chr "0"
#>  $ subject               : chr "Development,Sociology and Political Science,Public Administration"
#>  $ title                 : chr "Youth, diversity and employment: Comparative perspectives on labour market policies, Rune Halvorsen and Bjørn H"| __truncated__
#>  $ type                  : chr "journal-article"
#>  $ update.policy         : chr "http://dx.doi.org/10.1002/crossmark_policy"
#>  $ url                   : chr "http://dx.doi.org/10.1111/spol.12593"
#>  $ volume                : chr "54"
#>  $ language              : chr "en"
#>  $ short.container.title : chr "Soc Policy Adm"
#>  $ assertion             :List of 1
#>   ..$ : tibble [1 × 6] (S3: tbl_df/tbl/data.frame)
#>   .. ..$ value      : chr "2020-03-17"
#>   .. ..$ order      : int 2
#>   .. ..$ name       : chr "published"
#>   .. ..$ label      : chr "Published"
#>   .. ..$ group.name : chr "publication_history"
#>   .. ..$ group.label: chr "Publication History"
#>  $ author                :List of 1
#>   ..$ : tibble [1 × 4] (S3: tbl_df/tbl/data.frame)
#>   .. ..$ given           : chr "Jennifer"
#>   .. ..$ family          : chr "Shore"
#>   .. ..$ sequence        : chr "first"
#>   .. ..$ affiliation.name: chr "University of Mannheim Mannheim Germany"
#>  $ link                  :List of 1
#>   ..$ : tibble [4 × 4] (S3: tbl_df/tbl/data.frame)
#>   .. ..$ URL                 : chr [1:4] "https://api.wiley.com/onlinelibrary/tdm/v1/articles/10.1111%2Fspol.12593" "https://onlinelibrary.wiley.com/doi/pdf/10.1111/spol.12593" "https://onlinelibrary.wiley.com/doi/full-xml/10.1111/spol.12593" "https://onlinelibrary.wiley.com/doi/pdf/10.1111/spol.12593"
#>   .. ..$ content.type        : chr [1:4] "application/pdf" "application/pdf" "application/xml" "unspecified"
#>   .. ..$ content.version     : chr [1:4] "vor" "vor" "vor" "vor"
#>   .. ..$ intended.application: chr [1:4] "text-mining" "text-mining" "text-mining" "similarity-checking"
#>  $ license               :List of 1
#>   ..$ : tibble [2 × 4] (S3: tbl_df/tbl/data.frame)
#>   .. ..$ date           : chr [1:2] "2020-03-17" "2020-03-17"
#>   .. ..$ URL            : chr [1:2] "http://onlinelibrary.wiley.com/termsAndConditions#vor" "http://doi.wiley.com/10.1002/tdm_license_1.1"
#>   .. ..$ delay.in.days  : int [1:2] 0 0
#>   .. ..$ content.version: chr [1:2] "vor" "tdm"

So, unless/until we find such a structured data source against which we could compare the crossref type, I think this is out of scope for us.
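If such a source ever turned up, the comparison itself would be simple. The following is only a sketch under that assumption: the `subtypes` table is entirely hypothetical (no such source is known to exist), and its single row just restates what this thread says about x[3]. The Crossref types are the ones shown above.

```r
# Crossref types as reported in this thread.
cr <- data.frame(
  doi = c("10.1111/soc4.12784", "10.1111/spol.12496", "10.1111/spol.12593"),
  type = "journal-article",
  stringsAsFactors = FALSE
)

# Hypothetical structured source of fine-grained types. No such source is
# known to exist; the single row reflects what the thread says about x[3].
subtypes <- data.frame(
  doi = "10.1111/spol.12593",
  subtype = "Book Review",
  stringsAsFactors = FALSE
)

# Left join so that DOIs without subtype information are kept (subtype = NA).
merged <- merge(cr, subtypes, by = "doi", all.x = TRUE)

# Flag records that Crossref calls an article but the other source does not.
merged$flagged <- !is.na(merged$subtype) & merged$subtype == "Book Review"
```

The left join matters here: DOIs without a subtype stay in the result with `NA` rather than being dropped silently, so missing coverage in the second source remains visible.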

Feel free to re-open.