apowers313 closed 7 years ago
Hi @apowers313,
Thanks for the message. It is indeed possible for some BRs to have no metadata at all, such as those you mentioned in your comment. This happens when Crossref returns no metadata for a bibliographic entry listed in the reference list of a paper.
For instance, consider paper https://w3id.org/oc/corpus/br/656. It contains several bibliographic entries, including https://w3id.org/oc/corpus/be/260. The content of this entry is:
Purdy S. Avoiding hospital admissions: what does the research evidence say? London: King’s Fund, 2010.
When this particular entry was ingested by the OpenCitations workflow, the call to the Crossref API did not return any reliable metadata about the bibliographic resource that the entry text refers to. Thus, a new bibliographic resource (i.e. https://w3id.org/oc/corpus/br/722) was created with no metadata associated. It is worth mentioning that, currently, the OpenCitations ingestion workflow does not apply any NLP-based extraction of paper metadata from free text, except for the identification of DOIs and URLs by means of regular expressions.
Thus, it is entirely possible that, among all the citations included in the OCC, some point to something that cannot be resolved by means of the Crossref API, e.g. when one cites a webpage.
However, these untitled resources are still linked to the papers that cite them by means of the related bibliographic entries. For instance, the SPARQL query I used to get the related bibliographic entries of the resource https://w3id.org/oc/corpus/br/722 is the following one:
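To illustrate the kind of regular-expression matching mentioned above, here is a minimal Python sketch. The two patterns and the function name extract_identifiers are assumptions made for illustration; they are not the expressions actually used by the OpenCitations workflow, and real-world DOI matching may need to be stricter or looser than this.

```python
import re

# Hypothetical patterns, for illustration only (not the OpenCitations ones).
# A DOI is assumed to start with "10.", a 4-9 digit registrant code, and "/".
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')
URL_PATTERN = re.compile(r'https?://[^\s"<>]+', re.IGNORECASE)

# Punctuation that usually belongs to the sentence, not to the identifier.
TRAILING_PUNCTUATION = '.,;:)]'


def extract_identifiers(entry_text):
    """Return the DOIs and URLs found in a free-text bibliographic entry."""
    dois = [m.rstrip(TRAILING_PUNCTUATION) for m in DOI_PATTERN.findall(entry_text)]
    urls = [m.rstrip(TRAILING_PUNCTUATION) for m in URL_PATTERN.findall(entry_text)]
    return {"dois": dois, "urls": urls}


# The entry discussed above contains neither a DOI nor a URL,
# so nothing can be extracted from it this way.
purdy = ("Purdy S. Avoiding hospital admissions: what does the research "
         "evidence say? London: King’s Fund, 2010.")
purdy_ids = extract_identifiers(purdy)

# A made-up entry that does carry both kinds of identifier.
sample = "Doe J. A title. doi:10.1000/xyz123. Available at http://example.org/page."
sample_ids = extract_identifiers(sample)
```

With these assumed patterns, the Purdy entry yields two empty lists, which is consistent with br/722 ending up without metadata.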
PREFIX cito: <http://purl.org/spar/cito/>
PREFIX biro: <http://purl.org/spar/biro/>
PREFIX c4o: <http://purl.org/spar/c4o/>
PREFIX frbr: <http://purl.org/vocab/frbr/core#>

SELECT ?citing ?bib_entry ?bib_entry_text {
    <https://w3id.org/oc/corpus/br/722>
        ^cito:cites ?citing ;
        ^biro:references ?bib_entry .
    ?bib_entry
        ^frbr:part ?citing ;
        c4o:hasContent ?bib_entry_text
}
Thus, in principle, it would be possible in the future to run the Crossref API again on all these entries to check whether it now returns metadata for them, since the Crossref service may have been updated in the meantime.
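A rough Python sketch of what such a re-check could look like is given below. It only builds the request URL for the Crossref REST API /works route with the query.bibliographic parameter and picks a result from an already-parsed JSON response; it does not perform the HTTP call. The function names and the min_score threshold are hypothetical, and the acceptance rule is an assumption, not the logic of the OpenCitations workflow.

```python
from urllib.parse import urlencode

CROSSREF_WORKS_ENDPOINT = "https://api.crossref.org/works"


def build_crossref_query_url(entry_text, rows=1):
    """Build a Crossref /works URL that searches for a free-text entry."""
    params = {"query.bibliographic": entry_text, "rows": rows}
    return f"{CROSSREF_WORKS_ENDPOINT}?{urlencode(params)}"


def pick_confident_match(response_json, min_score):
    """Return the top result if its relevance score reaches min_score.

    min_score is a hypothetical threshold: what counts as a reliable
    match is an assumption that would have to be tuned on real data.
    """
    items = response_json.get("message", {}).get("items", [])
    if not items:
        return None
    best = items[0]
    return best if best.get("score", 0) >= min_score else None


# Text as returned in ?bib_entry_text by the SPARQL query above.
url = build_crossref_query_url("Purdy S. Avoiding hospital admissions", rows=1)

# Made-up response fragments, shaped like a Crossref work list.
strong = {"message": {"items": [{"DOI": "10.1000/xyz123", "score": 80.0}]}}
weak = {"message": {"items": [{"DOI": "10.1000/xyz123", "score": 10.0}]}}
```

Entries for which pick_confident_match still returns None would simply stay as they are now, i.e. linked to a BR with no metadata.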
Thanks! In that case I guess we can close this out.
In running some queries, there are a number of BR entries that have neither a title nor a number. Some examples are gbr:722 and gbr:644. If I can count right, there are 1095323 of these records. A full list is here: BR-missing-title.csv.zip