texworld / betterbib

Command-line tools for bibliographies.

Generate DOI from arxiv URLs #270

Closed cmungall closed 9 months ago

cmungall commented 11 months ago

It should be trivial to derive a DOI from an arXiv URL, but betterbib doesn't seem to do this.

Here's an example file:

@unpublished{caufield_structured_2023,
    title = {Structured prompt interrogation and recursive extraction of semantics ({SPIRES}): {A} method for populating knowledge bases using zero-shot learning},
    url = {http://arxiv.org/abs/2304.02711},
    abstract = {Creating knowledge bases and ontologies is a time consuming task that
relies on a manual curation. AI/NLP approaches can assist expert curators
in populating these knowledge bases, but current approaches rely on
extensive training data, and are not able to populate arbitrary complex
nested knowledge schemas. Here we present Structured Prompt Interrogation
and Recursive Extraction of Semantics (SPIRES), a Knowledge Extraction
approach that relies on the ability of Large Language Models (LLMs) to
perform zero-shot learning (ZSL) and general-purpose query answering from
flexible prompts and return information conforming to a specified schema.
Given a detailed, user-defined knowledge schema and an input text, SPIRES
recursively performs prompt interrogation against GPT-3+ to obtain a set
of responses matching the provided schema. SPIRES uses existing ontologies
and vocabularies to provide identifiers for all matched elements. We
present examples of use of SPIRES in different domains, including
extraction of food recipes, multi-species cellular signaling pathways,
disease treatments, multi-step drug mechanisms, and chemical to disease
causation graphs. Current SPIRES accuracy is comparable to the mid-range
of existing Relation Extraction (RE) methods, but has the advantage of
easy customization, flexibility, and, crucially, the ability to perform
new tasks in the absence of any training data. This method supports a
general strategy of leveraging the language interpreting capabilities of
LLMs to assemble knowledge bases, assisting manual knowledge curation and
acquisition while supporting validation with publicly-available databases
and ontologies external to the LLM. SPIRES is available as part of the
open source OntoGPT package: https://github.com/monarch-initiative/ontogpt.},
    author = {Caufield, J Harry and Hegde, Harshad and Emonet, Vincent and Harris, Nomi L and Joachimiak, Marcin P and Matentzoglu, Nicolas and Kim, Hyeongsik and Moxon, Sierra A T and Reese, Justin T and Haendel, Melissa A and Robinson, Peter N and Mungall, Christopher J},
    month = apr,
    year = {2023},
    note = {ISBN: 2304.02711
Publication Title: arXiv [cs.AI]},
}
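
For illustration, the mapping requested here can be sketched in a few lines of Python. This is a minimal sketch, not betterbib's implementation: it assumes the DataCite scheme under which arXiv registers DOIs of the form `10.48550/arXiv.<id>`, it handles only new-style identifiers such as `2304.02711`, and the function name `arxiv_url_to_doi` is hypothetical.

```python
import re
from typing import Optional

# Assumption: arXiv DOIs follow the pattern 10.48550/arXiv.<id>.
ARXIV_DOI_PREFIX = "10.48550/arXiv."

# Matches new-style arXiv identifiers (YYMM.NNNN or YYMM.NNNNN) in abs/pdf URLs.
# An optional version suffix such as "v2" is matched but not captured,
# because the DOI refers to the article as a whole, not to one version.
_ARXIV_URL = re.compile(
    r"arxiv\.org/(?:abs|pdf)/(?P<id>\d{4}\.\d{4,5})(?:v\d+)?",
    re.IGNORECASE,
)


def arxiv_url_to_doi(url: str) -> Optional[str]:
    """Return the DOI for an arXiv URL, or None if no new-style ID is found."""
    match = _ARXIV_URL.search(url)
    if match is None:
        return None
    return ARXIV_DOI_PREFIX + match.group("id")


if __name__ == "__main__":
    # URL taken from the example entry above.
    print(arxiv_url_to_doi("http://arxiv.org/abs/2304.02711"))
```

Applied to the `url` field of the entry above, this would yield `10.48550/arXiv.2304.02711`. Old-style identifiers (for example `hep-th/9901001`) are deliberately left out of this sketch.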
nschloe commented 9 months ago

Betterbib 7 can also fetch data from arxiv, so let's consider this fixed.