The CrossRef API seems to already provide lists of cited works for a scientific article, see e.g. this example search.
It seems much easier to make use of this already existing information before trying to extract data from a PDF, which is a very cumbersome task.
PDF extraction can remain as a fallback option. However, one would have to verify that PDF scraping yields correct results.
Proposal
Use the CrossRef API to search for the article. This can be done either by DOI or by author and title.
Then, extract the cited works and their DOIs.
Rework the bibtex fields that hold citation information:
Use a referenced-dois key for the referenced DOIs
Find another name to use for DOIs that this work is cited by
Continue using the cites and cited-by fields for cite keys, i.e. entries within the library
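The steps above could be sketched roughly as follows. This is only a minimal sketch, not a finished implementation: the function names, the comma-separated field format, the lowercasing of DOIs and the doi_to_citekey mapping (DOI to cite key of an entry already in the library) are all assumptions made for illustration. It relies on the CrossRef REST endpoint https://api.crossref.org/works and on the "reference" list that CrossRef includes in a work record when the publisher has deposited it; entries without a "DOI" key are skipped here and would be candidates for the PDF fallback.

```python
import json
import urllib.parse
import urllib.request

CROSSREF_WORKS = "https://api.crossref.org/works"


def works_url(doi=None, author=None, title=None):
    """Build a CrossRef URL: direct lookup by DOI, else a search by author and title."""
    if doi:
        return f"{CROSSREF_WORKS}/{urllib.parse.quote(doi, safe='/')}"
    params = {"rows": 1}  # only the best match is needed
    if author:
        params["query.author"] = author
    if title:
        params["query.bibliographic"] = title
    return f"{CROSSREF_WORKS}?{urllib.parse.urlencode(params)}"


def fetch_work(url):
    """Fetch one work record (needs network access; raises if a search has no hits)."""
    with urllib.request.urlopen(url) as response:
        message = json.load(response)["message"]
    # Searches wrap hits in "items"; a DOI lookup returns the work directly.
    return message["items"][0] if "items" in message else message


def referenced_dois(work):
    """Collect DOIs from the "reference" list, skipping entries without a DOI."""
    return [ref["DOI"].lower() for ref in work.get("reference", []) if ref.get("DOI")]


def citation_fields(work, doi_to_citekey):
    """Map a work record to the proposed fields (field format is an assumption)."""
    dois = referenced_dois(work)
    # Only referenced works that already exist in the library get a cite key.
    keys = [doi_to_citekey[d] for d in dois if d in doi_to_citekey]
    return {"referenced-dois": ", ".join(dois), "cites": ", ".join(keys)}
```

A call such as citation_fields(fetch_work(works_url(doi=...)), doi_to_citekey) would then fill both fields for one entry. As far as I can tell, the work record only carries a count of citing works (is-referenced-by-count), not their DOIs, so the field for DOIs that cite this work would need a different source.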