@georgerichardson can I get your quick input on whether the recommended approach above is a sufficiently solid idea? I can write the API request pipelines for recommendation 1 later this week. I have already downloaded papers that cite AF, papers that mention AF, and papers that cite a paper that cites AF. The focus of the work here would be to define patents that are relevant insofar as they are counterfactuals or relate to other aspects of biology. Thoughts?
Thanks for writing this up. I certainly can't think of a better approach. A couple of thoughts I do have are:
Consider both of those within the timeframe available though. If the window we have is small, then we are probably best off just going straight for the biggest collection possible.
A poorer alternative could be to use one of the other databases you have mentioned, and simply try to identify only direct citations of the AF paper. This would obviously limit our analysis and our opportunities for modelling significantly, but it could still be useful depending on success with The Lens and the results from the analysis prioritisation exercise.
The following are parts of the project for which we plan to use patents:
The Lens advantages
Other databases with data on NPL (non-patent literature) citations have patents indexed, but their citations are often not linked to academic papers. For example, in patent WO2020176389A1, there is a citation to a book chapter using abbreviations: NAFISSI ET AL., APPL. MICROBIOL. BIOTECH., vol. 98, 2014, pages 2841 - 2851. If a database identifies this as an NPL citation (for example, Google Patents or the EPO does), it is not linked to any academic material. Only The Lens has actual matches to the scholarly content. This is crucial because for most analyses our starting point is scholarly articles, so unless we are willing to collect all patents first and many-to-many (M to M) match them to articles, we need to rely on The Lens.
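To make the cost of doing that matching ourselves more concrete, here is a rough sketch of what linking one unlinked NPL string to an article record could look like. It is only an illustration: `parse_npl`, `match_score`, `best_match`, the record keys and the `article-A` / `article-B` ids are all hypothetical, and the article records are made up from the citation string above rather than taken from any database.

```python
import re

# The NPL citation quoted above, as it appears in WO2020176389A1.
CITATION = "NAFISSI ET AL., APPL. MICROBIOL. BIOTECH., vol. 98, 2014, pages 2841 - 2851"

# Hypothetical article records (assumption: we already hold metadata like this).
ARTICLES = [
    {"id": "article-A", "first_author": "nafissi", "year": 2014,
     "volume": "98", "first_page": "2841"},
    {"id": "article-B", "first_author": "nafissi", "year": 2015,
     "volume": "99", "first_page": "101"},
]

FIELDS = ("first_author", "year", "volume", "first_page")


def parse_npl(citation):
    """Pull rough bibliographic fields out of an abbreviated NPL string."""
    author = re.match(r"\s*([A-Z][A-Za-z'\-]+)", citation)
    year = re.search(r"\b(?:19|20)\d{2}\b", citation)
    volume = re.search(r"vol\.\s*(\d+)", citation, flags=re.IGNORECASE)
    pages = re.search(r"pages?\s*(\d+)\s*-\s*(\d+)", citation, flags=re.IGNORECASE)
    return {
        "first_author": author.group(1).lower() if author else None,
        "year": int(year.group(0)) if year else None,
        "volume": volume.group(1) if volume else None,
        "first_page": pages.group(1) if pages else None,
    }


def match_score(parsed, article):
    """Fraction of the parsed fields that agree with the article record."""
    keys = [k for k in FIELDS if parsed[k] is not None]
    if not keys:
        return 0.0
    hits = sum(str(parsed[k]) == str(article.get(k)) for k in keys)
    return hits / len(keys)


def best_match(citation, articles, threshold=0.75):
    """Return the id of the best-scoring article, or None if nothing is close."""
    if not articles:
        return None
    parsed = parse_npl(citation)
    score, article_id = max((match_score(parsed, a), a["id"]) for a in articles)
    return article_id if score >= threshold else None


print(parse_npl(CITATION))
print(best_match(CITATION, ARTICLES))
```

Even this toy version has to guess at author, year, volume and page formats, and it would need to run for every patent citation against every candidate article, which is the work The Lens has already done for us.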
Alternatives to The Lens
doi value, or other external identifiers. GlobalDossier is extremely unreliable, and has blocked me.
Notes
Recommendation
Field of Study subfields relevant to us (i.e. Biology) as well as relevant MeSH fields. This collection should focus on any patent applications or patent publications in 2023, and target data should be research articles citing these.