innovation-growth-lab / alphafold-impact

Define methodology to collect AF relevant patents #19

Closed georgerichardson closed 5 months ago

ampudia19 commented 5 months ago

The following are parts of the project for which we plan to use patents:

This is crucial because, for most analyses, our starting point is scholarly articles. Unless we are willing to collect all patents first and then match them many-to-many to articles, we need to rely on The Lens.
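For illustration, here is a minimal sketch of what the many-to-many match would involve if we did collect patents first. It assumes each patent record already carries the DOIs of the articles it cites. The function name, the input shapes, and the toy DOIs are all hypothetical and do not reflect any database's actual schema.

```python
from collections import defaultdict


def match_patents_to_articles(patents, article_dois):
    """Many-to-many match between patents and articles via cited DOIs.

    patents: dict mapping a patent ID to the list of DOIs it cites
             (hypothetical input shape).
    article_dois: iterable of DOIs for the articles we care about.
    Returns a dict mapping each matched article DOI (lower-cased) to a
    sorted list of the patent IDs that cite it.
    """
    # DOIs are case-insensitive, so normalise before comparing.
    wanted = {doi.strip().lower() for doi in article_dois}
    article_to_patents = defaultdict(set)
    for patent_id, cited_dois in patents.items():
        for doi in cited_dois:
            key = doi.strip().lower()
            if key in wanted:
                article_to_patents[key].add(patent_id)
    return {doi: sorted(ids) for doi, ids in article_to_patents.items()}


# Toy data, invented for illustration only.
patents = {
    "P1": ["10.1000/a", "10.1000/b"],  # one patent citing two articles
    "P2": ["10.1000/B"],               # same article as above, different case
    "P3": ["10.1000/z"],               # cites nothing in our article set
}
articles = ["10.1000/a", "10.1000/b", "10.1000/c"]

links = match_patents_to_articles(patents, articles)
```

The cost of this route is that the `patents` input has to cover every patent that might cite any of our articles, which is the full collection step that querying by article avoids.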

Alternatives to The Lens

ampudia19 commented 5 months ago

@georgerichardson can I get your quick input on whether the recommended approach above is sufficiently solid? I can write the API request pipelines for recommendation 1. later this week. I have already downloaded papers that cite AF, papers that mention AF, and papers that cite a paper that cites AF. The focus of the work here would be to define which patents are relevant insofar as they serve as counterfactuals or relate to other aspects of biology. Thoughts?

georgerichardson commented 5 months ago

Thanks for writing this up. I certainly can't think of a better approach. A couple of thoughts I do have are:

  1. Can the approach be tested with a very small subset (e.g. by choosing a tiny subfield much smaller than biology)?
  2. Similarly, what's the smallest amount of data we might need to do the comparison with the Marx data?

Consider both of those within the timeframe available, though. If the window we have is small, then we are probably best off just going straight for the biggest collection possible.

A poorer alternative could be to use one of the other databases you have mentioned and simply try to identify only direct citations of the AF paper. This would obviously limit our analysis and our opportunities for modelling significantly, but it could still be useful depending on how we get on with The Lens and on the results of the analysis prioritisation exercise.
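To make the trade-off concrete, here is a rough sketch of what a direct-citations-only filter keeps and what it drops. It assumes we have, for each patent, the list of DOIs it cites. The function name, the input shapes, and the placeholder DOIs (including the one standing in for the AF paper) are hypothetical.

```python
def split_direct_and_indirect(patents, af_doi, af_citing_dois):
    """Separate patents citing the AF paper from those one step removed.

    patents: dict mapping a patent ID to the list of DOIs it cites
             (hypothetical input shape).
    af_doi: DOI of the AF paper (a placeholder is used below).
    af_citing_dois: DOIs of papers that themselves cite the AF paper.
    Returns (direct, indirect_only) as two sorted lists of patent IDs.
    """
    af = af_doi.strip().lower()
    second_order = {d.strip().lower() for d in af_citing_dois}
    direct, indirect_only = [], []
    for patent_id, cited in patents.items():
        cited_norm = {d.strip().lower() for d in cited}
        if af in cited_norm:
            direct.append(patent_id)
        elif cited_norm & second_order:
            # Cites a paper that cites AF, but not AF itself.
            indirect_only.append(patent_id)
    return sorted(direct), sorted(indirect_only)


# Toy data, invented for illustration only.
patents = {
    "P1": ["10.1000/af"],  # cites the AF placeholder directly
    "P2": ["10.1000/x"],   # cites a paper that cites AF
    "P3": ["10.1000/y"],   # unrelated
}
direct, indirect_only = split_direct_and_indirect(
    patents, af_doi="10.1000/af", af_citing_dois=["10.1000/x"]
)
```

Under the direct-only alternative, everything in `indirect_only` is lost, along with any counterfactual patents that have no link to AF at all.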