nestauk / dap_aria_mapping

Mapping technology innovation to support The Advanced Research and Innovation Agency (ARIA)
MIT License

Investigate if we can collect patent citations #39

Open india-kerle opened 1 year ago

india-kerle commented 1 year ago

to create CD index

georgerichardson commented 1 year ago
image

Hopefully this makes sense. What we have collected already are the 'focal patents'. We need the IDs of the patents that our focal patents cite. We also need the IDs of the patents which cite our focal patents and their references.

To be consistent with OpenAlex:

  1. Back citations: Collect citation data for the focal patents and save as a json like
{
    focal_patent_id_0: [cited_patent_id_0, cited_patent_id_1, ...],
    focal_patent_id_1: [cited_patent_id_1, cited_patent_id_6, ...],
}
  2. Forward citations: For each focal patent, find the patents that cite it. For each of these citing patents, collect their citation data in the same format as above and save it separately as forward citations. This might yield a lot of patents; for OpenAlex I had to save the outputs by year.
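
A minimal sketch of the two steps above, in case it helps pin down the output format. It assumes the raw citation data is available as (citing_id, cited_id) pairs, which is not how the source necessarily delivers it. The function names, the `year_of` lookup and the file naming are all hypothetical and not taken from the repo.

```python
import json
from collections import defaultdict
from pathlib import Path


def build_back_citations(focal_ids, citation_pairs):
    """Map each focal patent to the sorted list of patents it cites."""
    focal = set(focal_ids)
    back = defaultdict(set)
    for citing, cited in citation_pairs:
        if citing in focal:
            back[citing].add(cited)
    # Focal patents with no recorded citations get an empty list.
    return {pid: sorted(back.get(pid, ())) for pid in sorted(focal)}


def build_forward_citations(focal_ids, citation_pairs):
    """Map each patent that cites a focal patent to everything it cites."""
    focal = set(focal_ids)
    pairs = list(citation_pairs)
    # First pass: which patents cite at least one focal patent?
    citers = {citing for citing, cited in pairs if cited in focal}
    # Second pass: collect the full reference list of each citer.
    forward = defaultdict(set)
    for citing, cited in pairs:
        if citing in citers:
            forward[citing].add(cited)
    return {pid: sorted(refs) for pid, refs in sorted(forward.items())}


def save_by_year(citations, year_of, out_dir, prefix):
    """Write one JSON file per year (hypothetical naming: <prefix>_<year>.json)."""
    by_year = defaultdict(dict)
    for pid, refs in citations.items():
        by_year[year_of.get(pid, "unknown")][pid] = refs
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for year, chunk in by_year.items():
        path = out_dir / f"{prefix}_{year}.json"
        path.write_text(json.dumps(chunk, indent=2))
        paths.append(path)
    return sorted(paths)


# Toy data: P1 and P2 are focal patents; C and D cite them; E does not.
focal_ids = ["P1", "P2"]
pairs = [
    ("P1", "A"), ("P1", "B"), ("P2", "B"),
    ("C", "P1"), ("C", "A"),
    ("D", "P2"), ("D", "X"),
    ("E", "A"),
]

back = build_back_citations(focal_ids, pairs)
# {"P1": ["A", "B"], "P2": ["B"]}
forward = build_forward_citations(focal_ids, pairs)
# {"C": ["A", "P1"], "D": ["P2", "X"]}
```

Calling `save_by_year(forward, {"C": 2019, "D": 2020}, "outputs", "forward")` would then write one file per year, which keeps each file small if the forward set gets large.
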
india-kerle commented 1 year ago

notes from chat w/ george to threshold citations:

Threshold: