ropensci / patentsview

An R client to the PatentsView API
https://docs.ropensci.org/patentsview

Limit on number of returned patents #25

Closed ppanko closed 2 years ago

ppanko commented 2 years ago

It seems the maximum number of patents returned for a query is 100k. Is there any way around this for larger queries?

e.g.,

library(patentsview)

## Query result for page 10 w/ 10k patents per page
pvObj <- search_pv(
  query    = '{"_gte":{"patent_date":"2007-01-01"}}',
  per_page = 10000,
  page     = 10
)

nrow(pvObj$data$patents)
# 10000

## Query result for page 11
pvObj <- search_pv(
  query    = '{"_gte":{"patent_date":"2007-01-01"}}',
  per_page = 10000,
  page     = 11
)

nrow(pvObj$data$patents)
# NULL

crew102 commented 2 years ago

Not that I know of. You probably have two options. The first is to split your query into pieces by looping over dates (e.g., in the example you gave, loop over all months between 2007 and today, have each query fetch only the patents from that month, then bind the results together). The second is to download the raw data from their bulk download page: https://patentsview.org/download/data-download-tables.
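
A rough sketch of the first option, in case it helps. `month_queries()` and `fetch_all()` are hypothetical helper names, not part of patentsview. The sketch assumes the start date is the first day of a month, that each month returns fewer patents than the cap, and that the `_and` / `_lt` operators and the `all_pages` argument of `search_pv()` behave as in the API and package version discussed in this thread. The last window always runs to the end of the final month.

```r
# Hypothetical helper: build one query string per calendar month.
# Each window is half-open: [first of month, first of next month).
month_queries <- function(start, end) {
  starts <- seq(as.Date(start), as.Date(end), by = "month")
  last   <- starts[length(starts)]
  # Upper bound of each window is the start of the following month
  nexts  <- c(starts[-1], seq(last, by = "month", length.out = 2)[2])
  sprintf(
    '{"_and":[{"_gte":{"patent_date":"%s"}},{"_lt":{"patent_date":"%s"}}]}',
    format(starts), format(nexts)
  )
}

# Hypothetical helper: run every monthly query and bind the results.
# Assumes each month stays under the per-query cap and that all
# results share the same columns so rbind() can combine them.
fetch_all <- function(queries, per_page = 10000) {
  pieces <- lapply(queries, function(q) {
    res <- patentsview::search_pv(
      query     = q,
      per_page  = per_page,
      all_pages = TRUE
    )
    res$data$patents
  })
  do.call(rbind, pieces)
}
```

Something like `patents <- fetch_all(month_queries("2007-01-01", Sys.Date()))` would then issue one request series per month. If a single month ever exceeds the cap, the same idea could be applied with narrower windows.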

ppanko commented 2 years ago

Got it -- appreciate the swift response!