I am trying to pull the results of a search that returns ~23k records in the browser, page through them with cursor-based pagination, and build a pandas DataFrame. For some reason, my query keeps returning a Bad Request error. Could you help me with that?
import time

import pandas as pd
import requests

url = 'https://api.lens.org/patent/search'
token = 'my-token'  # placeholder for the real access token
data = '''{
    "query": {
        "bool": {
            "should": [
                {"match": {"inventor.name": "roche"}},
                {"match": {"owner_all.name": "roche"}}
            ]
        }
    },
    "size": 1000,
    "scroll": "1m"
}'''
headers = {'Authorization': token, 'Content-Type': 'application/json'}


def scroll(scroll_id):
    # After the first call, send only the scroll id to fetch the next page
    if scroll_id is not None:
        global data
        data = '''{"scroll_id": "%s"}''' % scroll_id
    response = requests.post(url, data=data, headers=headers)
    # Check for rate limiting first; otherwise the generic error branch catches it
    if response.status_code == requests.codes.too_many_requests:
        time.sleep(8)
        scroll(scroll_id)
    elif response.status_code != requests.codes.ok:
        print(response.status_code)
    else:
        response_json = response.json()
        print(response_json["scroll_id"])
        scroll_id = response_json["scroll_id"]
        df = pd.read_json(response.text)
        print(df.shape)
        scroll(scroll_id)


scroll(scroll_id=None)
I believe you should be getting a 400 response with the body "Parameter 'size' shouldn't be greater than 100."
The current Trial API allows 100 records per request, and you are requesting 1000.
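For what it's worth, here is a rough sketch of how the scroll loop could look with the page size kept at 100 and the recursion replaced by a plain loop. It is only an illustration under a few assumptions: `fetch_all` and its `post` argument are hypothetical names (`post` is any callable that takes the JSON body and returns a response with `status_code` and `json()`), and the records are assumed to arrive under a `data` key next to `scroll_id`, which you should check against the actual response.

```python
import json
import time


def fetch_all(post, query, page_size=100, max_retries=5, wait_seconds=8):
    """Collect every record of a scrolled search into one list.

    `post` is a hypothetical callable: it takes a JSON string and returns
    a response object exposing `status_code` and `json()`.
    """
    # The first request carries the query; size stays within the 100 limit.
    body = json.dumps({"query": query, "size": page_size, "scroll": "1m"})
    records = []
    retries = 0
    while True:
        response = post(body)
        if response.status_code == 429:
            # Rate limited: wait, then resend the same body.
            retries += 1
            if retries > max_retries:
                raise RuntimeError("still rate limited after %d retries" % max_retries)
            time.sleep(wait_seconds)
            continue
        if response.status_code != 200:
            raise RuntimeError("request failed with status %d" % response.status_code)
        retries = 0
        payload = response.json()
        page = payload.get("data") or []  # assumed field name for the records
        if not page:
            break  # an empty page is treated as the end of the scroll
        records.extend(page)
        scroll_id = payload.get("scroll_id")
        if scroll_id is None:
            break
        # Follow-up requests carry only the scroll id, as in the question.
        body = json.dumps({"scroll_id": scroll_id})
    return records
```

With `requests`, `post` could be something like `lambda body: requests.post(url, data=body, headers=headers)`, and `pd.json_normalize(records)` would then turn the collected list into a single dataframe instead of one dataframe per page.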