cambialens / lens-api-doc

Patent API: Mismatch between results from the API and the website #35

Closed m0bi5 closed 3 years ago

m0bi5 commented 3 years ago

I am currently using the Patent API to search for patents with applicant.name = "Virginia Polytechnic Institute And State Univ". The equivalent search on the website returns 41 results (Link to search); however, the API returns at least 100 results.

import requests

# `key` must already hold your Lens API access token.
request = {
    # Terms to search for
    'query': {
        'match': {
            'applicant.name': 'Virginia Polytechnic Institute And State Univ'
        }
    },
    # Maximum number of records returned in this response
    'size': 100
}
headers = {
    'Authorization': f'Bearer {key}',
    'Content-Type': 'application/json'
}
response = requests.post(
    url='https://api.lens.org/patent/search',
    headers=headers,
    json=request
)
print(len(response.json()['result']))  # Prints 100, but 41 is expected

I have also tried applicant.name.keyword='Virginia Polytechnic Institute And State Univ' but that returns 0 results.

rosharma9 commented 3 years ago

@m0bi5 You can use a match_phrase query to search for the exact phrase.

{
    "query": {
        "match_phrase": {
            "applicant.name": "Virginia Polytechnic Institute And State Univ"
        }
    }
}
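
To illustrate why the two query types can return different counts, here is a minimal toy sketch. It assumes the API follows the usual Elasticsearch-style semantics, where `match` accepts a record if any query term matches and `match_phrase` requires the terms to appear together and in order. The tokenizer, the helper functions `match_any` and `match_phrase`, and the applicant names are all hypothetical and made up for illustration; this is not how the Lens service is implemented.

```python
import re


def tokenize(text):
    """Lowercase and split on non-alphanumerics (a stand-in for a text analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match_any(query, value):
    """Toy `match`: true if the value shares at least one token with the query."""
    return bool(set(tokenize(query)) & set(tokenize(value)))


def match_phrase(query, value):
    """Toy `match_phrase`: true if the query tokens appear contiguously, in order."""
    q, v = tokenize(query), tokenize(value)
    return any(v[i:i + len(q)] == q for i in range(len(v) - len(q) + 1))


# Hypothetical applicant names, invented for this illustration only.
applicants = [
    "Virginia Polytechnic Institute And State Univ",
    "Univ Virginia",
    "Ohio State Univ",
    "Massachusetts Institute Of Technology",
    "Acme Corp",
]

query = "Virginia Polytechnic Institute And State Univ"
loose = [a for a in applicants if match_any(query, a)]
exact = [a for a in applicants if match_phrase(query, a)]
print(len(loose), len(exact))  # 4 1
```

In this toy data, the loose match also picks up names that merely share a word such as "State", "Univ", or "Institute", which is the kind of over-matching that would inflate the count from a plain `match` query.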

m0bi5 commented 3 years ago

@rosharma9 Thanks, that seems to be better. However, there is still a mismatch of around 5-10 records compared with the results from the website. Any idea why that may be the case?

rosharma9 commented 3 years ago

@m0bi5 The link you shared gives the same number of results (46) as the API.

But for some cases, there might be some difference in the results. Please refer to this reply for details.

m0bi5 commented 3 years ago

Okay, thanks. The linked issue was helpful!