allenai / s2-folks

Public space for the user community of Semantic Scholar APIs to share scripts, report issues, and make suggestions.

Q: I want to do a multiple-term search via the API and get only the papers that are related to the compound term as a whole. How can I do it? #127

Closed joseph-sclar closed 9 months ago

joseph-sclar commented 11 months ago

Say, for example, I search for "quantum machine learning" (inside double quotes) in the webapp: I only get ~3,500 results. If I use the API and pass query="quantum machine learning", I get over 600,000 results. It returns all the results that contain quantum, machine, and learning as separate words, similar to the results of searching for quantum machine learning (without quotes) in the webapp.

How can I write the query in the API so that I only get the ~3,500 results that correspond to the compound term as a whole?

I have tried different quote combinations, and nothing has worked so far (""", "', '", etc.).

Thanks!

cfiorelli commented 10 months ago

@joseph-sclar Thank you for your question! I believe the fix may be to check that the URL is encoded properly, with the double quotes included in the encoded query. I have tested the following code, and the results seem to be narrowed as expected. Let us know how this works for you.

import requests
from urllib.parse import quote

# Define the base URL and API key
BASE_URL = "https://api.semanticscholar.org/graph/v1/paper/search"
S2_API_KEY = "YOUR API KEY HERE"

# Encode the query
query = quote('"quantum machine learning"')

# Make the API request
response = requests.get(
    f"{BASE_URL}?query={query}&limit=100",
    headers={"x-api-key": S2_API_KEY}
)

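# Fail loudly on HTTP errors (e.g. an invalid API key or rate limiting)
response.raise_for_status()

# Print results
data = response.json()
print(data)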
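
To check the encoding without calling the API, the URL construction above can be isolated into a small helper. This is only a sketch: `build_phrase_search_url` is a hypothetical name introduced here for illustration, not part of the Semantic Scholar API or of the code above. It wraps the phrase in double quotes exactly once and percent-encodes it, so the quotes reach the server as `%22`.

```python
from urllib.parse import quote, urlencode

# Same endpoint as in the snippet above.
BASE_URL = "https://api.semanticscholar.org/graph/v1/paper/search"


def build_phrase_search_url(phrase: str, limit: int = 100) -> str:
    """Return a search URL whose query is the phrase wrapped in double quotes.

    Hypothetical helper for illustration; it performs no network request.
    """
    # Strip any quotes the caller already added, then wrap exactly once.
    quoted_phrase = '"' + phrase.strip().strip('"') + '"'
    # quote_via=quote encodes spaces as %20; the default (quote_plus) would use '+'.
    params = urlencode({"query": quoted_phrase, "limit": limit}, quote_via=quote)
    return f"{BASE_URL}?{params}"


url = build_phrase_search_url("quantum machine learning")
print(url)
```

Printing the URL makes it easy to confirm that the double quotes are present in encoded form before sending the request. The resulting string can be passed to `requests.get` with the same `x-api-key` header as above.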