rmovva closed this issue 1 year ago
@rmovva Thanks for reaching out! Take a look at the API key header here, and let us know if you're still running into trouble.
import requests
import tqdm

# List of arXiv IDs
arxiv_ids = ["arXiv:1703.10593", "arXiv:2001.01489"]

# Define your API key from Semantic Scholar
S2_API_KEY = 'YOUR KEY HERE'

# Split the arxiv_ids into chunks of 500 due to the API limitation
def chunks(lst, n):
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

BASE_URL = 'https://api.semanticscholar.org/graph/v1/paper/batch'
HEADERS = {
    'x-api-key': S2_API_KEY,
    'Content-Type': 'application/json'
}

# Loop through chunks of IDs and send batch requests
for chunk in tqdm.tqdm(list(chunks(arxiv_ids, 500))):
    response = requests.post(
        BASE_URL,
        headers=HEADERS,
        params={'fields': 'referenceCount,citationCount,title'},
        json={"ids": chunk}
    )
    if response.status_code == 200:
        papers = response.json()  # Assuming the response directly gives a list of papers
        for paper in papers:
            if paper is None:
                # Assumption: IDs the API cannot resolve come back as null, so skip them
                continue
            # Do something with each paper's details here
            print(paper['title'], paper['referenceCount'])
    else:
        print(f"Failed to fetch details for chunk: {chunk}")
        print("Status Code:", response.status_code)
        print("Response:", response.text)
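When fetching many IDs, it can help to keep track of which input ID produced which record. Here is a minimal sketch, assuming the batch endpoint returns one entry per requested ID, in the same order, with null for IDs it could not resolve (worth checking against the API docs). `pair_ids_with_papers` is a hypothetical helper, not part of any library, and the sample response is hand-written rather than real API output.

```python
def pair_ids_with_papers(ids, papers):
    """Split a batch response into found papers and missing IDs.

    Assumes `papers` is aligned with `ids` (same length, same order) and
    contains None for IDs that could not be resolved.
    """
    if len(ids) != len(papers):
        raise ValueError("response length does not match the number of IDs sent")
    found = {}
    missing = []
    for paper_id, paper in zip(ids, papers):
        if paper is None:
            missing.append(paper_id)
        else:
            found[paper_id] = paper
    return found, missing


# Hand-written stand-in for a batch response (not real API output)
example_ids = ["arXiv:1703.10593", "arXiv:0000.00000"]
example_response = [{"title": "Some title", "referenceCount": 10}, None]
found, missing = pair_ids_with_papers(example_ids, example_response)
print(sorted(found), missing)
```

Inside the loop, this could be called as `pair_ids_with_papers(chunk, response.json())`, and the `missing` list retried or logged separately.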
This worked (and ran very quickly), thanks!
Is there documentation somewhere on what attributes can be retrieved using the paper batch API? For example, I notice you have 'referenceCount,citationCount,title' here, but the S2Paper object has many other attributes: https://pys2.readthedocs.io/en/latest/api_reference/models/s2paper.html
However, attributes like citationVelocity don't seem to be available through these batch API calls -- do you know how I can get these other attributes / exactly which ones are available?
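Until someone points to the field documentation, one rough way to find out which fields the batch endpoint accepts is to request them one at a time for a single known ID and look at the status code. This is only a sketch: it assumes an unsupported field produces a non-200 response, and both `probe_fields` and `make_batch_post` are hypothetical helpers. The `fake_post` function stands in for the API so the logic can be tried offline, and its statuses are made up.

```python
def probe_fields(candidates, post):
    """Return (accepted, rejected) field names.

    `post` is any callable that takes a fields string and returns an HTTP
    status code; it is injected so the probing logic can be tried offline.
    """
    accepted, rejected = [], []
    for field in candidates:
        status = post(field)
        (accepted if status == 200 else rejected).append(field)
    return accepted, rejected


def make_batch_post(api_key, sample_id="arXiv:1703.10593"):
    """Build a `post` callable that queries the batch endpoint for one ID."""
    import requests  # imported here so the offline example needs no dependency

    def post(fields):
        response = requests.post(
            "https://api.semanticscholar.org/graph/v1/paper/batch",
            headers={"x-api-key": api_key},
            params={"fields": fields},
            json={"ids": [sample_id]},
            timeout=30,
        )
        return response.status_code

    return post


# Offline illustration with a stand-in for the API (statuses are made up)
def fake_post(fields):
    return 200 if fields in {"title", "citationCount"} else 400


print(probe_fields(["title", "citationCount", "citationVelocity"], fake_post))
```

With a real key, `probe_fields(candidates, make_batch_post(S2_API_KEY))` would run the same check against the live endpoint. Note that a rate-limit response would also be counted as rejected, so it is worth inspecting the rejected fields by hand.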
Describe the bug: I am using s2.api.get_paper to retrieve paper info for ~500K arXiv IDs. I received an API key earlier today, but when I pass in the key as an argument with api_key=S2_API_KEY, I am not able to retrieve papers at my assigned rate limit of 100 requests/second (it seems like I am still at the default public request rate).
To Reproduce: e.g.
Expected behavior: I expected to retrieve papers at ~100/s, but instead it's more like ~1/s.
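To put the gap in perspective, here is a back-of-the-envelope estimate of how long ~500K IDs take at the rates mentioned in this thread. It is a sketch that assumes a perfectly steady request rate and ignores latency and retries; `estimated_seconds` is a hypothetical helper, and the batch size of 500 is taken from the batch snippet in this thread.

```python
def estimated_seconds(n_items, requests_per_second, items_per_request=1):
    """Seconds needed to fetch n_items at a steady request rate."""
    n_requests = -(-n_items // items_per_request)  # ceiling division
    return n_requests / requests_per_second


n_ids = 500_000
print(estimated_seconds(n_ids, 1) / 86_400)   # days at ~1 request/s, one ID per request
print(estimated_seconds(n_ids, 100) / 60)     # minutes at 100 requests/s, one ID per request
print(estimated_seconds(n_ids, 1, 500) / 60)  # minutes at ~1 request/s, 500 IDs per request
```

Under these assumptions the three cases come to roughly 5.8 days, 83 minutes, and 17 minutes, which is why batching matters even at the slower rate.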