allenai / s2-folks

Public space for the user community of Semantic Scholar APIs to share scripts, report issues, and make suggestions.

Q: Inconsistent Search Results Count in Semantic Scholar API #112

Closed wyplogin closed 9 months ago

wyplogin commented 1 year ago

Dear Semantic Scholar API Support Team,

I hope this email finds you well. I have been using the Semantic Scholar API to search for papers related to the keyword "myxococcus". During my search, I encountered an issue with the number of results returned by the API.

According to the API response, 5132 relevant papers match my query. However, after fetching the first 1000 results (10 requests with a limit of 100 results per request), the reported total suddenly dropped to 2691 in the 11th request.

I am concerned that there might be a limitation on the number of search results that the API returns, which is causing this discrepancy. I would appreciate it if you could provide some guidance on the following questions:

1. Is there a limitation on the number of search results that the API returns for a specific query?
2. If so, what is the reason behind this limitation, and is there a way to increase the limit or access the complete set of search results?
3. Are there any best practices or alternative methods that I should follow to ensure I can retrieve all the relevant papers for my query?

Your assistance in this matter would be greatly appreciated, as I would like to make the most out of the Semantic Scholar API for my research.

Thank you for your time and support.

Best regards,

Yipeng W

related code:

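    import json

    # `session` (presumably a requests.Session), `params`, `headers`, and
    # `request_number` are defined elsewhere in the script (not shown here).
    with session.get('https://api.semanticscholar.org/graph/v1/paper/search',
                     params=params,
                     headers=headers) as response:
        response.raise_for_status()
        data = response.json()
    # Save the parsed response for this request to its own JSON file
    with open(f'data_{request_number}.json', 'w') as json_file:
        json.dump(data, json_file)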

in data_10.json: {"total": 5132, "offset": 900, "next": 1000, "data"

in data_11.json: {"total": 2691, "offset": 1000, "next": 1100, "data"
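A minimal sketch of how the change in `total` could be checked for programmatically instead of by inspecting the saved JSON files. It pages through the search endpoint from the snippet above by following the `next` offset and records every page whose `total` differs from the previous page's. The helper names (`iter_pages`, `find_total_changes`, `fetch_from_api`) are hypothetical, and the `query`, `offset`, and `limit` request parameters are assumptions based on the request described in this issue, since the original `params` dict is not shown.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"


def iter_pages(fetch_page: Callable[[int, int], dict],
               limit: int = 100,
               max_pages: int = 20) -> Iterator[Tuple[int, dict]]:
    """Yield (offset, page) pairs, following each page's 'next' offset."""
    offset = 0
    for _ in range(max_pages):
        page = fetch_page(offset, limit)
        yield offset, page
        next_offset = page.get("next")
        if next_offset is None:  # no further page advertised
            break
        offset = next_offset


def find_total_changes(pages: Iterable[Tuple[int, dict]]) -> List[Tuple[int, int, int]]:
    """Return (offset, previous_total, new_total) wherever 'total' changes."""
    changes = []
    previous = None
    for offset, page in pages:
        total = page.get("total")
        if previous is not None and total != previous:
            changes.append((offset, previous, total))
        previous = total
    return changes


def fetch_from_api(offset: int, limit: int) -> dict:
    """Hypothetical fetcher; the parameter names are assumed, not taken from the issue."""
    import requests  # imported here so the helpers above work without it

    response = requests.get(
        SEARCH_URL,
        params={"query": "myxococcus", "offset": offset, "limit": limit},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()
```

Calling `find_total_changes(iter_pages(fetch_from_api))` would then list the offsets at which the reported total changed; an empty list means the total stayed constant across the pages that were fetched. Passing the fetcher in as an argument keeps the comparison logic testable without network access.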

cfiorelli commented 9 months ago

@wyplogin Could you check if you're still seeing this issue? It might be resolved or I'm failing to repro it correctly. Thank you!

cfiorelli commented 9 months ago

Closing for lack of an update from the requester; I was unable to reproduce the issue.