achillean / shodan-python

The official Python library for Shodan
https://developer.shodan.io

Notice: fewer results were saved than requested #145

Closed alketshabani closed 3 years ago

alketshabani commented 4 years ago

Hello,

I set --limit 5000 in my query to download data from Shodan, but I get: "Notice: fewer results were saved than requested".

Any help would be appreciated.

gjfrommemel commented 3 years ago

I believe this error is most likely due to using a free API key, which limits results to a single page.
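For anyone hitting this, the library's api.info() method returns the account's plan and remaining query credits, which makes it easy to check whether a key is free-tier before debugging further. A sketch; the plan names treated as free tiers here are assumptions, so verify them against your own api.info() output:

```python
import os

# Assumed free-tier plan names -- check your own api.info() output,
# since the exact values depend on your account type.
FREE_PLANS = {"oss", "dev"}

def is_free_plan(info: dict) -> bool:
    """Return True if the account info dict from api.info() looks free-tier."""
    return str(info.get("plan", "")).lower() in FREE_PLANS

def main():
    from shodan import Shodan  # pip install shodan

    api = Shodan(os.environ["SHODAN_API_KEY"])
    info = api.info()  # returns plan and credit details for the key
    print(f"Plan: {info.get('plan')}, query credits: {info.get('query_credits')}")
    if is_free_plan(info):
        print("Free-tier key: search results are limited to the first page.")

# main()  # uncomment to run against the live API
```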

achillean commented 3 years ago

Closing as not relevant. Please contact support@shodan.io. This repository is for technical discussions around the Python library - not customer support.

nuubnuub commented 1 year ago

Bringing this up again, as I'm not using the free API tier (corporate key), and I'm consistently getting the same truncated results.

I'm aware that the search() and search_cursor() methods fail silently when the underlying API call fails, but it has reached the point where I query a single host, the reported count is 14, and I can only retrieve 5 of them. If it were something like 10,000 total results with only 8,500 retrievable, that would be more understandable, but lately I can't even retrieve 50% of the total results.

import logging
import os

from shodan import Shodan

API_KEY = os.environ['Shodan_API_Key']
api = Shodan(API_KEY)

query = 'hostname:my.query'
limit = api.count(query=query)['total']

logging.info(f'Total Results to Parse: {limit}')
print(f'Total Results to Parse: {limit}')

counter = 0
info_wanted = []

for banner in api.search_cursor(query, retries=100):  # The retries are arbitrary; whether it's 5 or 100, it still fails.
    counter += 1
    try:
        info_d = {'ip_str': banner['ip_str'],
                  'country_name': banner['location']['country_name'],
                  'longitude': banner['location']['longitude'],
                  'latitude': banner['location']['latitude'],
                  'hostnames': banner['hostnames']}
        logging.info(f'Information Parsed. Parsed: {counter}, Remaining: {limit - counter}')
        print(f'Information Parsed. Parsed: {counter}, Remaining: {limit - counter}')
    except KeyError:
        # Banner is missing an expected field; fall back to .get() so the
        # same lookups don't raise a second time.
        logging.info(f'Neutral Result Parsed. Parsed: {counter}, Remaining: {limit - counter}')
        print(f'Neutral Result Parsed. Parsed: {counter}, Remaining: {limit - counter}')
        location = banner.get('location', {})
        info_d = {'ip_str': banner.get('ip_str'),
                  'country_name': location.get('country_name'),
                  'longitude': location.get('longitude'),
                  'latitude': location.get('latitude'),
                  'hostnames': banner.get('hostnames', [])}
    info_wanted.append(info_d)
    # Keep track of how many results have been downloaded so we don't use up all our query credits
    if counter >= limit:
        break
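Not a fix, but a way to see what is actually going wrong: search_cursor() swallows the API error and simply stops iterating, while calling search() page by page lets the exception surface. A sketch under the assumption of the standard 100-results-per-page behavior; fetch_all and pages_needed are hypothetical helpers, not library functions:

```python
import math

PAGE_SIZE = 100  # Shodan returns search results in pages of 100

def pages_needed(total: int, page_size: int = PAGE_SIZE) -> int:
    """Number of pages required to cover `total` results (at least 1)."""
    return max(1, math.ceil(total / page_size))

def fetch_all(api, query: str):
    """Page through results with api.search(query, page=n), which raises
    shodan.APIError on failure instead of stopping silently the way
    search_cursor() does."""
    from shodan.exception import APIError

    total = api.count(query)['total']
    matches = []
    for page in range(1, pages_needed(total) + 1):
        try:
            batch = api.search(query, page=page)
        except APIError as e:
            print(f'Page {page} failed: {e}')  # the real error, not a silent stop
            break
        matches.extend(batch['matches'])
    if len(matches) < total:
        print(f'Warning: retrieved {len(matches)} of {total} results')
    return matches

# Usage (requires a live client):
#   api = shodan.Shodan(os.environ['Shodan_API_Key'])
#   results = fetch_all(api, 'hostname:my.query')
```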

The output of the results, showing how poor the search results are:

[Screenshot, 2023-06-21 11:04 AM: output showing truncated results]

And for even larger queries:

[Screenshot, 2023-06-21 11:38 AM: output for a larger query]

I don't know if this will get addressed, but any solution would be nice!