sensepost / snoopy-ng

Snoopy v2.0 - modular digital terrestrial tracking framework

Wigle lookups exception #48

Open maximcherny opened 9 years ago

maximcherny commented 9 years ago

Periodically, I get:

Exception in thread wigle:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/opt/snoopy-ng/plugins/wigle.py", line 112, in run
    if 'shun' in locations['error']:
TypeError: string indices must be integers

I suspect this happens once you go over the daily query limit, but I haven't been able to confirm it. Have you seen this before?
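For reference, the TypeError means `locations` was a string rather than a dict when line 112 ran. A minimal sketch of a guard is below, assuming the lookup may hand back either a dict or a raw string. The helper name `is_shunned` is hypothetical and not part of snoopy-ng.

```python
def is_shunned(locations):
    """Return True only when a dict-style result reports a 'shun' error."""
    # A plain string cannot be indexed with 'error', so there is
    # no structured error to inspect in that case.
    if not isinstance(locations, dict):
        return False
    # Tolerate a missing or empty 'error' entry.
    error = locations.get('error') or ''
    return 'shun' in str(error)
```

With a guard like this the thread would keep running on unexpected responses, though it would not explain why a string came back in the first place.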

maximcherny commented 9 years ago

Also, there is an occasional:

/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py:206: Warning: Out of range value for column 'last_update' at row 1
  r = r + self.execute(query, a)

maximcherny commented 9 years ago

In addition, it may be worth switching to the JSON-based API (introduced with the Wigle.net site refresh circa November 2014): the legacy site now shows a rather inconspicuous "NOTE: this version of the site is slated for deactivation!" warning at the very top of the page.

I have got a working prototype if you are interested.

maximcherny commented 9 years ago

Going back to my original comment, the issue is caused by the fact that the string "error" can actually be a valid portion of the returned HTML. For example, looking up SSID "infinity":

<tr class="search">
    <td><a href="/gps/gps/Map/onlinemap2/?maplat=37.68862915&maplon=-97.32712555&mapzoom=17&ssid=infinity&netid=00:0a:95:f3:23:4f">Get Map</a></td>
    <td>00:0a:95:f3:23:4f</td>
    <td>infinity</td>
    <td>&nbsp;</td>
    <td>error</td>
    <td>BSS</td>    
    <td>?</td>
    <td>?</td>
    <td>0000-00-00 00:00:00</td>
    <td>2008-10-24 03:10:00</td>
    <td>0011</td>
    <td>Y</td>
    <td>37.68862915</td>
    <td>-97.32712555</td>
    <td>20081024031000</td>
    <td>0</td>
    <td>0</td>
    <td>7</td>
    <td>N</td>
</tr>

Somewhat unexpected but still possible.
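The false positive can be reproduced with an abbreviated copy of the row above. This is only a sketch: the helper names `naive_has_error` and `cells` are hypothetical, and the substring test is an assumption about how such a check might be written, not a quote from wigle.py.

```python
import re

# Abbreviated version of the "infinity" row from the legacy HTML results.
ROW = '''<tr class="search">
    <td>00:0a:95:f3:23:4f</td>
    <td>infinity</td>
    <td>&nbsp;</td>
    <td>error</td>
    <td>BSS</td>
</tr>'''

def naive_has_error(html):
    # Substring test: matches "error" anywhere, including ordinary cell data.
    return 'error' in html

def cells(html):
    # Text of every <td> cell, in document order.
    return re.findall(r'<td>(.*?)</td>', html, re.S)

print(naive_has_error(ROW))  # True, although the lookup itself succeeded
print(cells(ROW)[3])         # error  (a data cell, not an error message)
```

The match comes from a table cell rather than from an error page, so any check that treats the word "error" as a failure signal needs to look at where it occurs.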

glennzw commented 9 years ago

Well found on that bug. A few SSIDs mess things up (e.g. an SSID with the word "error" in it), as does an account shun.

Yes, it'd be great to have a look at your code for the new API.

maximcherny commented 9 years ago

Below is my code. Luckily, someone has already implemented a Python-based API client and published it on the cheese shop (PyPI): https://github.com/viraptor/wigle/tree/master/wigle | https://pypi.python.org/pypi/wigle/0.0.4. I haven't played with it yet, but it looks like it ticks all the boxes, including built-in pagination handling.

import datetime
import requests
import json

class InvalidCredentials(Exception):
    pass

class InvalidQueryParams(Exception):
    pass

class WigleApi(object):
    # Minimal client for the Wigle JSON API (v1) with simple paging support.
    api_root       = 'https://wigle.net/api/v1/'
    login_endpoint = 'jsonLogin'
    user_endpoint  = 'jsonUser'
    query_endpoint = 'jsonSearch'
    query_args     = {'Query': 'Query'}
    user_agent     = 'Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)'
    timeout        = 5

    def __init__(self, username, password):
        self.username      = username
        self.password      = password
        self.user_info     = None
        self.last_params   = None
        self.last_result   = None
        self.session       = requests.Session()

        self.init_session()

    def init_session(self):
        r = self.session.post(self.api_root + self.login_endpoint, data={
                              'credential_0': self.username,
                              'credential_1': self.password},
                              headers=self.get_headers(), timeout=self.timeout)
        data = json.loads(r.text)
        if data['success']:
            del data['success']
            self.user_info = data
        else:
            raise InvalidCredentials

    def get_headers(self):
        return {'User-Agent': self.user_agent}

    def get_user_info(self):
        return self.user_info

    def query(self, **kwargs):
        self.last_result = None

        params = {
            'addresscode': '',
            'statecode'  : '',
            'zipcode'    : '',
            'variance'   : 0.01,
            'latrange1'  : '',
            'latrange2'  : '',
            'longrange1' : '',
            'longrange2' : '',
            'lastupdt'   : '',
            'netid'      : '',
            'ssid'       : '',
            'freenet'    : False,
            'paynet'     : False,
            'onlymine'   : False,
            'Query'      : 'Query'
        }

        if set(kwargs.keys()) - set(params.keys()):
            raise InvalidQueryParams

        params.update(kwargs)

        for key in ['freenet', 'paynet', 'onlymine']:
            if not params[key]:
                del params[key]

        if isinstance(params['lastupdt'], datetime.datetime):
            params['lastupdt'] = params['lastupdt'].strftime('%Y%m%d%H%M%S')

        self.last_params = params

        r = self.session.post(url=self.api_root + self.query_endpoint,
                              data=params, headers=self.get_headers(),
                              timeout=self.timeout)

        return self.process_query_response(r)

    def has_next(self):
        # A full page (100 results) suggests that more results may follow.
        return bool(self.last_result and self.last_result['count'] == 100)

    def get_next(self):
        if not self.has_next() or not self.last_params:
            return None

        params = self.last_params
        params['first'] = self.last_result['last'] + 1
        params['last'] = self.last_result['last'] + self.last_result['count']

        self.last_params = params

        # POST the follow-up page, as query() does; a GET would not normally carry form data.
        r = self.session.post(url=self.api_root + self.query_endpoint,
                              data=params, headers=self.get_headers(),
                              timeout=self.timeout)

        return self.process_query_response(r)

    def process_query_response(self, r):
        data = json.loads(r.text)

        # On failure, report the API's message and return no results.
        if not data['success']:
            print(data['message'])
            return []

        self.last_result = {
            'count' : data['resultCount'],
            'first' : data['first'],
            'last'  : data['last']
        }

        return data['results']
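
To walk all pages of a query, the `query()`, `has_next()` and `get_next()` methods above could be combined as in the sketch below. The generator name `iter_results` is hypothetical, and it only assumes an object exposing those three methods, so it is not tied to a live Wigle session.

```python
def iter_results(api, **query):
    """Yield every result for a query, following pages until exhausted."""
    page = api.query(**query)
    while page:
        for row in page:
            yield row
        # Stop once the client reports that the last page was not full.
        if not api.has_next():
            break
        page = api.get_next()
```

Usage would look something like `for net in iter_results(WigleApi(username, password), ssid='infinity'): ...`, where `username` and `password` stand in for real credentials.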