Open maximcherny opened 9 years ago
Also, there is an occasional:
/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py:206: Warning: Out of range value for column 'last_update' at row 1
r = r + self.execute(query, a)
In addition, it may be worth switching to the JSON-based API (introduced with the Wigle.net site refresh circa November 2014), since the legacy site now shows a rather unobtrusive "NOTE: this version of the site is slated for deactivation!" warning at the very top of the page.
I have got a working prototype if you are interested.
Going back to my original comment, the issue is caused by the fact that the string "error" can actually be a valid portion of the returned HTML. For example, looking up SSID "infinity":
<tr class="search">
<td><a href="/gps/gps/Map/onlinemap2/?maplat=37.68862915&maplon=-97.32712555&mapzoom=17&ssid=infinity&netid=00:0a:95:f3:23:4f">Get Map</a></td>
<td>00:0a:95:f3:23:4f</td>
<td>infinity</td>
<td> </td>
<td>error</td>
<td>BSS</td>
<td>?</td>
<td>?</td>
<td>0000-00-00 00:00:00</td>
<td>2008-10-24 03:10:00</td>
<td>0011</td>
<td>Y</td>
<td>37.68862915</td>
<td>-97.32712555</td>
<td>20081024031000</td>
<td>0</td>
<td>0</td>
<td>7</td>
<td>N</td>
</tr>
Somewhat unexpected but still possible.
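To illustrate the false positive, here is a minimal sketch. Both function names are hypothetical and the substring check is only a stand-in for whatever the current code does. The second function assumes the JSON API reports failure through a boolean `success` field, so an SSID or table cell that happens to read "error" cannot trip it.

```python
import json


def naive_has_error(body):
    # Hypothetical substring check: flags ANY body containing "error",
    # including a legitimate table cell or SSID.
    return 'error' in body.lower()


def json_has_error(body):
    # Assumption: the JSON API signals failure via a boolean "success" field.
    return not json.loads(body).get('success', False)


# A valid HTML result row that contains the literal text "error".
html_row = '<tr class="search"><td>infinity</td><td>error</td></tr>'

# A successful JSON response whose result data contains "error" as an SSID.
json_body = json.dumps({'success': True, 'results': [{'ssid': 'error'}]})
```

With these inputs the substring check reports an error for the valid HTML row, while the structured check only looks at the status field and ignores the result data.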
Well spotted on that bug. A few SSIDs mess things up (e.g. an SSID containing the word "error"), as does an account shun.
Yes, it'd be great to have a look at your new API code.
Below is my code, but luckily someone already implemented a Python-based API client and made it available via the cheese shop - https://github.com/viraptor/wigle/tree/master/wigle | https://pypi.python.org/pypi/wigle/0.0.4. I haven't played with it yet but it looks like it ticks all the boxes including the built-in pagination handling.
import datetime
import json

import requests


class InvalidCredentials(Exception):
    pass


class InvalidQueryParams(Exception):
    pass


class WigleApi(object):
    api_root = 'https://wigle.net/api/v1/'
    login_endpoint = 'jsonLogin'
    user_endpoint = 'jsonUser'
    query_endpoint = 'jsonSearch'
    query_args = {'Query': 'Query'}
    user_agent = 'Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)'
    timeout = 5

    def __init__(self, username, password):
        self.username = username
        self.password = password
        self.user_info = None
        self.last_params = None
        self.last_result = None
        self.session = requests.Session()
        self.init_session()

    def init_session(self):
        # Log in once; the session keeps the auth cookie for later queries.
        r = self.session.post(self.api_root + self.login_endpoint, data={
            'credential_0': self.username,
            'credential_1': self.password},
            headers=self.get_headers(), timeout=self.timeout)
        data = json.loads(r.text)
        if data['success']:
            del data['success']
            self.user_info = data
        else:
            raise InvalidCredentials

    def get_headers(self):
        return {'User-Agent': self.user_agent}

    def get_user_info(self):
        return self.user_info

    def query(self, **kwargs):
        self.last_result = None
        params = {
            'addresscode': '',
            'statecode': '',
            'zipcode': '',
            'variance': 0.01,
            'latrange1': '',
            'latrange2': '',
            'longrange1': '',
            'longrange2': '',
            'lastupdt': '',
            'netid': '',
            'ssid': '',
            'freenet': False,
            'paynet': False,
            'onlymine': False,
            'Query': 'Query'
        }
        # Reject any keyword that is not a known search parameter.
        if set(kwargs.keys()) - set(params.keys()):
            raise InvalidQueryParams
        params.update(kwargs)
        # Boolean flags are only sent when they are set.
        for key in ['freenet', 'paynet', 'onlymine']:
            if not params[key]:
                del params[key]
        if isinstance(params['lastupdt'], datetime.datetime):
            params['lastupdt'] = params['lastupdt'].strftime('%Y%m%d%H%M%S')
        self.last_params = params
        r = self.session.post(url=self.api_root + self.query_endpoint,
                              data=params, headers=self.get_headers(),
                              timeout=self.timeout)
        return self.process_query_response(r)

    def has_next(self):
        # A full page of 100 results suggests there may be more to fetch.
        return bool(self.last_result) and self.last_result['count'] == 100

    def get_next(self):
        if not self.has_next() or not self.last_params:
            return None
        params = self.last_params
        params['first'] = self.last_result['last'] + 1
        params['last'] = self.last_result['last'] + self.last_result['count']
        self.last_params = params
        # Note: this sends the parameters with GET, whereas query() uses POST.
        r = self.session.get(url=self.api_root + self.query_endpoint,
                             data=params, headers=self.get_headers(),
                             timeout=self.timeout)
        return self.process_query_response(r)

    def process_query_response(self, r):
        data = json.loads(r.text)
        # Fixed: the check was "if not ['success']", which is never true.
        if not data['success']:
            print(data['message'])
            return []
        self.last_result = {
            'count': data['resultCount'],
            'first': data['first'],
            'last': data['last']
        }
        return data['results']
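For reference, one possible way to drain every page with the class above. This is only a sketch: `iter_all_results` is a hypothetical helper that is not part of the prototype, and it assumes `query()`, `has_next()` and `get_next()` behave as written.

```python
def iter_all_results(api, **query):
    """Yield every result row for a query, following pagination."""
    # First page comes from query().
    for row in api.query(**query):
        yield row
    # Keep requesting pages for as long as the client reports another one.
    while api.has_next():
        # get_next() returns None when there is no further page.
        for row in api.get_next() or []:
            yield row


# Intended usage (needs valid credentials and network access):
# api = WigleApi('user', 'secret')
# for network in iter_all_results(api, ssid='infinity'):
#     print(network)
```

Because it is a generator, the next page is only requested once the caller has consumed the current one, which may help stay under whatever query limit applies.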
Periodically, I get:
I suspect this happens when you eventually go over the daily query limit, but haven't been able to confirm. Have you seen this before?