Search queries are not URL encoded

alaaalii commented 8 years ago

If I try to run a search against /api/v1/search/pulses with spaces in the query, I get a Bad Request:

>>> otx.search_pulses('phpMyAdmin honeypot logs for 2016-10-01')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "OTXv2.py", line 293, in search_pulses
    return self._get_paginated_resource(search_pulses_url, max_results=max_results)
  File "OTXv2.py", line 317, in _get_paginated_resource
    json_data = self.get(next_page_url)
  File "OTXv2.py", line 79, in get
    raise BadRequest("Bad Request")

OTXv2.BadRequest: 'Bad Request'

This is because the query in the search_pulses method is not URL encoded, so it tries to GET something like this: https://otx.alienvault.com/api/v1/search/pulses?q=phpMyAdmin honeypot logs for 2016-10-01&.

To fix this we can use urllib.quote_plus for Python 2 (or urllib.parse.quote_plus() for Python 3) to properly encode our query.

>>> import urllib
>>> urllib.quote_plus('phpMyAdmin honeypot logs for 2016-10-01')
'phpMyAdmin+honeypot+logs+for+2016-10-01'
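
For reference, here is a minimal sketch of the same call that runs on both Python 2 and Python 3. It also shows that reserved characters such as `=` and `&`, which would otherwise be read as parameter separators, get percent-encoded. The variable names are only for illustration.

```python
try:
    from urllib import quote_plus        # Python 2
except ImportError:
    from urllib.parse import quote_plus  # Python 3

# Spaces become '+', so the query is safe to place after "q=".
encoded = quote_plus('phpMyAdmin honeypot logs for 2016-10-01')
print(encoded)

# Reserved characters are percent-encoded instead of being passed through,
# so they cannot be mistaken for extra query parameters.
tricky = quote_plus('name=a&b')
print(tricky)
```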

So a quick fix would be to change q=query in the search_pulses method to q=urllib.quote_plus(query), like so:

def search_pulses(self, query, max_results=25):
    # Encode the user-supplied query before it is placed in the URL
    search_pulses_url = self.create_url(SEARCH_PULSES, q=urllib.quote_plus(query), page=1, limit=20)
    return self._get_paginated_resource(search_pulses_url, max_results=max_results)

I'm not sure whether this is the norm for APIs, but wouldn't it make sense for your Python API module to do the URL encoding itself? If you expect the user to do it, then maybe mention that in the API module documentation?
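
To illustrate what doing the encoding inside the module could look like, here is a rough sketch that builds the search URL with `urlencode`, which encodes every parameter value the same way `quote_plus` does. `build_search_url`, `API_BASE` and `SEARCH_PULSES_PATH` are hypothetical names made up for this example. They are not the SDK's actual `create_url` or constants, which may work differently.

```python
try:
    from urllib import urlencode        # Python 2
except ImportError:
    from urllib.parse import urlencode  # Python 3

# Hypothetical constants for illustration only.
API_BASE = 'https://otx.alienvault.com'
SEARCH_PULSES_PATH = '/api/v1/search/pulses'


def build_search_url(query, page=1, limit=20):
    """Return a search URL with every query parameter URL encoded."""
    # A list of pairs keeps the parameter order stable across Python versions.
    params = [('q', query), ('page', page), ('limit', limit)]
    return API_BASE + SEARCH_PULSES_PATH + '?' + urlencode(params)


url = build_search_url('phpMyAdmin honeypot logs for 2016-10-01')
print(url)
```

With something like this, callers could pass the raw search string and never have to think about encoding.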

bsmartt13 commented 8 years ago

Nice find! I'm too lazy to make a pull request right now but quote_plus can be imported cleanly from the correct place depending on python version:

try:
    # For Python2
    from urllib2 import URLError, HTTPError, build_opener, ProxyHandler, urlopen, Request
    # quote_plus lives in urllib, not urllib2, on Python 2
    from urllib import quote_plus
except ImportError:
    # For Python3
    from urllib.error import URLError, HTTPError
    from urllib.request import build_opener, ProxyHandler, urlopen, Request
    from urllib.parse import quote_plus

And then like @alaaalii said,

def search_pulses(self, query, max_results=25):
    # Encode the user-supplied query before it is placed in the URL
    search_pulses_url = self.create_url(SEARCH_PULSES, q=quote_plus(query), page=1, limit=20)
    return self._get_paginated_resource(search_pulses_url, max_results=max_results)

alaaalii commented 8 years ago

Agreed. There are a few things I'm hacking up into this OTX SDK for a project I'm working on (for one thing, I've replaced urllib with requests...much better API), so I'll probably make a pull request by the end of the week (which will include the fix for this URL encoding issue as well). I'll close the issue when I submit the pull request.

AlienVault-OTX / OTX-Python-SDK

Search queries are not URL encoded #18