clips / pattern

Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
https://github.com/clips/pattern/wiki
BSD 3-Clause "New" or "Revised" License

Yahoo search query creation #21

Closed by rosner 11 years ago

rosner commented 11 years ago

Hi,

I just tried to use the Yahoo SearchEngine class. I signed up for the Yahoo BOSS API. Everything works fine, but consider this example:

from pattern.web import Yahoo

yahoo = Yahoo(license=(KEY, SECRET))  # KEY and SECRET: my BOSS API credentials
yahoo.search('Yahoo Reuters Jobs')

The query that gets generated looks like this:

Yahoo_Reuters_Jobs

I would expect the query to look something like this: Yahoo%20Reuters%20Jobs

So I URL-quote my queries before passing them to the search method:

from urllib import quote

yahoo.search(quote('Yahoo Reuters Jobs'))

which works as I expected.

Here's the implementation of the query construction:

url = URL(url, method=GET, query={
    "q": oauth.normalize(query.replace(" ", "_")),
    "start": 1 + (start-1) * count,
    "count": min(count, type==IMAGE and 35 or 50),
    "format": "json"
})

Any thoughts on that or am I missing something?
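
For readers following the quoted implementation, here is a minimal standalone sketch of how its parameters appear to be computed. It is not pattern's actual code: `build_params`, the `SEARCH` and `IMAGE` string values, and the default arguments are hypothetical stand-ins, and the `oauth.normalize` call is left out. It only illustrates the underscore replacement reported here, the 1-based result offset, and the `and ... or ...` idiom that caps `count` at 35 for image searches and 50 otherwise.

```python
# Hypothetical stand-ins for pattern's search-type constants (assumed values).
SEARCH = "search"
IMAGE = "image"

def build_params(query, type=SEARCH, start=1, count=10):
    """Mimic the parameter arithmetic of the quoted snippet (oauth.normalize omitted)."""
    # Equivalent to the older idiom: type == IMAGE and 35 or 50
    max_count = 35 if type == IMAGE else 50
    return {
        "q": query.replace(" ", "_"),      # the behavior this issue reports
        "start": 1 + (start - 1) * count,  # 1-based offset of the first result on the page
        "count": min(count, max_count),    # capped per search type
        "format": "json",
    }

params = build_params("Yahoo Reuters Jobs", start=3, count=10)
print(params["q"], params["start"], params["count"])
```

With `start=3` and `count=10`, the offset works out to 21, and the spaces in the query have already been replaced by underscores before any URL encoding takes place.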

tom-de-smedt commented 11 years ago

When the BOSS service started, there were some issues with special characters and the like in query keywords. I think the "_" had something to do with that. However, as you point out, the right way to encode a space is "%20" or "+". I've updated the latest revision. Thanks for spotting!
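
To illustrate the two encodings mentioned above, here is a small sketch using only the Python standard library. It assumes Python 3, where `quote` lives in `urllib.parse` (the `from urllib import quote` used earlier in this thread is the Python 2 location). It does not show what the updated revision of pattern does internally; it only compares the underscore replacement with the "%20" and "+" forms.

```python
from urllib.parse import quote, quote_plus, urlencode

query = "Yahoo Reuters Jobs"

# What the original implementation sent: spaces replaced by underscores,
# which turns three keywords into a single different token.
underscored = query.replace(" ", "_")

# Percent-encoding: each space becomes %20.
percent_encoded = quote(query)

# Form-style encoding: each space becomes +.
plus_encoded = quote_plus(query)

# urlencode builds a whole query string and uses quote_plus by default.
query_string = urlencode({"q": query, "format": "json"})

print(underscored)      # Yahoo_Reuters_Jobs
print(percent_encoded)  # Yahoo%20Reuters%20Jobs
print(plus_encoded)     # Yahoo+Reuters+Jobs
print(query_string)     # q=Yahoo+Reuters+Jobs&format=json
```

Both "%20" and "+" decode back to a space on the server side, whereas the underscore is sent as a literal character.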