Netflix-Skunkworks / Scumblr

Web framework that allows performing periodic syncs of data sources and performing analysis on the identified results
Apache License 2.0
2.64k stars 318 forks

Google Search - included sites #57

Closed trishana closed 9 years ago

trishana commented 9 years ago

Hi everyone

I'm not sure if I configured Google Search properly and I need your help. I want to run specific queries on Google using only included sites.

I have my API key and Custom Search Engine set up. With the "Search the entire web but emphasize included sites" option selected in the Custom Search Engine, I get about as many results as I set in "Max results" (in Scumblr). However, with "Search only included sites" (I listed 10 sites to search) I get only 10 results every time. I read Issue #19 ( https://github.com/Netflix/Scumblr/issues/19 ) and I have updated Scumblr, but I still get only 10 results for "included sites".

I'd be grateful for your help, Cheers ;-)

ahoernecke commented 9 years ago

Hi @trishana,

I want to make sure I'm understanding correctly. If you use a custom search engine (cx) configured for "Search the entire web" you get more than 10 results, but if you use a cx configured for "Search only included sites" you only get 10 (or fewer)?

Assuming you have specified the max results as 100 in both searches, I'm not sure what would make this not work for "Search only included sites" cxs. The logic should be identical. Not knowing what you're searching for and on what sites, is it possible that there are only ~10 results for that specific search?
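
For context on why exactly 10 results is a suspicious number: the Google Custom Search JSON API returns at most 10 items per request (`num`) and pages through results with a 1-based `start` parameter, serving no more than the first 100 results. A client that wants "Max results" above 10 therefore has to issue several requests. The following is a minimal Python sketch of that paging arithmetic only; it is not Scumblr's actual code, and `page_requests`, `PAGE_SIZE`, and `API_CAP` are hypothetical names introduced for illustration.

```python
from typing import Iterator, Tuple

PAGE_SIZE = 10   # the API returns at most 10 items per request
API_CAP = 100    # the API does not serve results beyond the first 100


def page_requests(max_results: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, num) pairs covering up to max_results results.

    Hypothetical helper: each pair would become the `start` and `num`
    query parameters of one Custom Search request.
    """
    remaining = min(max_results, API_CAP)
    start = 1  # the `start` parameter is 1-based
    while remaining > 0:
        num = min(PAGE_SIZE, remaining)
        yield start, num
        start += num
        remaining -= num


if __name__ == "__main__":
    for start, num in page_requests(25):
        print(f"start={start} num={num}")
```

If only the first of these requests succeeds, or the engine simply has no further matches, the search stops at 10 results regardless of the configured maximum.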

Also, if some of the results are identified by the "search the entire web" search but also included in your site specific search, they won't be pulled in a second time (although the result(s) should become associated with the second search as well).
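
The de-duplication behaviour described above can be illustrated with a small Python sketch. It assumes results are keyed by URL, which is an assumption made here for illustration and not a statement about Scumblr's internals; `merge_results` and the search names are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def merge_results(
    searches: Dict[str, Iterable[str]],
) -> Tuple[Dict[str, Set[str]], Dict[str, int]]:
    """Merge per-search URL lists into one result set.

    Returns (results, new_counts): results maps each URL to the set of
    searches that found it; new_counts maps each search to the number
    of URLs it contributed that had not been seen before.
    """
    results: Dict[str, Set[str]] = {}
    new_counts: Dict[str, int] = {}
    for name, urls in searches.items():
        new = 0
        for url in urls:
            if url not in results:
                results[url] = set()
                new += 1
            # An already-known URL is not stored twice, only associated
            # with this search as well.
            results[url].add(name)
        new_counts[name] = new
    return results, new_counts


if __name__ == "__main__":
    results, new_counts = merge_results({
        "entire_web": ["a", "b", "c"],
        "included_only": ["b", "c", "d"],
    })
    print(new_counts)
    print(sorted(results["b"]))
```

Under this model, a site-specific search that runs after a broader one can appear to bring in fewer new results than it actually matched, because the overlapping URLs are only re-associated.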

If you're still having issues would you mind pinging me in our gitter channel (https://gitter.im/Netflix/Scumblr)? I've found it a little easier to have these types of conversations through that interface.

trishana commented 9 years ago

OK, it is solved! I don't know why, but everything has worked since I added the credentials again and created a new API key. With the new key it works fine - there's no limit on Google search results (for both "Search the entire web" and "Search only included sites"). Thank you @ahoernecke, cheers! ;)

ahoernecke commented 9 years ago

No problem, glad you got it working!