medialab / SearchEnginesBookmarklet

Extract list of results from search engines pages as CSV with a bookmarklet directly within the browser
https://medialab.github.io/SearchEnginesBookmarklet/
GNU General Public License v3.0
baidu bing duckduckgo google qwant scraping search-engine

SearchEnginesBookmarklet

Harvesting lists of URLs, titles, dates and descriptions from a query on a search engine such as Google, DuckDuckGo, Baidu, Bing or Qwant is a recurrent need in digital methods, and one that is hard to automate because of those websites' restrictions on robots.

SearchEnginesBookmarklet is a low-tech solution to this need: it offers an easy way to collect these results directly from within your browser.
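
As a rough illustration of the kind of export described here, the sketch below turns a list of result objects into CSV text. It is not taken from the project's code: the `toCsv` and `csvCell` helpers, the field names and the sample data are all assumptions made for the example.

```javascript
// Hypothetical field names, following the list given in this README
// (urls, titles, dates, descriptions); not taken from the project's code.
const FIELDS = ["url", "title", "date", "description"];

// Quote a value the usual CSV way: wrap it in double quotes
// and double any double quote it contains.
function csvCell(value) {
  const text = value == null ? "" : String(value);
  return '"' + text.replace(/"/g, '""') + '"';
}

// Turn an array of result objects into CSV text with a header row.
function toCsv(results) {
  const header = FIELDS.join(",");
  const lines = results.map((r) => FIELDS.map((f) => csvCell(r[f])).join(","));
  return [header, ...lines].join("\n");
}

// Example with made-up data.
const sample = [
  {
    url: "https://example.org/",
    title: 'A "quoted" title',
    date: "2020-01-01",
    description: "Text, with a comma",
  },
];
console.log(toCsv(sample));
```

Quoting every cell keeps commas and quotes inside titles or descriptions from breaking the columns.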

Install it in a few clicks from the following page: https://medialab.github.io/SearchEnginesBookmarklet/

It works as a small icon to drag and drop into your browser's bookmarks bar, allowing you to extract the list of results from a search engine page as CSV.

Install local version for development

# Install node's express dependency
npm install express

# Create an HTTPS key & certificate set
openssl genrsa -out key.pem
openssl req -new -key key.pem -out csr.pem
openssl x509 -req -days 9999 -in csr.pem -signkey key.pem -out cert.pem
rm csr.pem

# Run your local HTTPS server
node serve-https.js

# Edit SearchEnginesBookmarklet.js to comment out its second line and uncomment its third one

# Load the following page in your browser to accept the unsafe certificate first
https://localhost:4443/

# Then install your development version of the bookmarklet as usual by dragging and dropping the image from that page into your bookmarks bar
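
For readers unfamiliar with bookmarklets: a bookmarklet is a bookmark whose address is a `javascript:` URL that runs in the current page. One common pattern, assumed here and not confirmed from this project's code, is a small loader that injects a script tag pointing at a hosted script. The `buildBookmarklet` helper and the script path below are hypothetical; only the host and port come from the steps above.

```javascript
// Build a `javascript:` URL that injects a <script> tag loading `scriptUrl`.
// Hypothetical helper for illustration; not part of SearchEnginesBookmarklet.
function buildBookmarklet(scriptUrl) {
  const loader =
    "(function(){" +
    "var s=document.createElement('script');" +
    "s.src=" + JSON.stringify(scriptUrl) + ";" +
    "document.body.appendChild(s);" +
    "})();";
  // Percent-encode the loader so it can be stored as a bookmark address.
  return "javascript:" + encodeURIComponent(loader);
}

// Assumed path: the file name is the script mentioned in the steps above,
// but where the local server actually exposes it is a guess.
console.log(buildBookmarklet("https://localhost:4443/SearchEnginesBookmarklet.js"));
```

With a loader of this kind, switching between the published and the development version only requires changing the URL that the loader points to.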

Credits & License

Benjamin Ooghe-Tabanou, Julien Pontoire & al @ Sciences Po médialab

Discover more of our projects at médialab tools.

SearchEnginesBookmarklet is free open source software released under the GPL 3.0 license.