commonsearch / cosr-back

Backend of Common Search. Analyses webpages and sends them to the index.
https://about.commonsearch.org
Apache License 2.0

Update to Common Crawl's February 2016 crawl. #24

Closed by sylvinus 8 years ago

sylvinus commented 8 years ago

Just announced: http://blog.commoncrawl.org/2016/02/february-2016-crawl-archive-now-available/

Should be rather easy to switch in the code: https://github.com/commonsearch/cosr-back/blob/master/scripts/import_commoncrawl.sh#L13
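
As a rough sketch of what such a switch might look like, assuming the script keeps the crawl identifier in a single variable: `CRAWL_ID` and `PATHS_URL` are hypothetical names, not taken from `import_commoncrawl.sh`, and the base URL is an assumption about where Common Crawl serves its path listings. `CC-MAIN-2016-07` is the identifier of the February 2016 crawl.

```shell
#!/bin/sh
# Hypothetical sketch: variable names are illustrative and are not taken
# from scripts/import_commoncrawl.sh.

# Identifier of the crawl to import (February 2016 crawl).
CRAWL_ID="CC-MAIN-2016-07"

# Assumed location of the gzipped list of WARC file paths for that crawl.
PATHS_URL="https://data.commoncrawl.org/crawl-data/${CRAWL_ID}/warc.paths.gz"

# Print the URL so the change can be checked before anything is downloaded.
echo "$PATHS_URL"
```

With the identifier in one place, moving to a later crawl would only mean changing `CRAWL_ID`.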