medialab / sandcrawler

sandcrawler.js - the server-side scraping companion.
http://medialab.github.io/sandcrawler/
GNU Lesser General Public License v3.0

Spiders #101

Closed by Yomguithereal 9 years ago

Yomguithereal commented 9 years ago

Ping @boogheta, seriously? Wouldn't this be better than scraper since the scraper is kinda the script injected itself?

boogheta commented 9 years ago

I would go for crawler initially. Although, since artoo will be at the core of those instances, I would now rather be in favor of something a lot more significant: let's call these "droids"!

jacomyal commented 9 years ago

Though, "spider" is kind of a relevant name here, isn't it?

boogheta commented 9 years ago

Well, if you consider scrapy to be the reference (which is surprising for such a js aficionado), maybe it is. But since it is merely a metaphor for the arachnid crawling through the web, I guess the other metaphor of droids accompanying you on your tie-fighter when battling against the evil empire of anti-scraping is also very relevant.

paulgirard commented 9 years ago

What would a spider do in a sandcrawler? I would vote for Jawa. Droid is fair enough.

jacomyal commented 9 years ago

@boogheta It actually appears that Scrapy (which I am not familiar with at all) is not the only reference. I summon Wikipedia as the main witness! http://en.wikipedia.org/wiki/Web_spider

@paulgirard Also, I am pretty sure there actually are spiders in sandcrawlers. Authors probably just forgot to mention them, I guess...

jacomyal commented 9 years ago

Also http://starwars.wikia.com/wiki/Spider

Yomguithereal commented 9 years ago

Ere the fact thou art @boogheta. Behold the mighty argument @paulgirard.