medialab/sandcrawler
sandcrawler.js - the server-side scraping companion.
http://medialab.github.io/sandcrawler/
GNU Lesser General Public License v3.0
107 stars
12 forks
issues
Can I scrape up to 50,000 pages in reasonable time?
#193
scroobius-pip
opened
7 years ago
1
Status?
#192
brandondrew
opened
8 years ago
14
not a valid language tag
#191
ToruHyuga
opened
8 years ago
2
Future plans
#190
moshewe
opened
8 years ago
0
Promises in addition to callbacks?
#189
moshewe
opened
8 years ago
6
CasperJS support?
#188
moshewe
opened
8 years ago
2
PhantomJS library integration - phridge/node-phantomjs
#187
moshewe
opened
8 years ago
10
What if no scraper?
#186
Yomguithereal
opened
8 years ago
0
body.match is undefined
#185
kevinrademan
closed
3 years ago
2
Can you do child requests?
#184
kevinrademan
closed
3 years ago
5
A missing lib on a server caused sandcrawler to crash
#183
legaultpierre
closed
9 years ago
0
Noop flag for scraper methods (navigational edge cases)
#182
Yomguithereal
opened
9 years ago
0
Investigate some limit issues
#181
Yomguithereal
opened
9 years ago
0
Isomorphism
#180
Yomguithereal
opened
9 years ago
0
Display technologies used more conspicuously
#179
Yomguithereal
opened
9 years ago
0
Handle nasty charset polymorphism
#178
Yomguithereal
opened
9 years ago
4
Document encoding
#177
Yomguithereal
opened
9 years ago
0
Handle encoding heuristics
#176
Yomguithereal
closed
9 years ago
0
Proxy setting polymorphism through nodeUrl
#175
Yomguithereal
closed
9 years ago
0
Add internal flag for droid and jawa
#174
Yomguithereal
closed
9 years ago
0
Adjust spider default name
#173
Yomguithereal
closed
9 years ago
0
Handle discards in stats
#172
Yomguithereal
closed
9 years ago
0
Add a function to addUrlNow
#171
boogheta
closed
9 years ago
1
Example of json usage
#170
Yomguithereal
opened
9 years ago
0
Passing context to scraper function as last argument
#169
Yomguithereal
opened
9 years ago
1
Add encoding support
#168
Yomguithereal
closed
9 years ago
0
Iterate index definition is incorrect
#167
Yomguithereal
closed
9 years ago
0
Experiment with onInitialized and shims/sniffers
#166
Yomguithereal
opened
9 years ago
0
Documentation site pull requests
#165
nucleardreamer
opened
9 years ago
1
Check throttle
#164
Yomguithereal
closed
9 years ago
0
Check retryNow
#163
Yomguithereal
closed
9 years ago
0
Adjust the name string
#162
Yomguithereal
closed
9 years ago
0
Publish newer version to NPM
#161
prokilogrammer
closed
9 years ago
5
Phantomjs memory leak
#160
Yomguithereal
opened
9 years ago
1
Cheerio options
#159
Yomguithereal
closed
9 years ago
0
Document stats
#158
Yomguithereal
opened
9 years ago
0
Fault tolerance
#157
Yomguithereal
opened
9 years ago
0
job:end should expose status
#156
Yomguithereal
closed
9 years ago
0
Autoretry later/now polymorphism
#155
Yomguithereal
closed
9 years ago
0
Add page-level events
#154
Yomguithereal
opened
9 years ago
0
Possibility to pass config object to the phantom engine run method.
#153
Yomguithereal
closed
9 years ago
0
Harmonize res.error
#152
Yomguithereal
closed
9 years ago
0
Should not throttle the first job
#151
Yomguithereal
closed
9 years ago
0
Dual callback style result
#150
Yomguithereal
closed
9 years ago
0
Possibility not to pass a scraper
#149
Yomguithereal
closed
9 years ago
0
phantomEngine.run won't autoClose
#148
Yomguithereal
closed
9 years ago
0
Navigation edge cases
#147
Yomguithereal
closed
9 years ago
0
Bind job to static scraper scope
#146
boogheta
closed
9 years ago
2
Example of user agent plugin
#145
Yomguithereal
opened
9 years ago
0
jQuery/cheerio pitfalls
#144
Yomguithereal
opened
9 years ago
0