issues
taganaka/polipus
Polipus: distributed and scalable web-crawler framework
MIT License
92 stars, 32 forks
Edit regular expression in charge of removing anchor, simply add 'colon'
#72
ABrisset
opened
9 years ago
0
Unicode pages does not work anymore on 0.5.0
#71
nengine
opened
9 years ago
9
Propagate user_data values from redis message to fetched pages.
#70
stefanofontanelli
closed
9 years ago
0
Update AUTHORS.md
#69
lepek
closed
9 years ago
0
Issue #67 MongoDB driver ~> 2.0.6 compatible.
#68
lepek
closed
9 years ago
2
Make it work with Mongo 2.x
#67
lepek
closed
9 years ago
2
Support for headless crawling
#66
sandeepravi
opened
9 years ago
0
[Question] Using MongoDB as 'cache' aside of Redis
#65
anawolak
closed
9 years ago
0
Check for invalid uri in redirects
#64
pcboy
closed
9 years ago
0
If rethinkdb can't find the db, create it
#63
pcboy
closed
9 years ago
2
added the ability to use list user agents
#62
parallel588
closed
9 years ago
1
Fixes #40 invalid byte sequence in page#to_absolute
#61
pcboy
closed
9 years ago
3
Revert 59 fix utf8 support
#60
pcboy
closed
9 years ago
1
Fix #58: Support for non UTF-8 pages.
#59
pcboy
closed
9 years ago
1
Support for other charsets than UTF-8
#58
pcboy
closed
9 years ago
1
Anchor links converted to %23 causing 404 errors
#57
ABrisset
opened
9 years ago
0
RethinkDB Storage
#56
nofxx
closed
9 years ago
0
RethinkDB Storage
#55
nofxx
closed
9 years ago
2
Update gems & organize spec
#54
nofxx
closed
9 years ago
3
Use svg versions of badges [ci skip]
#53
tmaier
closed
10 years ago
1
Cannot use with mongoid ~> 4.0.0
#52
nengine
opened
10 years ago
5
Fix cookie acceptance when Set-Cookie is nil
#51
Proffard
closed
10 years ago
3
SocketError could mean, domain is gone or no internet connection
#50
tmaier
opened
10 years ago
0
Add support for all redirect status codes
#49
tmaier
closed
10 years ago
2
Cannot install on JRuby 1.7.13. Error with bson_ext-1.9.2
#48
nengine
opened
10 years ago
9
Kill s3 entirely, use Fog, yo!
#47
taganaka
opened
10 years ago
0
[WIP] Plugins architecture proposal
#46
taganaka
opened
10 years ago
4
QueueOverflow refactoring
#45
taganaka
closed
10 years ago
7
Fix typo in Cleaner plugin
#44
janpieper
closed
10 years ago
2
Whitelist start urls?
#43
janpieper
opened
10 years ago
1
removing s3 storage from mainstream
#42
taganaka
closed
10 years ago
2
# encoding: UTF-8 for all
#41
taganaka
closed
10 years ago
0
invalid byte sequence in US-ASCII (ArgumentError)
#40
taganaka
closed
9 years ago
0
Faster and easier overflow management
#39
taganaka
closed
10 years ago
1
proper initialization of internal_queue
#38
taganaka
closed
10 years ago
1
Refactor Crawler
#37
tmaier
opened
10 years ago
6
Fails when response["Set-Cookie"] is nil
#36
tmaier
closed
10 years ago
3
rubocop was here
#35
taganaka
closed
10 years ago
2
INT / TERM Signal handling
#34
taganaka
closed
10 years ago
1
Track url added by #add_url
#33
tmaier
opened
10 years ago
3
Create CHANGELOG.md
#32
tmaier
closed
10 years ago
1
Add example for error handling. #15 [ci skip]
#31
tmaier
closed
10 years ago
5
Support for robots.txt
#30
taganaka
closed
10 years ago
3
On page error
#29
taganaka
closed
10 years ago
10
Minor changes to AUTHORS, README.rdoc and Gemfile
#28
tmaier
closed
10 years ago
2
Not storable on error
#27
tmaier
closed
10 years ago
4
Better http compression support
#26
taganaka
closed
10 years ago
1
Better rescue
#25
taganaka
closed
10 years ago
2
Add #add_to_queue
#24
tmaier
closed
10 years ago
2
Add referer and depth to Page if HTTP#fetch_pages has an error
#23
tmaier
closed
10 years ago
2