postmodern/spidr
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
MIT License
798 stars, 109 forks
issues
Add `# frozen_string_literal: true` comments to all files
#89
postmodern
closed
6 months ago
0
Switch to using `require_relative` to improve load-times
#88
postmodern
closed
6 months ago
0
Add Ruby 3.2 to the CI matrix. Update checkout action version.
#87
petergoldstein
closed
1 year ago
0
Switch to `Addressable::URI` for URI parsing
#86
postmodern
opened
1 year ago
0
fix: use the correct status code for timedout
#85
davidsauntson
closed
2 years ago
1
How to control the depth of crawling?
#82
masterbo98
closed
2 years ago
1
Support passing a URI as a proxy setting
#81
postmodern
closed
2 years ago
0
Write specs for Agent.domain
#80
postmodern
closed
2 years ago
1
Add spec for `Spidr::Agent.host`
#79
postmodern
closed
2 years ago
0
Add spec for `Spidr::Agent.site`
#78
postmodern
closed
2 years ago
0
Add spec for `Spidr::Agent.start_at`
#77
postmodern
closed
2 years ago
0
Switch to using async-http
#76
postmodern
opened
2 years ago
1
Switch to Ruby 2.x keyword arguments
#75
postmodern
closed
2 years ago
0
Figure out why specs are failing only on JRuby?
#74
postmodern
closed
3 years ago
1
fixed typo in proxy_spec
#73
andydna
closed
2 years ago
0
Add Logging
#72
postmodern
opened
4 years ago
0
Thank you
#71
thegreyfellow
closed
6 months ago
1
Doc: Syntax highlight code blocks
#70
vfonic
closed
5 years ago
4
Sitemap XML support
#69
buren
opened
5 years ago
2
Simple Command Line Interface (CLI)
#68
buren
opened
6 years ago
3
Add support for img/@src
#67
lasssim
closed
6 years ago
1
path conflicts with opaque (URI::InvalidURIError)
#66
mustiikhalil
closed
6 years ago
3
Check for opaque part of URI before attempting to set the path
#65
kyaroch
closed
6 years ago
1
`ignore_links` not working.
#64
vwochnik
closed
6 years ago
4
Lambda based delay
#63
dcadenas
closed
3 years ago
1
Add low-level HTTP request methods
#62
postmodern
opened
7 years ago
0
Add ignore_paths and ignore_paths_like
#61
postmodern
opened
7 years ago
0
unable to ignore links
#60
vanegomez
closed
7 years ago
4
Limit crawl to links matching pattern
#59
bricemaurin
closed
7 years ago
3
Respect base tags
#58
ericmason
opened
7 years ago
0
Page#to_absolute raises URI::InvalidURIError: path conflicts with opaque
#57
buren
closed
6 years ago
7
Following redirects
#56
ZackMattor
opened
7 years ago
4
Remove unused variable from example code
#55
tricknotes
closed
7 years ago
0
Use Travis' new container-based infrastructure
#54
tricknotes
closed
5 years ago
0
Session handling
#53
heavysixer
closed
7 years ago
1
Fix warning instance variable @robots not initialized
#52
spk
closed
7 years ago
2
Fix shadowing outer local variable - key
#51
spk
closed
7 years ago
0
Remove assigned but unused variable host and port
#50
spk
closed
7 years ago
0
Skip processing of pages
#49
darkcode85
closed
7 years ago
1
Hack solution until https://github.com/bblimke/webmock/issues/642 is resolved
#48
JoshCheek
closed
7 years ago
1
Add default_headers option
#47
maccman
closed
8 years ago
1
Crawling a specific page
#46
justaj
closed
8 years ago
2
/../foo expands to just "foo"
#45
postmodern
closed
8 years ago
0
Use webmock and to_rack in specs
#44
postmodern
closed
2 years ago
1
Is there a way to set Accept-Encoding headers?
#43
robfuller
closed
2 years ago
9
How can I 'ignore everything except' a set of links
#42
DHarls17
closed
8 years ago
4
Is it possible to display only part of a spidered URL?
#41
DHarls17
closed
8 years ago
3
Any way to limit the total number of pages crawled or shut down the crawler after some criteria?
#40
samur-vonq
closed
8 years ago
2
Adds optional support for obeying robots.txt
#39
buren
closed
8 years ago
4
How to log in by submitting a form
#38
loyalpartner
closed
8 years ago
1