sathish316/scrapify
ScrApify is a library for building APIs by scraping static sites and using the data as models or JSON APIs. It powers APIfy, which creates JSON APIs from any HTML or Wikipedia page.
http://apify.heroku.com/resources
143 stars, 16 forks
issues
Convert scraped HTML content to XML
#31
sathish316
closed
11 years ago
0
Add an #xml method that switches between Nokogiri parsers
#30
jalada
closed
3 years ago
4
pagination using array of pages
#29
sathish316
opened
12 years ago
0
Download media (mp4, pdf etc) using crawler
#28
sathish316
closed
11 years ago
0
Attribute data types
#27
sathish316
opened
12 years ago
0
Extract category attribute
#26
sathish316
closed
11 years ago
0
Replace <br> tags in content with newline
#25
sathish316
closed
12 years ago
1
Export crawled content to sql database
#24
sathish316
closed
11 years ago
0
Export crawled content to sqlite db
#23
sathish316
closed
11 years ago
0
Export crawled content to csv
#22
sathish316
closed
11 years ago
0
find by id should crawl detailed content
#21
sathish316
closed
11 years ago
0
Support for Login with session
#20
sathish316
opened
12 years ago
0
Support for Basic Authentication
#19
sathish316
opened
12 years ago
2
Extract content loaded using ajax or javascript
#18
sathish316
closed
11 years ago
0
pagination using next page selector
#17
sathish316
opened
12 years ago
0
pagination using placeholder for page and range/array of pages
#16
sathish316
opened
12 years ago
0
find all with pagination
#15
sathish316
closed
11 years ago
0
conditions with regex like
#14
sathish316
closed
11 years ago
0
conditions with <,=,> operators
#13
sathish316
closed
11 years ago
0
find all with conditions
#12
sathish316
opened
12 years ago
0
#5 | Page object instead of Class including Scrapify::Base
#11
kalarani
closed
12 years ago
1
Accept blocks while defining attributes
#10
kalarani
closed
12 years ago
0
Tolerance to malformed XML
#9
franciscolourenco
opened
12 years ago
1
Use index as key
#8
sathish316
closed
11 years ago
0
Pagination using next page selector
#7
sathish316
closed
11 years ago
0
Support multiple html pages
#6
sathish316
closed
12 years ago
3
Page object instead of Class including Scrapify::Base
#5
sathish316
closed
12 years ago
0
Validate xpath selector syntax
#4
sathish316
closed
12 years ago
0
Validate css selector syntax
#3
sathish316
closed
12 years ago
0
Support arrays using parent and child selectors
#2
sathish316
closed
12 years ago
1
HTTP Cache headers for the response from the parent URI
#1
selvakn
closed
12 years ago
0