sangaline wayback-machine-scraper issues

sangaline / wayback-machine-scraper

A command-line utility and Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.

http://sangaline.com/post/wayback-machine-scraper/

ISC License

423 stars 74 forks

issues

Not scraping any page

#22 josylad opened 5 months ago
0
Fixed issues with wayback machine scraper

#21 anikafuloria opened 9 months ago
0
'ExecutionEngine' object has no attribute 'schedule'

#20 Yash-Vekaria opened 10 months ago
1
Error 429 + Scraper gives up

#19 avelican opened 1 year ago
2
Broken with Scrapy 2.x

#18 avelican opened 1 year ago
0
Save snapshots .html instead of .snapshot

#17 raphaelmerx opened 1 year ago
0
'wayback-machine-scraper' is not recognized as an internal or external command, operable program or batch file.

#16 FreeBSoD opened 3 years ago
2
snapshot functionality for a full site at a given time?

#15 DOSull opened 3 years ago
0
Escape output paths on Windows

#14 sangaline closed 3 years ago
0
[Question] How to get latest crawl?

#13 santoshbs opened 3 years ago
1
Import Error: No module named request

#12 philwild2 closed 3 years ago
2
Inspired by warrick ?

#11 sandrobilbeisi closed 3 years ago
1
Would it be possible to add a functionality to download a screenshot?

#10 alexgarciab closed 3 years ago
1
Improvements

#8 vvelikodny closed 5 years ago
0
Seems to be non functional

#7 bombledmonk closed 3 years ago
18
Error with setup

#6 mrme44 closed 3 years ago
2
Following image links

#5 ellyjonez opened 6 years ago
2
Crashes (includes fix)

#4 Cerno-b closed 3 years ago
1
How can I use this to get the number of times a site is crawled by the wayback?

#3 khantoocool closed 6 years ago
2
ImportError: cannot import name timezone

#2 dannymichel closed 6 years ago
2
Compatibility?

#1 ghost closed 6 years ago
3