ArchiveTeam
/
NewsGrabber
Grabbing all news.
62
stars
32
forks
issues
The IRC bot should reply to both private chat and public channel if a user has given commands through private chat to the IRC bot
#50
Arkiver2
opened
8 years ago
0
Print a warning in the IRC bot channel if no `regex` or `urls` is given for a service file
#49
Arkiver2
opened
8 years ago
0
The IRC bot should reply to a command in the channel the command is given
#48
Arkiver2
closed
8 years ago
0
Support selenium for better video and application extraction
#47
Arkiver2
opened
8 years ago
1
Custom URL extraction for found URLs that are being grabbed
#46
Arkiver2
opened
8 years ago
0
Add standard list of regexes for videoURLs and liveURLs
#45
Arkiver2
closed
8 years ago
0
Prevent static page requisites from being regrabbed
#44
Arkiver2
opened
8 years ago
1
Add an option for special URL extraction rules for seed URLs of a service
#43
Arkiver2
opened
8 years ago
0
last_upload_* files are sometimes not saved well during a bad event like a crash
#42
Arkiver2
closed
7 years ago
2
Completely remove services dir when starting the scripts
#41
Arkiver2
opened
8 years ago
1
Do not crash on bad Python service files
#40
Arkiver2
opened
8 years ago
2
Grab seed URL in which new URL is found and the domain URL
#39
Arkiver2
opened
8 years ago
0
Automatically percent encode seed URLs
#38
Arkiver2
opened
8 years ago
1
5 wikidata links (including two newly created items)
#37
JesseWeinstein
closed
8 years ago
0
add CCTV中文国际 youtube channel
#36
espes
closed
8 years ago
0
Adding 17 Finnish newspapers
#35
ersi
closed
8 years ago
0
add cankaoxiaoxi.com
#34
espes
closed
8 years ago
2
add news24.com
#33
espes
closed
8 years ago
2
Yet more Wikidata links
#32
JesseWeinstein
closed
8 years ago
0
Added Coventry Observer
#31
djsmiley2k
closed
8 years ago
0
Add rawstory (no support for videos yet)
#30
JesseWeinstein
closed
8 years ago
1
More Wikidata links
#29
JesseWeinstein
closed
8 years ago
0
Wikidata links
#28
JesseWeinstein
closed
8 years ago
0
Wikidata links
#27
JesseWeinstein
closed
8 years ago
0
Add a basic .gitignore with generated files
#26
JesseWeinstein
closed
8 years ago
0
Add a HTML display of the services
#25
JesseWeinstein
closed
8 years ago
0
URLs from RSS sometimes end with one or multiple spaces
#24
Arkiver2
closed
8 years ago
1
Write new URLs to 'list' file and memory after !stop command
#23
Arkiver2
closed
8 years ago
1
Implement !start to undo !stop.
#22
Arkiver2
closed
8 years ago
1
Some URLs are regrabbed.
#21
Arkiver2
closed
8 years ago
1
Restart upload item count for WARCs if date of latest upload is not date of uploading items.
#20
Arkiver2
closed
8 years ago
1
Youtube-dl is sometimes not starting.
#19
Arkiver2
opened
8 years ago
2
Ready to merge
#18
djsmiley2k
closed
8 years ago
4
Split items on Internet Archive in 10 GB items
#17
Arkiver2
closed
8 years ago
1
Add Der Spiegel
#16
phuzion
closed
8 years ago
0
Add Akron Beacon Journal
#15
phuzion
closed
8 years ago
0
All ?? Mirror Trinity UK Sites
#14
djsmiley2k
closed
8 years ago
3
Support videos and liveblogs not indicated in the URL
#13
PressStartandSelect
opened
8 years ago
1
Support ignoreregex for each site
#12
PressStartandSelect
opened
8 years ago
0
Add Akron Beacon Journal
#11
phuzion
closed
8 years ago
1
Added coventry telegraph
#10
djsmiley2k
closed
8 years ago
1
Bot shouldn't crash if a badly named service file is committed
#9
phuzion
opened
8 years ago
1
Add Cleveland.com
#8
phuzion
closed
8 years ago
3
Remove service from old refresh list if refresh is changed.
#7
Arkiver2
closed
8 years ago
1
German-language newssites
#6
PromyLOPh
opened
8 years ago
2
Implement IRC bot for the terminal
#5
Arkiver2
closed
8 years ago
1
YouTube-DL has stopped working again
#4
HarryC145
closed
8 years ago
1
Are we logging timestamps into the WARC file name?
#3
HarryC145
closed
8 years ago
1
The Intercept
#2
joepie91
closed
8 years ago
1
added nrk.no
#1
atluxity
closed
8 years ago
1