alephdata/memorious
Lightweight web scraping toolkit for documents and structured data.
https://docs.alephdata.org/developers/memorious
MIT License
309 stars
59 forks
issues
Let reporting deal with redis
#35
danohu
closed
6 years ago
0
Remove references to CrawlerManager.path
#34
danohu
closed
6 years ago
1
Handle storing nested dictionaries in db operation
#33
sunu
closed
6 years ago
0
db operation
#32
sunu
closed
6 years ago
1
Use redis for crawler operation logging
#31
sunu
closed
6 years ago
0
Handle archives and catch more errors in fetch
#30
sunu
closed
6 years ago
0
Send errors to Sentry if a DSN is set
#29
sunu
closed
6 years ago
0
Speed up the loading of crawler index page by caching crawler stats in a separate table
#28
sunu
closed
6 years ago
1
Tests and Python 3 support
#27
sunu
closed
6 years ago
2
Handle recursion error when run without queue
#26
sunu
closed
6 years ago
2
Make memorious run under Python 3
#25
pudo
closed
6 years ago
0
Handle recursion error when run without queue
#24
pudo
closed
6 years ago
0
Aleph2 API support
#23
pudo
closed
6 years ago
0
Add UI screenshot to README
#22
patcon
closed
6 years ago
1
More coherent helpers for search results
#21
rhiaro
opened
6 years ago
2
Generalise datastore
#20
rhiaro
closed
6 years ago
1
Use context params for aleph metadata
#19
rhiaro
closed
6 years ago
1
Make sure all methods append to `data`
#18
rhiaro
closed
6 years ago
0
Build app-level rate limiting
#17
pudo
closed
7 years ago
0
Create dev mode defaults for datastore
#16
pudo
closed
7 years ago
0
Memorious admin user interface
#15
pudo
closed
7 years ago
1
Make tags with expiration
#14
pudo
closed
7 years ago
0
OCR helper function
#13
pudo
closed
5 years ago
1
HTTP basic login (Session config) operation
#12
pudo
closed
7 years ago
0
Handle indexing of documents with parent/child relationship in Aleph_emit
#11
pudo
closed
6 years ago
0
Parse: narrow down where in the DOM to look for links
#10
rhiaro
closed
6 years ago
0
Don't fail if one crawler has bugs
#9
rhiaro
closed
7 years ago
0
Remove postgres dependency
#8
rhiaro
closed
7 years ago
3
Load new crawlers without a restart
#7
rhiaro
closed
6 years ago
5
Introduce database migrations
#6
pudo
closed
7 years ago
0
Flush all data generated by a crawler
#5
pudo
closed
7 years ago
0
Document making a basic crawler
#4
pudo
closed
6 years ago
0
Data validation stage
#3
pudo
closed
5 years ago
0
Make crawler discovery and configuration easier
#2
pudo
closed
7 years ago
1
Dockerize the tool
#1
pudo
closed
7 years ago
0