alephdata/memorious
Lightweight web scraping toolkit for documents and structured data.
https://docs.alephdata.org/developers/memorious
MIT License
311 stars, 59 forks
issues
FIX res is not an attr of MimeTypeRule
#85
moreymat
closed
4 years ago
0
ENH parse uses xpath() instead of findall()
#84
moreymat
closed
4 years ago
0
FIX store bis
#83
moreymat
closed
5 years ago
1
FIX guess file extension from MIME type
#82
moreymat
closed
5 years ago
4
Small fixes for queue management
#81
pudo
closed
5 years ago
1
Memorious 1.0.0
#80
sunu
closed
5 years ago
0
Use servicelayer workers
#79
sunu
closed
5 years ago
0
Installation instructions: /bin/bash absent from alpine
#78
SylvainLapoix
closed
5 years ago
2
Frequent database deadlock errors
#77
sunu
closed
4 years ago
1
Ability to start or cancel multiple crawlers at a time
#76
sunu
closed
3 years ago
1
Make sure crawler status actively updates periodically instead of relying on page reload
#75
sunu
closed
3 years ago
1
`memorious run` command never finishes
#74
alexmojaki
closed
5 years ago
3
Implement inline OCR again
#73
pudo
closed
5 years ago
0
Running a scraper in the example fails with an error when calling context.set_tag(tag, None)
#72
alexmojaki
closed
5 years ago
5
Task queuing and rate limiting using servicelayer
#71
sunu
closed
5 years ago
0
Why is `cleanup` removed?
#70
pohnean
closed
5 years ago
4
aleph_emit op now lives in alephclient. Remove it from here
#69
sunu
closed
5 years ago
0
Let other users add and run their own crawlers on our platform
#68
sunu
closed
3 years ago
1
Add user authentication and scraper namespacing to Memorious
#67
sunu
closed
3 years ago
1
Move aleph_emit operation into alephclient
#66
sunu
closed
5 years ago
0
Reference documents from structured data scrapes
#65
pudo
closed
3 years ago
1
Implement search/filtering for scrapers UI
#64
uhhhuh
closed
3 years ago
1
Use the OCR service through the ServiceLayer
#63
uhhhuh
closed
5 years ago
0
Document the `nested db` operation
#62
uhhhuh
opened
5 years ago
0
Implement Aleph bulk upload as an aggregation operation
#61
sunu
closed
4 years ago
7
Include mapping inside memorious crawler
#60
uhhhuh
closed
3 years ago
1
Crawler action button is inconsistent with the current state of the crawler
#59
sunu
closed
5 years ago
0
Adopt servicelayer for redis config
#58
pudo
closed
5 years ago
0
Use servicelayer
#57
pudo
closed
5 years ago
0
Crawler 'sample' mode
#56
rhiaro
closed
4 years ago
1
Run user defined methods/aggregator operations when a crawler has finished running
#55
sunu
closed
5 years ago
0
redis <3 to accommodate fakeredis requirement
#54
rhiaro
closed
5 years ago
0
WIP: emit FtM entities into balkhash buckets
#53
sunu
closed
5 years ago
1
Store and export ftm entities as JSON
#52
sunu
closed
5 years ago
1
Fix/example
#51
todorus
closed
6 years ago
1
Docker-compose example doesn't work
#50
todorus
closed
6 years ago
5
Fix unsupported tar.gz; add support for read mode
#49
uhhhuh
closed
6 years ago
0
Show a warning if in multi-threaded mode and the datastorage is sqlite
#48
uhhhuh
closed
3 years ago
1
Button on the ui to stop a running crawler
#46
sunu
closed
6 years ago
0
Get rid of Celery and use our own task runner backed by Redis as a queue
#45
sunu
closed
6 years ago
2
Don't depend on httpbin.org for testing
#44
sunu
opened
6 years ago
0
Allow Xpath queries returning text, not just elements, in built-in parse function
#43
Rinatius
closed
4 years ago
2
Use Redis for persistence
#42
sunu
closed
6 years ago
0
Consider moving results table to redis
#41
pudo
closed
6 years ago
1
Move information from tag table to redis
#40
pudo
closed
6 years ago
1
Add data validation helpers as part of context logic
#39
uhhhuh
closed
6 years ago
0
Persuade extract to handle FTS zips
#38
rhiaro
closed
6 years ago
0
Make it possible to view errors only from the latest run
#37
sunu
closed
5 years ago
1
Make crawler cleanup and reporting a bit more robust
#36
sunu
closed
6 years ago
0
Let reporting deal with redis
#35
danohu
closed
6 years ago
0