alephdata/memorious
Lightweight web scraping toolkit for documents and structured data.
https://docs.alephdata.org/developers/memorious
MIT License
311 stars, 59 forks
issues
#153 Adding documentation on how to create entities from memorious
#186
Rosencrantz
closed
2 years ago
0
Bump pantomime from 0.4.1 to 0.4.2
#185
dependabot[bot]
closed
2 years ago
0
Bump python-dateutil from 2.8.1 to 2.8.2
#184
dependabot[bot]
closed
3 years ago
1
Handle multiple queries in documentcloud_query operation
#183
sunu
closed
3 years ago
2
documentcloud operation should parse `publisher` document metadata and `aleph_emit` should be able to push it to Aleph
#182
sunu
closed
3 years ago
0
Retry to establish database and redis connection a few times before raising an error
#181
sunu
opened
3 years ago
0
Prevent the docker build/push from running on forks of memorious
#180
Rosencrantz
closed
3 years ago
2
Bump servicelayer[amazon,google] from 1.18.1 to 1.18.2
#179
dependabot[bot]
closed
3 years ago
2
Proposal: Add possibility to use ENV vars in yaml config
#178
simonwoerpel
closed
2 years ago
5
Fix weird filenames created during store stage
#177
simonwoerpel
closed
3 years ago
1
Rosencrantz/#153 media monitoring
#176
Rosencrantz
closed
2 years ago
3
ENH aleph_emit passes keywords
#175
moreymat
closed
3 years ago
0
Upgrade to GitHub-native Dependabot
#174
dependabot-preview[bot]
closed
3 years ago
0
Bump servicelayer[amazon,google] from 1.18.0 to 1.18.1
#173
dependabot-preview[bot]
closed
3 years ago
0
Introduce cleanup_archive method to delete a file from the archive after processing
#172
sunu
closed
3 years ago
0
Memorious 2.0
#171
sunu
closed
3 years ago
0
Fixes #168 update documentcloud to use new API
#170
Rosencrantz
closed
3 years ago
2
Option to run a multithreaded worker in sync mode.
#169
sunu
closed
3 years ago
0
documentcloud integration may need to be reviewed
#168
Rosencrantz
closed
3 years ago
4
WIP Rosencrantz/#153 media monitoring
#167
Rosencrantz
closed
3 years ago
3
Depth first execution in sync mode
#166
sunu
closed
3 years ago
1
Large crawlers fill up the queue and run out of memory when running in sync mode
#165
sunu
closed
3 years ago
1
Bump servicelayer[amazon,google] from 1.17.0 to 1.17.2
#164
dependabot-preview[bot]
closed
3 years ago
0
Bump servicelayer[amazon,google] from 1.17.0 to 1.17.1
#163
dependabot-preview[bot]
closed
3 years ago
1
Bump alpine from 3.13.0 to 3.13.2
#162
dependabot-preview[bot]
closed
3 years ago
1
Run a crawler from a yaml file
#161
sunu
closed
3 years ago
0
Bump alpine from 3.13.0 to 3.13.1
#160
dependabot-preview[bot]
closed
3 years ago
1
Base memorious on FtM, inline alephclient and fix WebDAV
#159
pudo
closed
3 years ago
0
Bump servicelayer[amazon,google] from 1.16.1 to 1.17.0
#158
dependabot-preview[bot]
closed
3 years ago
1
Bump pyyaml from 5.4 to 5.4.1
#157
dependabot-preview[bot]
closed
3 years ago
0
Bump pyyaml from 5.3.1 to 5.4
#156
dependabot-preview[bot]
closed
3 years ago
0
Bump alpine from 3.12 to 3.13.0
#155
dependabot-preview[bot]
closed
3 years ago
0
Indexing Atlassian Confluence
#154
pudo
opened
3 years ago
3
Support for media monitoring
#153
pudo
closed
2 years ago
1
Add file path information to dav_index
#152
uhhhuh
closed
3 years ago
0
Bump servicelayer[amazon,google] from 1.15.1 to 1.16.1
#151
dependabot-preview[bot]
closed
3 years ago
1
Make memorious run a crawler based on a yaml file
#150
sunu
closed
3 years ago
1
Structured logging
#149
sunu
closed
3 years ago
0
Correct a spelling mistake
#148
EdwardBetts
closed
3 years ago
1
make a copy of the data before emit when using fakeredis
#146
sunu
closed
3 years ago
0
Let aggregator methods to be defined as operations as well
#145
sunu
closed
3 years ago
0
Different behaviour between FakeRedis and real redis
#144
simonwoerpel
closed
3 years ago
2
Bump servicelayer[amazon,google] from 1.15.0 to 1.15.1
#143
dependabot-preview[bot]
closed
3 years ago
0
Scheduled crawler don't run.
#142
hsnamIT
closed
3 years ago
1
Add GitHub Action: ShiftLeft NextGen Static Analysis
#141
pudo
closed
3 years ago
0
Bump servicelayer[amazon,google] from 1.13.5 to 1.15.0
#140
dependabot-preview[bot]
closed
3 years ago
0
Dependabot couldn't authenticate with https://pypi.python.org/simple/
#139
dependabot-preview[bot]
closed
3 years ago
0
Bump servicelayer[amazon,google] from 1.13.5 to 1.14.1
#138
dependabot-preview[bot]
closed
3 years ago
1
Bump servicelayer[amazon,google] from 1.13.5 to 1.14.0
#137
dependabot-preview[bot]
closed
3 years ago
1
Bump pantomime from 0.4.0 to 0.4.1
#136
dependabot-preview[bot]
closed
3 years ago
0