adamreichold
/
umwelt-info
umwelt.info metadata index
https://umwelt.info
GNU Affero General Public License v3.0
1
stars
0
forks
issues
Disable usage of LZ4 compression in the index as our index will comfortably fit into memory for the foreseeable future.
#47
adamreichold
closed
2 years ago
1
Add randomized delay to harvester timer to reduce unintended load spikes at the sources.
#46
adamreichold
closed
2 years ago
0
Partition sources into active and inactive based on groups to enable multiple schedules.
#45
adamreichold
opened
2 years ago
2
Use the Accept header to perform content negotiation for the search and dataset routes.
#44
adamreichold
closed
2 years ago
0
Track licenses and collect metrics about them.
#43
adamreichold
closed
2 years ago
1
Enable parking_lot usage of tokio and once_cell as we already pull it in via scraper.
#42
adamreichold
closed
2 years ago
0
Raise an error if no handles could be parsed, e.g. if the document structure changed.
#41
adamreichold
closed
2 years ago
1
Add scraper for BfS' Doris.
#40
adamreichold
closed
2 years ago
0
Provide a Debian package to automate deployment.
#39
adamreichold
closed
2 years ago
0
Add initial README to simplify onboarding
#38
adamreichold
closed
2 years ago
0
Initial harvester for Wasser-DE
#37
jakobdeller
closed
2 years ago
2
Add a comment on why Source has a manual Debug impl.
#36
adamreichold
closed
2 years ago
0
Enable an end-to-end request timeout to avoid hanging the harvester on lost requests.
#35
adamreichold
closed
2 years ago
0
Store term positions to enable phrase searches.
#34
adamreichold
closed
2 years ago
1
Implement pagination of search results to enable viewing all of them.
#33
adamreichold
closed
2 years ago
0
Add xtask infrastructure to simplify local development workflow.
#32
adamreichold
closed
2 years ago
0
How to handle changes to index schema?
#31
adamreichold
closed
2 years ago
1
Generally enable a German-language stemmer on the index to increase recall.
#30
adamreichold
closed
2 years ago
0
Add GeoNetwork harvester based on the Q Search API endpoint.
#29
adamreichold
closed
2 years ago
3
Optimize binaries for Haswell or later to better utilize our server hardware.
#28
adamreichold
closed
2 years ago
0
Extend the harvester to collect summary metrics to be displayed by the server.
#27
adamreichold
closed
2 years ago
0
Extend list of authors to have alternative points of contact available.
#26
adamreichold
closed
2 years ago
0
Transmit a user agent header so that people know who to contact if we inconvenience them.
#25
adamreichold
closed
2 years ago
0
Ignore data folder to simplify local testing without danger of accidentally committing test data.
#24
adamreichold
closed
2 years ago
0
Identify duplicate datasets
#23
adamreichold
opened
2 years ago
0
Drop "dir" handle to avoid locking error on Windows
#22
jakobdeller
closed
2 years ago
0
Add a global limit to in-flight requests and load shedding to the server.
#21
adamreichold
closed
2 years ago
0
Extend the harvester to mail a summary of its activity to a configured address.
#20
adamreichold
closed
2 years ago
2
Parse the DCAT-AP.de license vocabulary to generate a build representation of common licenses.
#19
adamreichold
closed
2 years ago
2
Fold indexer into harvester
#18
adamreichold
closed
2 years ago
1
Send harvest summary via electronic mail
#17
adamreichold
closed
2 years ago
0
Use LZ4 to compress datasets as it is already part of the dependency closure.
#16
adamreichold
closed
2 years ago
1
Add harvester configuration to version control.
#15
adamreichold
closed
2 years ago
0
Keep previous snapshot of datasets around to enable manual rollbacks.
#14
adamreichold
closed
2 years ago
0
Use Rayon to load datasets for indexing using multiple threads.
#13
adamreichold
closed
2 years ago
0
Add with_retry helper function to handle spurious failures
#12
adamreichold
closed
2 years ago
1
Implement pagination of search results
#11
adamreichold
closed
2 years ago
0
Record per-dataset access statistics.
#10
adamreichold
closed
2 years ago
0
Retry requests during harvesting
#9
adamreichold
closed
2 years ago
0
Better logging of the distinction between the overall number of datasets and the actually transmitted datasets.
#8
adamreichold
closed
2 years ago
0
Explicitly handle duplicate dataset identifiers per source
#7
adamreichold
closed
2 years ago
1
Use Rayon to load datasets for indexing using multiple threads.
#6
adamreichold
closed
2 years ago
1
Reduce dependency closure and remove inefficient initial CKAN harvester
#5
adamreichold
closed
2 years ago
0
Add initial CSW harvester.
#4
adamreichold
closed
2 years ago
0
Use LZ4 compression for datasets
#3
adamreichold
closed
2 years ago
0
Add group setting to sources
#2
adamreichold
opened
2 years ago
1
Add an alternative code path to harvest CKAN sources
#1
adamreichold
closed
2 years ago
0