buzzbangorg/bsbang-crawler
Alpha project for crawling bioschemas JSON-LD
Apache License 2.0
4 stars, 5 forks
issues
#18 Investigate parallel Bioschemas Scrapy work by Federico (justinccdev, opened 6 years ago, 0 comments)
#17 Crawling JavaScript inserted json-ld (XiangpengHao, opened 6 years ago, 0 comments)
#16 Master (justinccdev, closed 6 years ago, 0 comments)
#15 Ids for indexing docs in Solr (innovationchef, opened 6 years ago, 3 comments)
#14 Slow indexing in Solr (innovationchef, opened 6 years ago, 1 comment)
#13 minor fixes for last PR 10 (innovationchef, closed 6 years ago, 0 comments)
#12 Added script to download the crawl (innovationchef, closed 6 years ago, 2 comments)
#11 add circleci config (XiangpengHao, closed 6 years ago, 1 comment)
#10 Set solr endpoint manually (innovationchef, closed 6 years ago, 1 comment)
#9 added corresponding commands for easier setup. (aswanipranjal, closed 6 years ago, 0 comments)
#8 Setup: add corresponding parameters to the commands for easier setup (aswanipranjal, closed 6 years ago, 5 comments)
#7 Crawling pages where json-ld is inserted via Javascript (justinccdev, opened 6 years ago, 6 comments)
#6 Make solr endpoint a configurable option rather than hard-coded (justinccdev, closed 6 years ago, 3 comments)
#5 Look at replacing most of crawler with an external crawling package (justinccdev, opened 6 years ago, 9 comments)
#4 Process crawled JSON-LD to multiple levels, possibly using another library (justinccdev, opened 6 years ago, 3 comments)
#3 Improve schema properties configuration mechanism (justinccdev, opened 6 years ago, 2 comments)
#2 set up bsbang-crawler continuous integration (justinccdev, opened 6 years ago, 0 comments)
#1 Provide a way to download the crawl (justinccdev, opened 7 years ago, 8 comments)