inspirehep/hepcrawl
Scrapy project for feeds into INSPIRE-HEP
http://inspirehep.net
Other
17 stars, 30 forks
issues
#257 oaipmh: gracefully parse records (drjova, closed, 5 years ago, 1 comment)
#256 bump inspire schemas, dojson and utils version (vbalbp, closed, 5 years ago, 0 comments)
#255 parsers: add arXiv parser (vbalbp, closed, 5 years ago, 4 comments)
#254 Ksachs arxiv spider (vbalbp, closed, 5 years ago, 0 comments)
#253 loaders: avoid truncation of abstracts (vbalbp, closed, 5 years ago, 0 comments)
#252 general: bump inspire-schemas to v58 and dojson to v60 (ammirate, closed, 5 years ago, 0 comments)
#251 arXiv spider: collaborations (ksachs, closed, 5 years ago, 0 comments)
#250 hepcrawl: arxiv spider - bug fix collaborations (ksachs, closed, 5 years ago, 4 comments)
#249 docs: add info about formats, docker, etc. (szymonlopaciuk, closed, 6 years ago, 1 comment)
#248 Q: arXiv spider: collaborations (ksachs, closed, 5 years ago, 1 comment)
#247 spiders: add Crossref API spider (vbalbp, closed, 5 years ago, 0 comments)
#246 spiders: rename last_run_store module to lastrunstore_spider (david-caro, closed, 6 years ago, 0 comments)
#245 setup: bump inspire-dojson (michamos, closed, 6 years ago, 1 comment)
#244 789 requirements: fix amqp requirement, skip 2.3.0 (turtle321, closed, 6 years ago, 0 comments)
#243 tests: fix DoJSON exceptions (drjova, closed, 6 years ago, 2 comments)
#242 global: change the config for celery 4 (jacquerie, closed, 6 years ago, 1 comment)
#241 setup: bump celery to version ~4.0 (jacquerie, closed, 6 years ago, 1 comment)
#240 crawl-once: use isinstance instead of subclass (david-caro, closed, 6 years ago, 0 comments)
#239 This fixes crawlonce for a known file (david-caro, closed, 6 years ago, 1 comment)
#238 setup: bump raven version (vbalbp, closed, 6 years ago, 0 comments)
#237 parsers: add crossref api parser (vbalbp, closed, 6 years ago, 1 comment)
#236 tohep: populate raw_affiliations, not affiliations (jacquerie, closed, 6 years ago, 0 comments)
#235 utils: make traceback in ParsedItem a string instead of a list (ammirate, closed, 6 years ago, 0 comments)
#234 Link DOIs to preferred resolver (katrinleinweber, opened, 6 years ago, 0 comments)
#233 setup: remove unused responses dependency (jacquerie, closed, 6 years ago, 0 comments)
#232 settings: reorder middlewares (jacquerie, closed, 6 years ago, 0 comments)
#231 spiders: remove unused spiders (jacquerie, closed, 6 years ago, 4 comments)
#230 global: replace get_nested util with get_value (jacquerie, closed, 6 years ago, 0 comments)
#229 pipelines: return a crawl_result object instead of just the record (ammirate, closed, 6 years ago, 2 comments)
#228 utils: remove extra get_first (jacquerie, closed, 6 years ago, 0 comments)
#227 utils: use passive FTP by default (jacquerie, closed, 6 years ago, 0 comments)
#226 desy: make scraping error JSON serializable (jacquerie, closed, 6 years ago, 0 comments)
#225 utils.strict_kwargs: whitelist crawler_settings param (david-caro, closed, 6 years ago, 1 comment)
#224 travis: deploy only on one of the test suites (david-caro, closed, 6 years ago, 0 comments)
#223 desyspider: return also bad harvested data (Glignos, closed, 6 years ago, 0 comments)
#222 parsers: relax JATS parser date handling (szymonlopaciuk, opened, 6 years ago, 0 comments)
#221 introduce 'source' to spiders (szymonlopaciuk, closed, 6 years ago, 0 comments)
#220 misc: use self.logger in the spiders (szymonlopaciuk, closed, 6 years ago, 0 comments)
#219 utils: strictly check kwargs in spiders (#218) (szymonlopaciuk, closed, 6 years ago, 0 comments)
#218 Investigate how scrapy uses the `*args` and `**kwargs` in the `__init__` of the spider and see if we can add (david-caro, opened, 6 years ago, 0 comments)
#217 oaipmh_spider: allow harvesting single records (szymonlopaciuk, closed, 6 years ago, 0 comments)
#216 cds: use the OAI-PMH spider to harvest CDS (szymonlopaciuk, closed, 5 years ago, 1 comment)
#215 Avoid duplicated records from cross-set fetching (david-caro, closed, 6 years ago, 0 comments)
#214 oai: retrieve all the records at once (david-caro, closed, 6 years ago, 1 comment)
#213 setup: bump inspire-dojson to 58 (david-caro, closed, 6 years ago, 0 comments)
#212 OAI: allow harvesting single records (GetRecord oai verb) (david-caro, closed, 6 years ago, 0 comments)
#211 spiders: add oai spider (david-caro, closed, 6 years ago, 0 comments)
#210 testlibs: fix deep_sort for list case (chris-asl, closed, 6 years ago, 0 comments)
#209 parsers: create an NLM parser (szymonlopaciuk, opened, 6 years ago, 0 comments)
#208 setup: bump inspire-schemas~=57.0 (szymonlopaciuk, closed, 6 years ago, 1 comment)