datalad / datalad-crawler

DataLad extension for tracking web resources as datasets
http://datalad.org

Crawling of stanford dataspace, and simple indexes #11

Closed (yarikoptic closed 3 years ago)

yarikoptic commented 5 years ago

Some failures are due to a bug somewhere in twisted or scrapy that raises a TypeError (attrib() got an unexpected keyword argument 'converter'). I recently observed the same error elsewhere, and I think it was resolved there via upgrades, so I am not sure what to do for Travis. Will fix up the rogue pdb now.

codecov-io commented 5 years ago

Codecov Report

Merging #11 into master will decrease coverage by 15.72%. The diff coverage is 74.6%.

@@             Coverage Diff             @@
##           master      #11       +/-   ##
===========================================
- Coverage   86.44%   70.71%   -15.73%     
===========================================
  Files          51       51               
  Lines        4130     4180       +50     
===========================================
- Hits         3570     2956      -614     
- Misses        560     1224      +664
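
For reference, the percentages in the diff above follow from hits divided by lines. A minimal Python sketch of that arithmetic is below. The helper name `coverage_pct` is hypothetical, and truncating (rather than rounding) to two decimals is an assumption inferred from the reported figures, not documented Codecov behavior.

```python
from decimal import Decimal, ROUND_DOWN


def coverage_pct(hits: int, lines: int) -> Decimal:
    """Return hits/lines as a percentage, truncated to two decimals.

    Truncation is an assumption chosen to reproduce the figures in the report.
    """
    ratio = Decimal(hits) * 100 / Decimal(lines)
    return ratio.quantize(Decimal("0.01"), rounding=ROUND_DOWN)


# Figures taken from the Coverage Diff table (Hits and Lines rows).
master = coverage_pct(3570, 4130)
pr = coverage_pct(2956, 4180)

print(master, pr, pr - master)  # 86.44 70.71 -15.73
```

Under this assumption, the -15.73% in the table is the difference of the two truncated values, while the 15.72% in the summary sentence is consistent with differencing the untruncated ratios (about 86.4407 and 70.7177).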
| Impacted Files | Coverage Δ | |
|---|---|---|
| datalad_crawler/nodes/crawl_url.py | 78.82% <100%> (-11.43%) | :arrow_down: |
| datalad_crawler/pipeline.py | 74.27% <100%> (-8.08%) | :arrow_down: |
| ...awler/pipelines/tests/test_simple_with_archives.py | 54.83% <26.66%> (-45.17%) | :arrow_down: |
| datalad_crawler/pipelines/simple_with_archives.py | 75.55% <75%> (-6.27%) | :arrow_down: |
| datalad_crawler/nodes/matches.py | 89.18% <94.11%> (+1.31%) | :arrow_up: |
| datalad_crawler/pipelines/tests/test_openfmri.py | 28.08% <0%> (-63.27%) | :arrow_down: |
| datalad_crawler/pipelines/balsa.py | 34.73% <0%> (-61.06%) | :arrow_down: |
| datalad_crawler/dbs/versions.py | 45.45% <0%> (-52.28%) | :arrow_down: |
| datalad_crawler/pipelines/tests/test_balsa.py | 51.78% <0%> (-48.22%) | :arrow_down: |
| datalad_crawler/nodes/annex.py | 47.09% <0%> (-34.58%) | :arrow_down: |
| ... and 14 more | | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 1684a9f...681e1b2.

yarikoptic commented 3 years ago

Old effort. IIRC it was working, but the datasets of interest were broken anyway (broken tarballs, IIRC). With no immediate need, it was abandoned. So let's let it RIP.