scrapinghub / frontera

A scalable frontier for web crawlers
BSD 3-Clause "New" or "Revised" License

Crawling strategy is moved to FrontierManager #331

sibiryakov closed this 6 years ago

sibiryakov commented 6 years ago

This change is mainly to ease local debugging of the crawling strategy: it avoids having to set up all the dependencies required for the distributed version.

codecov[bot] commented 6 years ago

Codecov Report

:exclamation: No coverage uploaded for pull request base (master@da87cbd). The diff coverage is 45.38%.


@@            Coverage Diff            @@
##             master     #331   +/-   ##
=========================================
  Coverage          ?   59.95%           
=========================================
  Files             ?       78           
  Lines             ?     6136           
  Branches          ?      809           
=========================================
  Hits              ?     3679           
  Misses            ?     2280           
  Partials          ?      177

| Impacted Files | Coverage Δ |
|---|---|
| frontera/core/models.py | 94.73% <ø> (ø) |
| frontera/worker/components/batch_generator.py | 57.44% <ø> (ø) |
| frontera/utils/graphs/data.py | 81.81% <ø> (ø) |
| ...ontera/contrib/messagebus/kafka/offsets_fetcher.py | 14.89% <ø> (ø) |
| frontera/utils/twisted_helpers.py | 78% <ø> (ø) |
| frontera/strategy/discovery/sitemap.py | 0% <0%> (ø) |
| frontera/strategy/depth.py | 0% <0%> (ø) |
| frontera/strategy/discovery/__init__.py | 0% <0%> (ø) |
| frontera/worker/components/scoring_consumer.py | 72.97% <0%> (ø) |
| frontera/contrib/backends/remote/messagebus.py | 80% <0%> (ø) |
| ... and 24 more | |


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update da87cbd...03a5f91.
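As a quick sanity check on the report, the 59.95% figure is consistent with hits divided by the sum of hits, misses, and partials (3679 + 2280 + 177 = 6136 lines). The sketch below reproduces it. The function name `coverage_percent` is hypothetical, and truncating to two decimals (rather than rounding to nearest) is an assumption made so that the result matches the reported value.

```python
import math


def coverage_percent(hits, misses, partials, places=2):
    """Return coverage as a percentage, truncated to `places` decimals.

    Hypothetical helper: truncation (not nearest rounding) is an
    assumption chosen to match the figure shown in the report.
    """
    total = hits + misses + partials
    if total == 0:
        raise ValueError("no lines to measure")
    pct = 100.0 * hits / total
    factor = 10 ** places
    return math.floor(pct * factor) / factor


# Figures taken from the coverage diff in the report.
project_coverage = coverage_percent(hits=3679, misses=2280, partials=177)
print(project_coverage)
```

Rounding to nearest would give 59.96 instead, so the reported value suggests the tool rounds down.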

sibiryakov commented 6 years ago

Thanks for the review!