scrapinghub / hcf-backend

Crawl Frontier HCF backend
BSD 3-Clause "New" or "Revised" License

Job settings aren't passed to jobs scheduled via HCFCrawlManager #23

Closed curita closed 1 year ago

curita commented 1 year ago

Issue

Jobs scheduled with HCFCrawlManager receive only the Frontera settings; no job settings are passed to them. Because HCFCrawlManager inherits from CrawlManager, the script still accepts the --job-settings argument, but its value is never used.
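To make the symptom concrete, here is a minimal sketch of what I think is happening. It is not the actual hcf-backend code: the function names and the setting keys (`EXAMPLE_FRONTIER_SLOT`, `EXAMPLE_SKIP_START_REQUESTS`) are hypothetical, and the choice to let Frontera settings win on key conflicts is an assumption of mine.

```python
def dropped_job_settings(frontera_settings, job_settings=None):
    """Hypothetical stand-in for the reported behaviour.

    job_settings is accepted as an argument but never used, so the
    scheduled job only ever sees the Frontera settings.
    """
    return dict(frontera_settings)


def merged_job_settings(frontera_settings, job_settings=None):
    """Hypothetical stand-in for the expected behaviour.

    User-supplied job settings are kept, and Frontera settings are
    applied on top (assumed precedence; the real fix may differ).
    """
    merged = dict(job_settings or {})
    merged.update(frontera_settings)
    return merged


# Placeholder settings, for illustration only.
frontera = {"EXAMPLE_FRONTIER_SLOT": "0"}
user = {"EXAMPLE_SKIP_START_REQUESTS": True}

dropped = dropped_job_settings(frontera, user)
merged = merged_job_settings(frontera, user)
```

With these placeholder values, `dropped` contains only the Frontera key, while `merged` contains both keys, which is what I would expect the scheduled job to receive.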

Reproduce

I used the MyArticlesGraphManager from https://github.com/scrapinghub/shub-workflow/wiki/Graph-Managers-with-HCF (adapted to my project). The scrapers task with the consumers didn't work as expected because the consumer_settings weren't passed to the consumer spiders. In that example, it meant that the start requests weren't skipped.

curita commented 1 year ago

I'm still checking, but this probably affects --spider-args and other arguments too.

kalessin commented 1 year ago

Hi @curita, I fixed the issue in https://github.com/scrapinghub/hcf-backend/commit/3ab1105763b0932dc8d5e89d9265523e54de85b1

I released version 0.5.2.1

curita commented 1 year ago

Thank you for the quick turnaround 🙇‍♀️