scrapinghub / frontera

A scalable frontier for web crawlers
BSD 3-Clause "New" or "Revised" License

Metadata is not saved to db in single process mode #396

Open Prometheus3375 opened 4 years ago

Prometheus3375 commented 4 years ago

Metadata is saved in distributed mode if a db worker is running without the --no-incoming flag. After I switched to single process mode, metadata is no longer saved, and I could not find any setting that enables it. Is this intended? According to the documentation, metadata should be saved.

Here you can find my project. I am using the discovery strategy. To add seeds, run python -m frontera.utils.add_seeds --config config.single --seeds-file seeds.txt. To start the crawl, run scrapy crawl spider.

sibiryakov commented 4 years ago

Hi, is the SQLite file created, and is anything written to it?

Prometheus3375 commented 4 years ago

@sibiryakov, yes, the queue, states and domain metadata tables are being filled with data. Only the metadata table always stays empty in single process mode.
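
For anyone who wants to reproduce this check, here is a minimal sketch that lists the row count of every table in an SQLite file using only Python's standard sqlite3 module. The helper name table_row_counts and the example file name are hypothetical and not part of Frontera. The actual database path depends on the SQLite URL in your own settings.

```python
import sqlite3


def table_row_counts(db_path):
    """Return {table_name: row_count} for every user table in an SQLite file.

    Hypothetical helper for illustration; it is not part of Frontera.
    """
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master lists all schema objects; skip SQLite's internal tables.
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier so unusual table names do not break the query.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
    finally:
        conn.close()
```

Calling table_row_counts("frontier.db") (the file name is an assumption) after a short crawl shows at a glance which tables remain empty.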