scrapy / scrapyd

A service daemon to run Scrapy spiders
https://scrapyd.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

The log generated by scrapyd is different from the log file generated in scrapy #373

Closed: moonlighf closed this issue 3 years ago

moonlighf commented 4 years ago

I have multiple different spiders in a Scrapy project, and I want to write their logs to different files. I have overridden custom_settings in each spider, as in the following code:

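import logging
import time

from scrapy.utils.project import get_project_settings

# Read the project settings so the spider can reuse LOG_ABS_PATH and LOG_FORMAT.
settings = get_project_settings()
today = time.strftime("%Y-%m-%d", time.localtime())
# Per-spider overrides: each spider writes to its own dated log file.
custom_settings = {
    'LOG_ENABLED': True,
    'LOG_FILE': settings.get("LOG_ABS_PATH") + '/logs/Index_' + today + '.log',
    'LOG_FORMAT': settings.get("LOG_FORMAT"),
    'LOG_LEVEL': logging.INFO,
    'LOG_STDOUT': True
}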

This does write the logs to different files, but the Scrapyd log does not contain all of the output. It shows only the following lines:

2020-04-24 21:02:21 - /home/work/anaconda3/lib/python3.6/site-packages/scrapy/utils/log.py[line:146] - INFO: Scrapy 1.5.0 started (bot: MediaIndex)
2020-04-24 21:02:21 - /home/work/anaconda3/lib/python3.6/site-packages/scrapy/utils/log.py[line:149] - INFO: Versions: lxml 3.7.2.0, libxml2 2.9.3, cssselect 1.0.3, parsel 1.5.1, w3lib 1.20.0, Twisted 17.9.0, Python 3.6.3 |Anaconda custom (64-bit)| (default, Nov  9 2017, 00:19:18) - [GCC 7.2.0], pyOpenSSL 17.2.0 (OpenSSL 1.0.2p  14 Aug 2018), cryptography 2.0.3, Platform Linux-3.18.6-2.el7.centos.x86_64-x86_64-with-centos-7.3.1611-Core
2020-04-24 21:02:21 - /home/work/anaconda3/lib/python3.6/site-packages/scrapy/crawler.py[line:38] - INFO: Overridden settings: {'BOT_NAME': 'MediaIndex', 'CONCURRENT_REQUESTS': 32, 'DOWNLOAD_DELAY': 2, 'LOG_FILE': '/home/work/fuzheng/09.Media_Index/MediaIndex/MediaIndex/logs/TT_Index_2020-04-24.log', 'LOG_FORMAT': '%(asctime)s - %(pathname)s[line:%(lineno)d] - %(levelname)s: %(message)s', 'LOG_LEVEL': 20, 'LOG_STDOUT': True, 'NEWSPIDER_MODULE': 'MediaIndex.spiders', 'SPIDER_MODULES': ['MediaIndex.spiders']}

However, much more information is written to the Scrapy log file. What should I do so that I can see all of that output in the Scrapyd log as well?

jpmckinney commented 3 years ago

'LOG_STDOUT': True means logging to standard output: that is, not to a log file. Set it to False, and Scrapyd will by default write each job's log to a separate file under the configured logs_dir.
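
For illustration, here is a minimal sketch of what that suggestion could look like in a spider's settings. The helper names `scrapyd_friendly_settings` and `expected_scrapyd_log`, the spider name `index`, and the job id `abc123` are hypothetical and not part of Scrapy or Scrapyd. The sketch also leaves out LOG_FILE, on the assumption that Scrapyd should be left to choose the file location, and it assumes Scrapyd's usual per-job layout of project, spider, and job id under logs_dir.

```python
import logging
from pathlib import Path


def scrapyd_friendly_settings(log_format: str) -> dict:
    """Hypothetical helper: per-spider overrides for a spider run by Scrapyd.

    LOG_STDOUT is False as suggested above. LOG_FILE is deliberately
    omitted (an assumption of this sketch) so that Scrapyd decides
    where the job log is written.
    """
    return {
        "LOG_ENABLED": True,
        "LOG_FORMAT": log_format,
        "LOG_LEVEL": logging.INFO,
        "LOG_STDOUT": False,
    }


def expected_scrapyd_log(logs_dir: str, project: str, spider: str, job_id: str) -> Path:
    """Hypothetical helper: where a job log would be expected to appear.

    Assumes the layout <logs_dir>/<project>/<spider>/<job_id>.log.
    """
    return Path(logs_dir) / project / spider / f"{job_id}.log"


if __name__ == "__main__":
    fmt = "%(asctime)s - %(pathname)s[line:%(lineno)d] - %(levelname)s: %(message)s"
    print(scrapyd_friendly_settings(fmt))
    print(expected_scrapyd_log("logs", "MediaIndex", "index", "abc123"))
```

In a spider, the returned dictionary would be assigned to `custom_settings`, and the second helper only indicates where to look for the job log after scheduling a run.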