Closed. amarynets closed this issue 7 years ago.
My guess is that it's point 3 that is confusing you: the /jobs html page. If the logs are there, scrapyd should be able to serve them. (Don't confuse jobs with logs.)
@Digenis
jobs_to_keep can't be turned off; it only falls back to a default.
By scrapyd log, I mean the daemon's log, not individual spider logs.
Here is a log entry from scrapyd:
2017-06-26T09:38:56+0000 [twisted.python.log#info] "127.0.0.1" - - [26/Jun/2017:09:38:56 +0000] "GET /logs/exa/tc/fe6d43485a5111e7a2a1f23c910a61ba.log HTTP/1.1" 404 145 "-" "Python-urllib/3.4"
And yes, jobs_to_keep only keeps logs for the last N jobs, not for all jobs. Is it possible to set jobs_to_keep to 1M?
Yes, you can set it to a high value. Was this the problem you were reporting?
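For reference, a minimal sketch of what that would look like in scrapyd.conf; the 1,000,000 value is just an illustration of "a high value", not a recommended setting:

```ini
[scrapyd]
# jobs_to_keep is the number of finished jobs (and their logs/items)
# kept per spider; raising it means the /jobs page and the logs
# directory grow accordingly
jobs_to_keep = 1000000
```

Restart the scrapyd daemon after changing the config for the new limit to take effect.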
Yes, it is. Thanks a lot!
OK, then it's not a bug; this is the way the /jobs endpoint is supposed to work. Perhaps we could add a notice at the end of the table saying "only the last N jobs are shown".
Hello, I have a question about log files. After some time, jobs disappear from the scrapyd server and I can't download the log from 127.0.0.1:6800/logs/spiders/mhn/79e8c970575511e7a52b742f68d0cfee.log, but the file exists in the logs folder. How can I solve this?