Open Neutrino3316 opened 5 years ago
pyspider:latest (sha256:c702f13456789a67f5abe06e37d2f38e4175010dc073c486fa646aca53dc612f)

What I did before the bug started

    # mongo
    docker run --name mongo_pyspider -d -p 27017:27017 mongo:latest
    # rabbitmq
    docker run --name rabbitmq_pyspider -d rabbitmq:latest
    # phantomjs
    docker run --name pyspider_phantomjs -d binux/pyspider:latest phantomjs
    # result worker
    docker run --name pyspider_result_worker -d --link mongo_pyspider:mongo --link rabbitmq_pyspider:rabbitmq binux/pyspider:latest result_worker
    # processor, run multiple instance if needed.
    docker run --name pyspider_processor -d --link mongo_pyspider:mongo --link rabbitmq_pyspider:rabbitmq binux/pyspider:latest processor
    # fetcher, run multiple instance if needed.
    docker run --name pyspider_fetcher -d --link pyspider_phantomjs:phantomjs --link rabbitmq_pyspider:rabbitmq binux/pyspider:latest fetcher
    # scheduler
    docker run --name pyspider_scheduler -d --link mongo_pyspider:mongo --link rabbitmq_pyspider:rabbitmq binux/pyspider:latest scheduler
    # webui
    docker run --name pyspider_webui -d -p 5001:5000 --link mongo_pyspider:mongo --link rabbitmq_pyspider:rabbitmq --link pyspider_scheduler:scheduler --link pyspider_phantomjs:phantomjs binux/pyspider:latest webui
I used the docker run commands above to start the containers. All containers are running normally and everything seems to be OK.
The webui is running. I created a new project named "test" and pasted in the following project code:
    from pyspider.libs.base_handler import *


    class Handler(BaseHandler):
        crawl_config = {
        }

        @every(minutes=24 * 60)
        def on_start(self):
            self.crawl('http://scrapy.org/', callback=self.index_page)

        @config(age=10 * 24 * 60 * 60)
        def index_page(self, response):
            for each in response.doc('a[href^="http"]').items():
                self.crawl(each.attr.href, callback=self.detail_page)

        def detail_page(self, response):
            return {
                "url": response.url,
                "title": response.doc('title').text(),
            }
Then I went back to the pyspider dashboard, set the project status to DEBUG or RUNNING, and ran the project.
Expected behavior

The project should be running and fetching data.
Actual behavior

The RUN button in the dashboard turned red when I clicked it.
The project isn't running. All pyspider modules seem to be fine except the scheduler.
The scheduler reports that the project is unknown. Its log is as follows:
    [I 190825 05:53:47 scheduler:647] scheduler starting...
    [I 190825 05:53:47 scheduler:782] scheduler.xmlrpc listening on 0.0.0.0:23333
    [I 190825 05:53:48 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
    [E 190825 05:54:11 scheduler:306] unknown project: test
    [I 190825 05:54:48 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
    [I 190825 05:55:48 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
    [I 190825 05:56:48 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
    [I 190825 05:57:48 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
    [I 190825 05:58:48 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
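For anyone trying to reproduce this, a small script can pull the error lines out of the scheduler log (for example, from the output of `docker logs pyspider_scheduler`). This is only a sketch: the regular expression is inferred from the log lines quoted in this report, and `find_errors` is a hypothetical helper, not part of pyspider.

```python
import re

# Pattern inferred from lines such as
# "[E 190825 05:54:11 scheduler:306] unknown project: test".
# Assumption: every log line follows this "[LEVEL date time module:line] message" shape.
LOG_LINE = re.compile(
    r"\[(?P<level>[A-Z]) (?P<date>\d{6}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<module>\w+):(?P<lineno>\d+)\] (?P<message>.*)"
)


def find_errors(log_text):
    """Return (time, module, message) for every error-level ("E") log line."""
    errors = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "E":
            errors.append(
                (match.group("time"), match.group("module"), match.group("message"))
            )
    return errors


sample = """\
[I 190825 05:53:47 scheduler:647] scheduler starting...
[I 190825 05:53:48 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
[E 190825 05:54:11 scheduler:306] unknown project: test
"""

for time, module, message in find_errors(sample):
    print(time, module, message)
```

Run against the log above, the only error-level entry is the "unknown project: test" line from the scheduler.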
Some clues

This bug doesn't happen if I use mysql instead of mongo. Am I connecting to the mongo database correctly? The key commands are the following:
    # mongo
    docker run --name mongo_pyspider -d -p 27017:27017 mongo:latest
    # scheduler
    docker run --name pyspider_scheduler -d --link mongo_pyspider:mongo --link rabbitmq_pyspider:rabbitmq binux/pyspider:latest scheduler
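One thing that may be worth checking is what the `--link` alias actually exposes inside the container. Docker's legacy links export environment variables prefixed with the upper-cased alias, so `--link mongo_pyspider:mongo` produces `MONGO_NAME`, `MONGO_PORT_27017_TCP_ADDR` and so on. The sketch below builds pyspider-style MongoDB URLs from such variables. `mongo_urls_from_link` is a hypothetical helper, and whether the pyspider image detects the database from these variables, and under which prefix, is an assumption to verify against its source. If the image looks for a different prefix than the alias used here, it would not see the link at all.

```python
import os


def mongo_urls_from_link(alias, environ=None, port=27017):
    """Build taskdb/projectdb/resultdb URLs from Docker legacy link variables.

    Returns None when no link with this alias is visible in the environment.
    Hypothetical helper for illustration; not part of pyspider.
    """
    environ = os.environ if environ is None else environ
    prefix = alias.upper()  # Docker upper-cases the alias for variable names
    if f"{prefix}_NAME" not in environ:
        return None
    host = environ[f"{prefix}_PORT_{port}_TCP_ADDR"]
    tcp_port = environ[f"{prefix}_PORT_{port}_TCP_PORT"]
    return {
        db: f"mongodb+{db}://{host}:{tcp_port}/{db}"
        for db in ("taskdb", "projectdb", "resultdb")
    }


# Simulated environment for a container started with --link mongo_pyspider:mongo.
# The address is a made-up example value.
fake_env = {
    "MONGO_NAME": "/pyspider_scheduler/mongo",
    "MONGO_PORT_27017_TCP_ADDR": "172.17.0.2",
    "MONGO_PORT_27017_TCP_PORT": "27017",
}

print(mongo_urls_from_link("mongo", fake_env))    # URLs built from MONGO_* variables
print(mongo_urls_from_link("mongodb", fake_env))  # None: no MONGODB_* variables exist
```

If auto-detection turns out to be the problem, passing the resulting URLs explicitly through pyspider's `--taskdb`, `--projectdb` and `--resultdb` options on every component would make all of them use the same database regardless of the alias.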
I have the same problem when I use mysql as well.