Redeci closed this issue 2 years ago
Thanks @Fleget for submitting this issue. May I know which OS version you are running?
Also, if you want to use it without any modification, you can use our Docker image: https://hub.docker.com/repository/docker/nusncl1/web-browsing-bot
Similar to this:
ncl@android-app-big:~$ docker run nusncl1/web-browsing-bot https://ncl.sg 0
2022-05-10 05:56:19 [scrapy.utils.log] INFO: Scrapy 2.2.0 started (bot: scrapybot)
2022-05-10 05:56:19 [scrapy.utils.log] INFO: Versions: lxml 4.5.2.0, libxml2 2.9.10, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 20.3.0, Python 3.8.4 (default, Jul 14 2020, 02:56:59) - [GCC 8.3.0], pyOpenSSL 19.1.0 (OpenSSL 1.1.1g 21 Apr 2020), cryptography 2.9.2, Platform Linux-4.15.0-173-generic-x86_64-with-glibc2.2.5
2022-05-10 05:56:19 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.epollreactor.EPollReactor
2022-05-10 05:56:19 [scrapy.crawler] INFO: Overridden settings:
{'DOWNLOAD_DELAY': 5,
'HTTPCACHE_ENABLED': True,
'LOG_LEVEL': 30,
'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'}
^C2022-05-10 05:56:23 [scrapy.utils.log] INFO: Scrapy 2.2.0 started (bot: scrapybot)
2022-05-10 05:56:23 [scrapy.utils.log] INFO: Versions: lxml 4.5.2.0, libxml2 2.9.10, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 20.3.0, Python 3.8.4 (default, Jul 14 2020, 02:56:59) - [GCC 8.3.0], pyOpenSSL 19.1.0 (OpenSSL 1.1.1g 21 Apr 2020), cryptography 2.9.2, Platform Linux-4.15.0-173-generic-x86_64-with-glibc2.2.5
I'm using Ubuntu 20.04 LTS
Hi @Fleget, this is fixed and merged. Thank you!
I tried to start web-browsing-bot by following the instructions in the README file. I was able to build a Docker image, but when I tried to run it with the docker run command, I received this error message:
I managed to find a solution on Stack Overflow, which consists of adding this code to scrapper.py, right after all the imports:
Now it works fine, but the use of CrawlerProcess in scrapper.py seems to be the underlying problem. The Stack Overflow question below notes that CrawlerRunner should be used instead to solve this issue:
https://stackoverflow.com/questions/71548957/twisted-internet-error-reactoralreadyinstallederror-reactor-already-installed
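For reference, here is a minimal sketch of what switching to CrawlerRunner could look like. It follows the pattern in the Scrapy documentation for running a spider from a script; it is not the code that was merged for this issue. The helper name `run_spider` and its parameters are hypothetical, and the spider class would be whichever one scrapper.py defines.

```python
def run_spider(spider_cls, settings=None, **spider_kwargs):
    """Run one spider with CrawlerRunner and block until it finishes.

    `spider_cls` is a placeholder for the project's own spider class;
    `settings` may be a dict of Scrapy settings or None for the defaults.
    """
    # Imported inside the function on purpose: importing
    # twisted.internet.reactor installs the default reactor, so the import
    # is deferred until the moment the crawl is actually started.
    from scrapy.crawler import CrawlerRunner
    from scrapy.utils.log import configure_logging
    from twisted.internet import reactor

    # CrawlerRunner does not set up logging on its own.
    configure_logging()

    runner = CrawlerRunner(settings)
    # crawl() returns a Deferred that fires when the spider is done.
    finished = runner.crawl(spider_cls, **spider_kwargs)
    # Stop the reactor whether the crawl succeeded or failed.
    finished.addBoth(lambda _: reactor.stop())
    # Blocks here until reactor.stop() is called above.
    reactor.run()
```

Unlike CrawlerProcess, CrawlerRunner leaves starting and stopping the reactor to the caller. A Twisted reactor cannot be restarted once stopped, so a helper like this should be called only once per process.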