Closed jdespatis closed 5 years ago
Well, I also tried SpiderKeeper; its GUI was working, but it wouldn't send the egg to scrapyd. I fixed the issue by forcing scrapyd to use the exact same Python version as my scrapers, and SpiderKeeper now works completely.
And as a result, scrapydweb also works now, no more error 500.
I guess there's a problem with scrapydweb: it could at least show a working GUI even when scrapyd is badly configured. But everything works now :)
The key is that you should split this argument when passing it in: use --scrapyd_server scrapyd:6800 (two separate tokens) rather than --scrapyd_server=scrapyd:6800.
This works for me:
FROM python:3.6-jessie
ENV TZ="Europe/Paris"
WORKDIR /app
RUN pip install scrapydweb
# Copy the default settings into the working directory so scrapydweb picks them up
RUN cp /usr/local/lib/python3.6/site-packages/scrapydweb/default_settings.py /app/scrapydweb_settings_v7.py
EXPOSE 5000
# Note: --scrapyd_server and its value are passed as two separate tokens
CMD ["scrapydweb", "--disable_auth", "--disable_logparser", "--scrapyd_server", "IP-OF-YOUR-SCRAPYD-SERVER:6800"]
ubuntu@ubuntu:~/docker$ sudo docker build -t scrapydweb:latest .
ubuntu@ubuntu:~/docker$ sudo docker run -d -p 5000:5000 scrapydweb
1da5a344b172f5e2d22f8e34a2ba0733c26e4e87be39c266c3ecc9a34eb41802
ubuntu@ubuntu:~/docker$ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
1da5a344b172 scrapydweb "scrapydweb --disabl…" 16 seconds ago Up 15 seconds 0.0.0.0:5000->5000/tcp amazing_edison
ubuntu@ubuntu:~/docker$ sudo docker logs 1da
[2019-01-21 09:02:38,892] INFO in werkzeug: * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
[2019-01-21 09:03:31,004] INFO in werkzeug: 172.17.0.1 - - [21/Jan/2019 09:03:31] "GET / HTTP/1.1" 302 -
[2019-01-21 09:03:32,143] INFO in werkzeug: 172.17.0.1 - - [21/Jan/2019 09:03:32] "GET /1/dashboard/ HTTP/1.1" 200 -
[2019-01-21 09:03:32,660] INFO in werkzeug: 172.17.0.1 - - [21/Jan/2019 09:03:32] "GET /static/v110/css/style.css HTTP/1.1" 200 -
So, you were running another app on the same port 5000 when the error 500 was raised?
Not another app on the same port 5000. In my setup, everything runs under docker-compose, with each microservice in its own container.
Thanks for the settings, but I tried them yesterday and the problem was indeed the same.
Never mind, everything is working now; scrapyd is properly configured, thanks! I still need a few more pieces though: LogParser as another microservice, a reverse proxy in front of everything for auth / automatic SSL, etc.
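For reference, the docker-compose layout described above could look roughly like this. This is only a sketch, not the poster's actual file; the scrapyd image, service names, and ports are assumptions:

```yaml
# docker-compose.yml -- hypothetical sketch of the setup discussed in this thread
version: "3"
services:
  scrapyd:
    image: vimagick/scrapyd    # any scrapyd image pinned to the same Python version as your spiders
    ports:
      - "6800:6800"
  scrapydweb:
    build: .                   # the Dockerfile shown earlier in this thread
    ports:
      - "5000:5000"
    # The argument and its value are split into two tokens, as discussed above
    command: ["scrapydweb", "--disable_auth", "--scrapyd_server", "scrapyd:6800"]
    depends_on:
      - scrapyd
```

Within the compose network, the scrapydweb container can reach scrapyd by its service name ("scrapyd") instead of a hard-coded IP.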
I guessed you might be using docker-compose from the name 'scrapydweb_1'.
Actually, I was wondering why ScrapydWeb would raise the exception below. When the code reached line 98, it had fetched the page content from somewhere like 'http://127.0.0.1:6800/jobs', and everything should be working well.
@jdespatis You can also pass in the argument '--verbose' for troubleshooting if needed.
scrapydweb_1 | File "/usr/local/lib/python3.6/site-packages/scrapydweb/jobs/dashboard.py", line 98, in generate_response
scrapydweb_1 | _url_items = re.search(r"href='(.*?)'>", row['items']).group(1)
scrapydweb_1 | AttributeError: 'NoneType' object has no attribute 'group'
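The failure mode in the traceback is a classic one: re.search returns None when the pattern does not match, and calling .group(1) on None raises exactly this AttributeError. A minimal reproduction, with a defensive guard (the guard is a sketch of a possible fix, not scrapydweb's actual code; the page fragment is hypothetical):

```python
import re

# Hypothetical fragment of a Scrapyd /jobs row with no items link in it
row_items = "<td>no link here</td>"

# Same pattern as in dashboard.py line 98
match = re.search(r"href='(.*?)'>", row_items)

# match is None here, so match.group(1) would raise:
#   AttributeError: 'NoneType' object has no attribute 'group'
# Guarding against a failed match avoids the error 500:
url_items = match.group(1) if match else ""
print(repr(url_items))
```

The same pattern does extract the URL when the link is present, which is why the code works once scrapyd is configured with the right Python version and actually renders the items links.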
I'm testing scrapydweb in Docker, but it doesn't work; I must be missing something, I guess.
Indeed, I get an error 500: 'NoneType' object has no attribute 'group'.
Basically, here is my Dockerfile:
And here are the full logs of scrapydweb when I go to localhost:5000:
Any idea how to fix this?
Thanks