Tobeyforce opened this issue 3 years ago (status: Open)
History log:

```
[2021-04-08 16:20:05,034] WARNING in apscheduler: Fail to execute task #1 (upplandsbrohus sthlm 10min - edit) on node 1, would retry later: Request got {'status_code': 401, 'status': 'error', 'message': "<script>alert('Fail to login: basic auth for ScrapydWeb has been enabled');</script>"}
[2021-04-08 16:20:08,039] ERROR in apscheduler: Fail to execute task #1 (upplandsbrohus sthlm 10min - edit) on node 1, no more retries: Traceback (most recent call last):
  File "/var/www/html/scrapydweb/views/operations/execute_task.py", line 89, in schedule_task
    assert js['status_code'] == 200 and js['status'] == 'ok', "Request got %s" % js
AssertionError: Request got {'status_code': 401, 'status': 'error', 'message': "<script>alert('Fail to login: basic auth for ScrapydWeb has been enabled');</script>"}
[2021-04-08 16:20:40,519] WARNING in apscheduler: Shutting down the scheduler for timer tasks gracefully, wait until all currently executing tasks are finished
[2021-04-08 16:20:40,521] WARNING in apscheduler: The main pid is 1267. Kill it manually if you don't want to wait
```
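For what it's worth, the traceback boils down to a simple status check: anything other than a 200/`'ok'` JSON pair trips the assertion in `schedule_task`. Replaying it with the exact payload from the log:

```python
# The JSON payload from the log above, as the scheduler sees it.
js = {
    'status_code': 401,
    'status': 'error',
    'message': "<script>alert('Fail to login: basic auth for ScrapydWeb has been enabled');</script>",
}

try:
    # The check from execute_task.py: anything but 200/'ok' is a failure.
    assert js['status_code'] == 200 and js['status'] == 'ok', "Request got %s" % js
except AssertionError as e:
    print("AssertionError:", e)  # mirrors the ERROR entry in the log
```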
Unfortunately, running ScrapydWeb behind gunicorn and nginx has created all kinds of problems for me; I hope you one day add an official way to deploy ScrapydWeb so that we don't have to invent workarounds :( Without a prod server I've never had these issues, so I know it would work otherwise.
My understanding is that each request goes through a middleware in run.py:

```python
@app.before_request
def require_login():
    if app.config.get('ENABLE_AUTH', False):
        auth = request.authorization
        USERNAME = str(app.config.get('USERNAME', ''))  # May be 0 from config file
        PASSWORD = str(app.config.get('PASSWORD', ''))
        if not auth or not (auth.username == USERNAME and auth.password == PASSWORD):
            return authenticate()
```
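For context, `request.authorization` is Flask's parse of the HTTP `Authorization: Basic <base64(user:pass)>` header, so whatever issues the internal request would need to attach exactly that header to get past the check. A stdlib-only sketch of building and decoding such a header (the credentials are placeholders):

```python
import base64

def build_basic_auth_header(username: str, password: str) -> str:
    """Build the Authorization header value that HTTP basic auth expects."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"

def parse_basic_auth_header(value: str):
    """Reverse direction: recover (username, password) the way a server would."""
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "basic":
        return None  # not a basic-auth header
    decoded = base64.b64decode(token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password

header = build_basic_auth_header("admin", "secret")
print(header)                           # Basic YWRtaW46c2VjcmV0
print(parse_basic_auth_header(header))  # ('admin', 'secret')
```

A request that carries this header would satisfy the `auth.username == USERNAME and auth.password == PASSWORD` comparison above; the log shows the scheduler's request arrives without it.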
My only workaround so far has been to change this check...
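For illustration only (this is not the actual ScrapydWeb code, and the function name and config keys are made up): one shape such a tweak could take is exempting loopback callers, sketched here as a plain function so the decision logic is testable on its own:

```python
def should_allow(remote_addr, auth, config):
    """Sketch of the require_login decision with a loopback exemption.

    auth is either None (no Authorization header) or a (username, password)
    tuple; config mimics app.config.
    """
    if not config.get("ENABLE_AUTH", False):
        return True  # auth disabled: everything passes
    if remote_addr in ("127.0.0.1", "::1"):
        return True  # illustrative tweak: trust the scheduler's own local calls
    if auth is None:
        return False  # would trigger the 401 from authenticate()
    username = str(config.get("USERNAME", ""))
    password = str(config.get("PASSWORD", ""))
    return auth == (username, password)

cfg = {"ENABLE_AUTH": True, "USERNAME": "admin", "PASSWORD": "secret"}
print(should_allow("127.0.0.1", None, cfg))               # True: loopback exempt
print(should_allow("10.0.0.5", None, cfg))                # False: 401 path
print(should_allow("10.0.0.5", ("admin", "secret"), cfg))  # True: correct creds
```

One caveat: behind an nginx reverse proxy, gunicorn often sees every client as the proxy's own 127.0.0.1, which would effectively disable auth for everyone, so attaching the credentials to the internal request is probably the sounder direction.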
Could you debug with the following steps first?
When auth is enabled, my timer tasks stop working. The response visible in the task result is the 401 error shown in the log above. So Scrapyd is trying to send a request to ScrapydWeb, but with auth enabled, ScrapydWeb expects the basic auth credentials, which Scrapyd does not add to the request headers. Is there any way to fix this? It's worth mentioning that I have deployed ScrapydWeb with gunicorn and nginx.
Any advice would be helpful.