scrapy / scrapyd

A service daemon to run Scrapy spiders
https://scrapyd.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

error when deploying !!! #231

Closed crisfan closed 2 years ago

crisfan commented 7 years ago

when I was deploying my project, there was an error that I couldn't resolve. Error message: {"status": "error", "message": "environment can only contain strings", "node_name": "XSOOY-PC"}

server message:

 Traceback (most recent call last):
          File "c:\program files (x86)\python27\lib\site-packages\twisted\web\http.py", line 1694, in allContentReceived
            req.requestReceived(command, path, version)
          File "c:\program files (x86)\python27\lib\site-packages\twisted\web\http.py", line 790, in requestReceived
            self.process()
          File "c:\program files (x86)\python27\lib\site-packages\twisted\web\server.py", line 189, in process
            self.render(resrc)
          File "c:\program files (x86)\python27\lib\site-packages\twisted\web\server.py", line 238, in render
            body = resrc.render(self)
        --- <exception caught here> ---
          File "c:\program files (x86)\python27\lib\site-packages\scrapyd\webservice.py", line 21, in render
            return JsonResource.render(self, txrequest).encode('utf-8')
          File "c:\program files (x86)\python27\lib\site-packages\scrapyd\utils.py", line 20, in render
            r = resource.Resource.render(self, txrequest)
          File "c:\program files (x86)\python27\lib\site-packages\twisted\web\resource.py", line 250, in render
            return m(request)
          File "c:\program files (x86)\python27\lib\site-packages\scrapyd\webservice.py", line 86, in render_POST
            spiders = get_spider_list(project, version=version)
          File "c:\program files (x86)\python27\lib\site-packages\scrapyd\utils.py", line 132, in get_spider_list
            proc = Popen(pargs, stdout=PIPE, stderr=PIPE, env=env)
          File "c:\program files (x86)\python27\lib\subprocess.py", line 390, in __init__
            errread, errwrite)
          File "c:\program files (x86)\python27\lib\subprocess.py", line 640, in _execute_child
            startupinfo)
        exceptions.TypeError: environment can only contain strings

what should I do?

Digenis commented 7 years ago

@crisfan it could be similar to #212. Do you have a version scheme configured in your scrapy.cfg?
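
(For context, a version scheme in scrapy.cfg looks something like the sketch below; the target and project names here are placeholders, and scrapyd-deploy also accepts the special values GIT and HG:)

[deploy:mytarget]
url = http://localhost:6800/
project = myproject
version = GIT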

crisfan commented 7 years ago

@Digenis, I don't think my problem is similar to #212. This is my scrapy.cfg:

[settings]
default = zzh.settings

[deploy:127]
url = http://localhost:6800/
project = zzh

When I deploy my project, this is the output I receive:

Packing version 1493965313
Deploying to project "zzh" in http://localhost:6800/addversion.json
Server response (200):
{"status": "error", "message": "environment can only contain strings", "node_name": "XSOOY-PC"}

One strange thing: when I deploy my project on Linux rather than Windows, it works fine!

crisfan commented 7 years ago

scrapyd error:

2017-05-05 14:21:55+0800 [HTTPChannel,3,127.0.0.1] Unhandled Error
        Traceback (most recent call last):
          File "c:\program files (x86)\python27\lib\site-packages\twisted\web\http.py", line 1694, in allContentReceived
            req.requestReceived(command, path, version)
          File "c:\program files (x86)\python27\lib\site-packages\twisted\web\http.py", line 790, in requestReceived
            self.process()
          File "c:\program files (x86)\python27\lib\site-packages\twisted\web\server.py", line 189, in process
            self.render(resrc)
          File "c:\program files (x86)\python27\lib\site-packages\twisted\web\server.py", line 238, in render
            body = resrc.render(self)
        --- <exception caught here> ---
          File "c:\program files (x86)\python27\lib\site-packages\scrapyd\webservice.py", line 21, in render
            return JsonResource.render(self, txrequest).encode('utf-8')
          File "c:\program files (x86)\python27\lib\site-packages\scrapyd\utils.py", line 20, in render
            r = resource.Resource.render(self, txrequest)
          File "c:\program files (x86)\python27\lib\site-packages\twisted\web\resource.py", line 250, in render
            return m(request)
          File "c:\program files (x86)\python27\lib\site-packages\scrapyd\webservice.py", line 86, in render_POST
            spiders = get_spider_list(project, version=version)
          File "c:\program files (x86)\python27\lib\site-packages\scrapyd\utils.py", line 132, in get_spider_list
            proc = Popen(pargs, stdout=PIPE, stderr=PIPE, env=env)
          File "c:\program files (x86)\python27\lib\subprocess.py", line 390, in __init__
            errread, errwrite)
          File "c:\program files (x86)\python27\lib\subprocess.py", line 640, in _execute_child
            startupinfo)
        exceptions.TypeError: environment can only contain strings

Digenis commented 7 years ago

@crisfan, can you start scrapyd in debug mode with -b and dump the env dictionary in get_spider_list?

zhengxs2018 commented 7 years ago

I have this problem too

# https://github.com/scrapy/scrapyd/blob/master/scrapyd/utils.py
print env
proc = Popen(pargs, stdout=PIPE, stderr=PIPE, env=env)

[two screenshots of the printed env output]

zhengxs2018 commented 7 years ago

Deploying with scrapyd v1.1 succeeds.

[screenshot of the successful deploy]

Digenis commented 7 years ago

@zhengxiansen, in scrapyd 1.2, please replace print env with print [[n, type(n), v, type(v)] for n, v in env.items()] and copy/paste the text output from your console (or your logfile) here. Please don't use screenshots; they are very impractical.

zhengxs2018 commented 7 years ago

oh, i'm sorry.

[
  [
    "TMP",
    "<type 'str'>",
    "C:\\Users\\slanxy\\AppData\\Local\\Temp",
    "<type 'str'>"
  ],
  [
    "PYTHONIOENCODING",
    "<type 'str'>",
    "UTF-8",
    "<type 'str'>"
  ],
  [
    "COMPUTERNAME",
    "<type 'str'>",
    "DESKTOP-N1N1AOV",
    "<type 'str'>"
  ],
  [
    "CONEMUANSI",
    "<type 'str'>",
    "ON",
    "<type 'str'>"
  ],
  [
    "USERDOMAIN",
    "<type 'str'>",
    "DESKTOP-N1N1AOV",
    "<type 'str'>"
  ],
  [
    "SCRAPY_PROJECT",
    "<type 'str'>",
    {
      "name": "club-spiders",
      "spiders": []
    },
    "<type 'dict'>"
  ],
  [
    "CONEMUARGS2",
    "<type 'str'>",
    "",
    "<type 'str'>"
  ],
  [
    "PSMODULEPATH",
    "<type 'str'>",
    "%ProgramFiles%\\WindowsPowerShell\\Modules;C:\\WINDOWS\\system32\\WindowsPowerShell\\v1.0\\Modules",
    "<type 'str'>"
  ],
  [
    "CMDER_START",
    "<type 'str'>",
    "E:\\PycharmProjects\\crawler\\littledonkey-crawler-server",
    "<type 'str'>"
  ],
  [
    "COMMONPROGRAMFILES",
    "<type 'str'>",
    "C:\\Program Files (x86)\\Common Files",
    "<type 'str'>"
  ],
  [
    "PROCESSOR_IDENTIFIER",
    "<type 'str'>",
    "Intel64 Family 6 Model 61 Stepping 4, GenuineIntel",
    "<type 'str'>"
  ],
  [
    "PROGRAMFILES",
    "<type 'str'>",
    "C:\\Program Files (x86)",
    "<type 'str'>"
  ],
  [
    "PROCESSOR_REVISION",
    "<type 'str'>",
    "3d04",
    "<type 'str'>"
  ],
  [
    "CONEMUPID",
    "<type 'str'>",
    "20668",
    "<type 'str'>"
  ],
  [
    "HOME",
    "<type 'str'>",
    "C:\\Users\\slanxy",
    "<type 'str'>"
  ],
  [
    "NUMBER_OF_PROCESSORS",
    "<type 'str'>",
    "4",
    "<type 'str'>"
  ],
  [
    "CONEMUHWND",
    "<type 'str'>",
    "0x02560BEC",
    "<type 'str'>"
  ],
  [
    "PROGRAMFILES(X86)",
    "<type 'str'>",
    "C:\\Program Files (x86)",
    "<type 'str'>"
  ],
  [
    "PATH",
    "<type 'str'>",
    "D:\\Program Files (x86)\\cmder\\bin;D:\\Program Files (x86)\\cmder\\vendor\\conemu-maximus5\\ConEmu\\Scripts;D:\\Program Files (x86)\\cmder\\vendor\\conemu-maximus5;D:\\Program Files (x86)\\cmder\\vendor\\conemu-maximus5\\ConEmu;D:\\Program Files (x86)\\python2\\;D:\\Program Files (x86)\\python2\\Scripts;C:\\WINDOWS\\system32;C:\\WINDOWS;C:\\WINDOWS\\System32\\Wbem;C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\;C:\\Program Files\\Git\\cmd;D:\\Program Files\\nodejs\\;C:\\Users\\slanxy\\AppData\\Local\\Microsoft\\WindowsApps;C:\\Users\\slanxy\\AppData\\Roaming\\npm;d:\\Program Files (x86)\\Microsoft VS Code\\bin;C:\\Program Files\\Git\\usr\\bin;C:\\Program Files\\Git\\usr\\share\\vim\\vim74;D:\\Program Files (x86)\\cmder\\",
    "<type 'str'>"
  ],
  [
    "TERM",
    "<type 'str'>",
    "cygwin",
    "<type 'str'>"
  ],
  [
    "ANSICON_DEF",
    "<type 'str'>",
    "7",
    "<type 'str'>"
  ],
  [
    "TEMP",
    "<type 'str'>",
    "C:\\Users\\slanxy\\AppData\\Local\\Temp",
    "<type 'str'>"
  ],
  [
    "PLINK_PROTOCOL",
    "<type 'str'>",
    "ssh",
    "<type 'str'>"
  ],
  [
    "CONEMUHOOKS",
    "<type 'str'>",
    "Enabled",
    "<type 'str'>"
  ],
  [
    "PUBLIC",
    "<type 'str'>",
    "C:\\Users\\Public",
    "<type 'str'>"
  ],
  [
    "PROCESSOR_ARCHITECTURE",
    "<type 'str'>",
    "x86",
    "<type 'str'>"
  ],
  [
    "CONEMUDIR",
    "<type 'str'>",
    "D:\\Program Files (x86)\\cmder\\vendor\\conemu-maximus5",
    "<type 'str'>"
  ],
  [
    "CONEMUDRIVE",
    "<type 'str'>",
    "D:",
    "<type 'str'>"
  ],
  [
    "SYSTEMROOT",
    "<type 'str'>",
    "C:\\WINDOWS",
    "<type 'str'>"
  ],
  [
    "ALLUSERSPROFILE",
    "<type 'str'>",
    "C:\\ProgramData",
    "<type 'str'>"
  ],
  [
    "LOCALAPPDATA",
    "<type 'str'>",
    "C:\\Users\\slanxy\\AppData\\Local",
    "<type 'str'>"
  ],
  [
    "HOMEPATH",
    "<type 'str'>",
    "\\Users\\slanxy",
    "<type 'str'>"
  ],
  [
    "USERDOMAIN_ROAMINGPROFILE",
    "<type 'str'>",
    "DESKTOP-N1N1AOV",
    "<type 'str'>"
  ],
  [
    "CONEMUANSILOG",
    "<type 'str'>",
    "",
    "<type 'str'>"
  ],
  [
    "CONEMUSERVERPID",
    "<type 'str'>",
    "14812",
    "<type 'str'>"
  ],
  [
    "ALIASES",
    "<type 'str'>",
    "D:\\Program Files (x86)\\cmder\\config\\user-aliases.cmd",
    "<type 'str'>"
  ],
  [
    "USERNAME",
    "<type 'str'>",
    "slanxy",
    "<type 'str'>"
  ],
  [
    "CONEMUBUILD",
    "<type 'str'>",
    "161022",
    "<type 'str'>"
  ],
  [
    "LOGONSERVER",
    "<type 'str'>",
    "\\\\DESKTOP-N1N1AOV",
    "<type 'str'>"
  ],
  [
    "PROMPT",
    "<type 'str'>",
    "C\bL\bI\bN\bK\b \b$P$G",
    "<type 'str'>"
  ],
  [
    "COMSPEC",
    "<type 'str'>",
    "C:\\WINDOWS\\system32\\cmd.exe",
    "<type 'str'>"
  ],
  [
    "VERBOSE-OUTPUT",
    "<type 'str'>",
    "0 ",
    "<type 'str'>"
  ],
  [
    "PROGRAMDATA",
    "<type 'str'>",
    "C:\\ProgramData",
    "<type 'str'>"
  ],
  [
    "ONEDRIVE",
    "<type 'str'>",
    "C:\\Users\\slanxy\\OneDrive",
    "<type 'str'>"
  ],
  [
    "CONEMUBACKHWND",
    "<type 'str'>",
    "0x005D0A58",
    "<type 'str'>"
  ],
  [
    "CONEMUBASEDIR",
    "<type 'str'>",
    "D:\\Program Files (x86)\\cmder\\vendor\\conemu-maximus5\\ConEmu",
    "<type 'str'>"
  ],
  [
    "CONEMUWORKDIR",
    "<type 'str'>",
    "C:\\Users\\slanxy",
    "<type 'str'>"
  ],
  [
    "CONEMUPALETTE",
    "<type 'str'>",
    "Monokai",
    "<type 'str'>"
  ],
  [
    "USER-ALIASES",
    "<type 'str'>",
    "D:\\Program Files (x86)\\cmder\\config\\user-aliases.cmd",
    "<type 'str'>"
  ],
  [
    "GIT_INSTALL_ROOT",
    "<type 'str'>",
    "C:\\Program Files\\Git",
    "<type 'str'>"
  ],
  [
    "CMDER_ROOT",
    "<type 'str'>",
    "D:\\Program Files (x86)\\cmder",
    "<type 'str'>"
  ],
  [
    "PATHEXT",
    "<type 'str'>",
    ".COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC",
    "<type 'str'>"
  ],
  [
    "ARCHITECTURE",
    "<type 'str'>",
    "64",
    "<type 'str'>"
  ],
  [
    "WINDIR",
    "<type 'str'>",
    "C:\\WINDOWS",
    "<type 'str'>"
  ],
  [
    "CONEMUTASK",
    "<type 'str'>",
    "{cmd::Cmder}",
    "<type 'str'>"
  ],
  [
    "HOMEDRIVE",
    "<type 'str'>",
    "C:",
    "<type 'str'>"
  ],
  [
    "SYSTEMDRIVE",
    "<type 'str'>",
    "C:",
    "<type 'str'>"
  ],
  [
    "CONEMUARGS",
    "<type 'str'>",
    "/Icon \"D:\\Program Files (x86)\\cmder\\icons\\cmder.ico\" /Title Cmder",
    "<type 'str'>"
  ],
  [
    "CONEMUWORKDRIVE",
    "<type 'str'>",
    "C:",
    "<type 'str'>"
  ],
  [
    "APPDATA",
    "<type 'str'>",
    "C:\\Users\\slanxy\\AppData\\Roaming",
    "<type 'str'>"
  ],
  [
    "SVN_SSH",
    "<type 'str'>",
    "C:\\\\Program Files\\\\Git\\\\bin\\\\ssh.exe",
    "<type 'str'>"
  ],
  [
    "PROCESSOR_LEVEL",
    "<type 'str'>",
    "6",
    "<type 'str'>"
  ],
  [
    "PROGRAMW6432",
    "<type 'str'>",
    "C:\\Program Files",
    "<type 'str'>"
  ],
  [
    "CONEMUDRAWHWND",
    "<type 'str'>",
    "0x003402F4",
    "<type 'str'>"
  ],
  [
    "PROCESSOR_ARCHITEW6432",
    "<type 'str'>",
    "AMD64",
    "<type 'str'>"
  ],
  [
    "ANSICON",
    "<type 'str'>",
    "124x1000 (124x42)",
    "<type 'str'>"
  ],
  [
    "CONEMUCONFIG",
    "<type 'str'>",
    "",
    "<type 'str'>"
  ],
  [
    "COMMONPROGRAMW6432",
    "<type 'str'>",
    "C:\\Program Files\\Common Files",
    "<type 'str'>"
  ],
  [
    "OS",
    "<type 'str'>",
    "Windows_NT",
    "<type 'str'>"
  ],
  [
    "COMMONPROGRAMFILES(X86)",
    "<type 'str'>",
    "C:\\Program Files (x86)\\Common Files",
    "<type 'str'>"
  ],
  [
    "USERPROFILE",
    "<type 'str'>",
    "C:\\Users\\slanxy",
    "<type 'str'>"
  ]
]

rajanskumarsoni commented 7 years ago

My system is Ubuntu 16.04. A Python support error occurred while installing scrapyd, and I am not able to see any file related to scrapy in the /etc or /usr directories.

Digenis commented 7 years ago

... 
["SCRAPY_PROJECT",                         "<type 'str'>",
 {"name": "club-spiders", "spiders": []},  "<type 'dict'>"],
...

It looks like a dict type is assigned to the SCRAPY_PROJECT env variable. I can't imagine how it got there.
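
(For anyone debugging this, here is a minimal repro sketch, not code from scrapyd itself, showing that any non-string value in the env dict triggers the same TypeError on Python 2 under Windows:)

import os
import sys
from subprocess import Popen, PIPE

env = os.environ.copy()
# A dict value, like the SCRAPY_PROJECT entry in the dump above:
env['SCRAPY_PROJECT'] = {'name': 'club-spiders', 'spiders': []}

# On Python 2 under Windows, subprocess raises:
#   TypeError: environment can only contain strings
# (a unicode value instead of a native str trips the same check)
Popen([sys.executable, '-c', 'pass'], stdout=PIPE, stderr=PIPE, env=env)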

redbeardcr commented 7 years ago

Hi, I also have this problem, and I'm running Windows 10 too. Is there a solution to this?

Many thanks

andrew-werdna commented 7 years ago

Any solution in sight for this? I'm having this same problem as well. When trying to deploy my project to Scrapyd I get this response

$ scrapyd-deploy default -p delta
Packing version 1502997539
Deploying to project "delta" in http://localhost:6800/addversion.json
Server response (200):
{"status": "error", "message": "environment can only contain strings", "node_name": "web4519ip99"}

And in the terminal where Scrapyd is running I get this output

2017-08-17T14:19:01-0500 [_GenericHTTPChannelProtocol,15,127.0.0.1] Unhandled Error
        Traceback (most recent call last):
          File "c:\users\abrown\documents\anaconda3\envs\p2\lib\site-packages\twisted\web\http.py", line 2059, in allContentReceived
            req.requestReceived(command, path, version)
          File "c:\users\abrown\documents\anaconda3\envs\p2\lib\site-packages\twisted\web\http.py", line 869, in requestReceived
            self.process()
          File "c:\users\abrown\documents\anaconda3\envs\p2\lib\site-packages\twisted\web\server.py", line 184, in process
            self.render(resrc)
          File "c:\users\abrown\documents\anaconda3\envs\p2\lib\site-packages\twisted\web\server.py", line 235, in render
            body = resrc.render(self)
        --- <exception caught here> ---
          File "c:\users\abrown\documents\anaconda3\envs\p2\lib\site-packages\scrapyd\webservice.py", line 21, in render
            return JsonResource.render(self, txrequest).encode('utf-8')
          File "c:\users\abrown\documents\anaconda3\envs\p2\lib\site-packages\scrapyd\utils.py", line 20, in render
            r = resource.Resource.render(self, txrequest)
          File "c:\users\abrown\documents\anaconda3\envs\p2\lib\site-packages\twisted\web\resource.py", line 250, in render
            return m(request)
          File "c:\users\abrown\documents\anaconda3\envs\p2\lib\site-packages\scrapyd\webservice.py", line 86, in render_POST
            spiders = get_spider_list(project, version=version)
          File "c:\users\abrown\documents\anaconda3\envs\p2\lib\site-packages\scrapyd\utils.py", line 132, in get_spider_list
            proc = Popen(pargs, stdout=PIPE, stderr=PIPE, env=env)
          File "c:\users\abrown\documents\anaconda3\envs\p2\lib\subprocess.py", line 390, in __init__
            errread, errwrite)
          File "c:\users\abrown\documents\anaconda3\envs\p2\lib\subprocess.py", line 640, in _execute_child
            startupinfo)
        exceptions.TypeError: environment can only contain strings

2017-08-17T14:19:01-0500 [twisted.python.log#info] "127.0.0.1" - - [17/Aug/2017:19:19:01 +0000] "POST /addversion.json HTTP/1.1" 200 99 "-" "Python-urllib/3.6"

I'm on Windows 7 using conda 4.3.22 and Python 2.7.13, and here is the complete pip list of my environment if it helps...

asn1crypto (0.22.0)
attrs (17.2.0)
Automat (0.6.0)
awscli (1.11.84)
boto3 (1.4.4)
botocore (1.5.75)
certifi (2017.4.17)
cffi (1.10.0)
chardet (3.0.4)
constantly (15.1.0)
cryptography (1.9)
cssselect (1.0.1)
docutils (0.13.1)
enum34 (1.1.6)
futures (3.1.1)
hyperlink (17.2.1)
idna (2.5)
incremental (17.5.0)
ipaddress (1.0.18)
jmespath (0.9.3)
lxml (3.8.0)
parsel (1.2.0)
pip (9.0.1)
pyasn1 (0.2.3)
pyasn1-modules (0.0.9)
pycparser (2.17)
PyDispatcher (2.0.5)
pyOpenSSL (17.0.0)
pypiwin32 (219)
python-dateutil (2.6.0)
PyYAML (3.12)
queuelib (1.4.2)
requests (2.18.1)
rsa (3.4.2)
s3transfer (0.1.10)
scrapoxy (1.9)
Scrapy (1.4.0)
scrapyd (1.2.0)
scrapyd-client (1.1.0)
service-identity (17.0.0)
setuptools (27.2.0)
six (1.10.0)
Twisted (17.5.0)
urllib3 (1.21.1)
w3lib (1.17.0)
wheel (0.29.0)
zope.interface (4.4.2)

Also here is my scrapy.cfg

# Automatically created by: scrapy startproject
#
# For more information about the [deploy] section see:
# https://scrapyd.readthedocs.org/en/latest/deploy.html

[settings]
default = delta.settings

[deploy]
url = http://localhost:6800/
project = delta

evrn1874 commented 7 years ago

After some debugging, I found the cause: the get_spider_list method in scrapyd's utils.py must ensure that every value in the env dict passed to Popen is a str. In fact env['SCRAPY_PROJECT'] and env['SCRAPY_EGG_VERSION'], the two parameters passed in from outside, are of Unicode type. On Python 3 this is not a problem, but on Python 2 the Unicode must first be converted to str, along the lines of env['SCRAPY_PROJECT'] = str(project) and env['SCRAPY_EGG_VERSION'] = str(version). Then the error no longer occurs.

redapple commented 7 years ago

@evrn1874 , is this related to https://github.com/scrapy/scrapyd-client/issues/49 ? Please edit your comment into English. Thanks.

evrn1874 commented 7 years ago

@redapple, yes, it is related to https://github.com/scrapy/scrapyd-client/issues/49. I found that the problem lies in the get_spider_list() method in scrapyd's utils.py: we must make sure that every value in the env passed into Popen is a str. In fact env['SCRAPY_PROJECT'] and env['SCRAPY_EGG_VERSION'], the two parameters passed in from outside, are of Unicode type. In Python 3 this is fine, but in Python 2 we must first convert the Unicode to str, along the lines of env['SCRAPY_PROJECT'] = str(project) and env['SCRAPY_EGG_VERSION'] = str(version); then it works.
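
(A sketch of the change described above, assuming the surrounding code of get_spider_list() in scrapyd/utils.py; the actual patch is in PR #242, linked below:)

# scrapyd/utils.py, inside get_spider_list() -- sketch only.
# On Python 2, project and version arrive as unicode; coerce them to
# native str so the Windows env check in Popen does not reject them.
env = os.environ.copy()
env['PYTHONIOENCODING'] = 'UTF-8'
env['SCRAPY_PROJECT'] = str(project)
if version:
    env['SCRAPY_EGG_VERSION'] = str(version)
proc = Popen(pargs, stdout=PIPE, stderr=PIPE, env=env)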


redapple commented 7 years ago

@evrn1874 , this does not explain why env['SCRAPY_PROJECT'] is a dict instead of a single character string for the users reporting this issue on Windows. If you have a way to reproduce this, it would be great if you could track down the root cause.

redapple commented 7 years ago

@evrn1874 , thanks for the hint. I was able to reproduce the issue in Python 2.7 and Windows 7. My proposed PR: https://github.com/scrapy/scrapyd/pull/242

my8100 commented 5 years ago

@Digenis What about this bug? (in Python 2.7 on Windows)

Digenis commented 5 years ago

@my8100, we still need to figure out how the dict got there.

azmirfakkri commented 4 years ago

Hi guys, I have a very similar issue when I tried to deploy scrapyd in a docker container.

I can get the active projects from http://localhost:6800/listprojects.json, but it seems that http://localhost:6800/schedule.json returns this message:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/scrapyd/webservice.py", line 21, in render
    return JsonResource.render(self, txrequest).encode('utf-8')
  File "/usr/local/lib/python2.7/dist-packages/scrapyd/utils.py", line 20, in render
    r = resource.Resource.render(self, txrequest)
  File "/usr/local/lib/python2.7/dist-packages/twisted/web/resource.py", line 249, in render
    raise UnsupportedMethod(allowedMethods)
UnsupportedMethod: Expected one of ['HEAD', 'object', 'POST']

In the terminal where I ran my script, this is the traceback I received:

scrapyd_api.exceptions.ScrapydResponseError: Scrapyd returned an invalid JSON response: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/scrapyd/webservice.py", line 21, in render
    return JsonResource.render(self, txrequest).encode('utf-8')
  File "/usr/local/lib/python2.7/dist-packages/scrapyd/utils.py", line 20, in render
    r = resource.Resource.render(self, txrequest)
  File "/usr/local/lib/python2.7/dist-packages/twisted/web/resource.py", line 250, in render
    return m(request)
  File "/usr/local/lib/python2.7/dist-packages/scrapyd/webservice.py", line 49, in render_POST
    spiders = get_spider_list(project, version=version)
  File "/usr/local/lib/python2.7/dist-packages/scrapyd/utils.py", line 137, in get_spider_list
    raise RuntimeError(msg.encode('unicode_escape') if six.PY2 else msg)
RuntimeError: Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/dist-packages/scrapyd/runner.py", line 40, in <module>
    main()
  File "/usr/local/lib/python2.7/dist-packages/scrapyd/runner.py", line 37, in main
    execute()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 109, in execute
    settings = get_project_settings()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/utils/project.py", line 68, in get_project_settings
    settings.setmodule(settings_module_path, priority='project')
  File "/usr/local/lib/python2.7/dist-packages/scrapy/settings/__init__.py", line 292, in setmodule
    module = import_module(module)
  File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
ImportError: No module named settings

I updated scrapyd from 1.2.0 to 1.2.1 and still get the same error message as described in @evrn1874's comment.

This is my requirements.txt file:

asn1crypto==0.23.0
attrs==17.3.0
Automat==0.6.0
backports.functools-lru-cache==1.5
beautifulsoup4==4.6.0
bs4==0.0.1
certifi==2017.11.5
cffi==1.11.2
chardet==3.0.4
click==6.7
constantly==15.1.0
cryptography==2.1.4
cssselect==1.0.1
enum34==1.1.6
funcsigs==1.0.2
functools32==3.2.3.post2
hyperlink==17.3.1
idna==2.6
incremental==17.5.0
ipaddress==1.0.18
lxml==4.1.1
numpy==1.16.4
pandas==0.11.0
parsel==1.2.0
Pillow==5.1.0
pluggy==0.6.0
prometheus-client==0.7.1
psycopg2-binary==2.8.3
py==1.5.2
pyasn1==0.4.2
pyasn1-modules==0.2.1
pycparser==2.18
PyDispatcher==2.0.5
PyHamcrest==1.9.0
pyOpenSSL==17.5.0
pytest==3.3.1
python-dateutil==2.7.3
python-dotenv==0.7.1
python-scrapyd-api==2.1.2
pytz==2019.1
queuelib==1.4.2
redis==2.10.6
requests==2.18.4
Scrapy==1.4.0
scrapy-crawlera==1.2.4
scrapy-prometheus==0.4.4
scrapy-splash==0.7.2
scrapyd==1.2.0
scrapyd-client==1.1.0
service-identity==17.0.0
six==1.11.0
slackclient==1.3.0
soupsieve==1.9.1
SQLAlchemy==1.3.8
Twisted==17.9.0
Unidecode==0.4.21
urllib3==1.22
w3lib==1.18.0
websocket-client==0.56.0
zope.interface==4.4.3

My scrapy.cfg:

[settings]
default = scrapingweb.settings

[deploy]
url = http://localhost:6800/
project = scrapingweb

[scrapyd]
max_proc = 1

Would appreciate any comment/pointer! Thank you.

my8100 commented 4 years ago

@azmirfakkri schedule.json only supports POST, see https://scrapyd.readthedocs.io/en/stable/api.html#schedule-json
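
(For example, a minimal POST with curl, following the API docs; the spider name here is a placeholder:)

curl http://localhost:6800/schedule.json -d project=scrapingweb -d spider=somespider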

azmirfakkri commented 4 years ago

@my8100 Thank you. I am using python-scrapyd-api, and I checked that the request is indeed a POST request, but I still can't figure out why I'm getting this error.

my8100 commented 4 years ago

Use Python 3 or install the latest Scrapyd from git instead.
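
(For example, one common way to install the latest Scrapyd from git with pip:)

pip install --upgrade git+https://github.com/scrapy/scrapyd.git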

azmirfakkri commented 4 years ago

@my8100 I have updated my project to Python 3, but it seems the problem persists. The first request was a valid POST, but the subsequent requests are GETs, which are invalid.

This is my scrapyd log:

reclusa_1       | 2019-10-17T13:19:41+0000 [-] Loading /usr/local/lib/python3.6/site-packages/scrapyd/txapp.py...
reclusa_1       | 2019-10-17T13:19:41+0000 [-] Scrapyd web console available at http://0.0.0.0:6800/
reclusa_1       | 2019-10-17T13:19:41+0000 [-] Loaded.
reclusa_1       | 2019-10-17T13:19:41+0000 [twisted.scripts._twistd_unix.UnixAppLogger#info] twistd 19.7.0 (/usr/local/bin/python 3.6.9) starting up.
reclusa_1       | 2019-10-17T13:19:41+0000 [twisted.scripts._twistd_unix.UnixAppLogger#info] reactor class: twisted.internet.epollreactor.EPollReactor.
reclusa_1       | 2019-10-17T13:19:41+0000 [-] Site starting on 6800
reclusa_1       | 2019-10-17T13:19:41+0000 [twisted.web.server.Site#info] Starting factory <twisted.web.server.Site object at 0x7f2d71557048>
reclusa_1       | 2019-10-17T13:19:41+0000 [Launcher] Scrapyd 1.2.1 started: max_proc=16, runner='scrapyd.runner'
reclusa_1       | 2019-10-17T13:20:53+0000 [twisted.python.log#info] "192.168.48.1" - - [17/Oct/2019:13:20:52 +0000] "POST /schedule.json HTTP/1.1" 200 2049 "-" "python-requests/2.22.0"
reclusa_1       | 2019-10-17T13:30:29+0000 [twisted.python.log#info] "192.168.48.1" - - [17/Oct/2019:13:30:29 +0000] "GET / HTTP/1.1" 200 743 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36"
reclusa_1       | 2019-10-17T13:30:30+0000 [twisted.python.log#info] "192.168.48.1" - - [17/Oct/2019:13:30:30 +0000] "GET /favicon.ico HTTP/1.1" 404 153 "http://localhost:6800/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36"
reclusa_1       | 2019-10-17T13:30:33+0000 [twisted.python.log#info] "192.168.48.1" - - [17/Oct/2019:13:30:33 +0000] "GET /jobs HTTP/1.1" 200 471 "http://localhost:6800/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36"
reclusa_1       | 2019-10-17T13:30:38+0000 [twisted.python.log#info] "192.168.48.1" - - [17/Oct/2019:13:30:38 +0000] "GET /schedule.json HTTP/1.1" 200 544 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36"

my8100 commented 4 years ago

But the log shows that "GET /schedule.json" came from the Chrome browser.

"GET /schedule.json HTTP/1.1" 200 544 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36"

Try to schedule your spiders with curl directly instead of scrapyd_api.

scrapyd_api.exceptions.ScrapydResponseError: Scrapyd returned an invalid JSON response: Traceback (most recent call last):

azmirfakkri commented 4 years ago

Hi @my8100 thank you!

Sorry, my mistake: the GET requests were actually from me trying to access /schedule.json in the Chrome browser.

I tried to schedule it using curl and this is the error I got:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/scrapyd/webservice.py", line 21, in render
    return JsonResource.render(self, txrequest).encode('utf-8')
  File "/usr/local/lib/python3.6/site-packages/scrapyd/utils.py", line 20, in render
    r = resource.Resource.render(self, txrequest)
  File "/usr/local/lib/python3.6/site-packages/twisted/web/resource.py", line 265, in render
    return m(request)
  File "/usr/local/lib/python3.6/site-packages/scrapyd/webservice.py", line 49, in render_POST
    spiders = get_spider_list(project, version=version)
  File "/usr/local/lib/python3.6/site-packages/scrapyd/utils.py", line 137, in get_spider_list
    raise RuntimeError(msg.encode('unicode_escape') if six.PY2 else msg)
RuntimeError: Traceback (most recent call last):
  File "/usr/local/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/site-packages/scrapyd/runner.py", line 40, in <module>
    main()
  File "/usr/local/lib/python3.6/site-packages/scrapyd/runner.py", line 37, in main
    execute()
  File "/usr/local/lib/python3.6/site-packages/scrapy/cmdline.py", line 114, in execute
    settings = get_project_settings()
  File "/usr/local/lib/python3.6/site-packages/scrapy/utils/project.py", line 68, in get_project_settings
    settings.setmodule(settings_module_path, priority='project')
  File "/usr/local/lib/python3.6/site-packages/scrapy/settings/__init__.py", line 294, in setmodule
    module = import_module(module)
  File "/usr/local/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'facebook.settings'

So now I think the problem is caused by my own project setup :D Thank you!

jpmckinney commented 2 years ago

I'm reviewing issues about scrapyd-client in this issue tracker.

Use Python 3 or install the latest Scrapyd from git instead.

From https://github.com/scrapy/scrapyd-client/issues/49 I believe this is a Python 2 issue, which is end-of-life and is no longer supported by scrapyd-client.