scrapinghub / portia

Visual scraping for Scrapy
BSD 3-Clause "New" or "Revised" License
9.28k stars 1.41k forks source link

HTTPConnectionPool(host='localhost', port=6800): Max retries exceeded with url: /schedule.json #830

Open majidshokrolahi opened 6 years ago

majidshokrolahi commented 6 years ago

I did "PULL" of the docker image and I have deployed in on the Kubernetes Engine (Container engine) of the Google Cloud Platform. I could create a Spider but when i run it i receive the following error. Do you have any idea why it gives me connection error?

` at /api/projects/toscrape/spiders/books.toscrape.com/schedule HTTPConnectionPool(host='localhost', port=6800): Max retries exceeded with url: /schedule.json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb406f17ad0>: Failed to establish a new connection: [Errno 111] Connection refused',))

Request Method: POST Request URL: http://35.205.230.83/api/projects/toscrape/spiders/books.toscrape.com/schedule Django Version: 1.10.5 Python Executable: /usr/bin/python Python Version: 2.7.6 Python Path: ['/app/portia_server', '/app/portia_server', '/app/slyd', '/app/slybot', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages'] Server time: Mon, 20 Nov 2017 14:36:51 +0000 Installed Applications: ['db_repo.apps.DbRepoConfig', 'storage.apps.StorageConfig', 'portia_orm.apps.PortiaOrmConfig', 'portia_api.apps.PortiaApiConfig', 'django.contrib.auth', 'django.contrib.contenttypes', 'django.contrib.sessions', 'whitenoise.runserver_nostatic', 'django.contrib.staticfiles'] Installed Middleware: ['django.middleware.security.SecurityMiddleware', 'whitenoise.middleware.WhiteNoiseMiddleware', 'django.contrib.sessions.middleware.SessionMiddleware', 'django.contrib.auth.middleware.AuthenticationMiddleware', 'django.middleware.clickjacking.XFrameOptionsMiddleware', 'portia_orm.middleware.ORMDataStoreMiddleware']

Traceback:

File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/exception.py" in inner

  1. response = get_response(request)

File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py" in _get_response

  1. response = self.process_exception_by_middleware(e, request)

File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py" in _get_response

  1. response = wrapped_callback(request, *callback_args, **callback_kwargs)

File "/usr/local/lib/python2.7/dist-packages/django/views/decorators/csrf.py" in wrapped_view

  1. return view_func(*args, **kwargs)

File "/usr/local/lib/python2.7/dist-packages/rest_framework/viewsets.py" in view

  1. return self.dispatch(request, *args, **kwargs)

File "/usr/local/lib/python2.7/dist-packages/django/utils/decorators.py" in inner

  1. return func(*args, **kwargs)

File "/app/portia_server/portia_api/resources/route.py" in dispatch

  1. return super(JsonApiRoute, self).dispatch(request, *args, **kwargs)

File "/usr/local/lib/python2.7/dist-packages/rest_framework/views.py" in dispatch

  1. response = self.handle_exception(exc)

File "/app/portia_server/portia_api/resources/route.py" in handle_exception

  1. response = super(JsonApiRoute, self).handle_exception(exc)

File "/usr/local/lib/python2.7/dist-packages/rest_framework/views.py" in handle_exception

  1. self.raise_uncaught_exception(exc)

File "/usr/local/lib/python2.7/dist-packages/rest_framework/views.py" in dispatch

  1. response = handler(request, *args, **kwargs)

File "/app/portia_server/portia_api/resources/spiders.py" in schedule

  1. request = requests.post(settings.SCHEDULE_URL, data=schedule_data)

File "/usr/local/lib/python2.7/dist-packages/requests/api.py" in post

  1. return request('post', url, data=data, json=json, **kwargs)

File "/usr/local/lib/python2.7/dist-packages/requests/api.py" in request

  1. return session.request(method=method, url=url, **kwargs)

File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py" in request

  1. resp = self.send(prep, **send_kwargs)

File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py" in send

  1. r = adapter.send(request, **kwargs)

File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py" in send

  1. raise ConnectionError(e, request=request)

Exception Type: ConnectionError at /api/projects/toscrape/spiders/books.toscrape.com/schedule Exception Value: HTTPConnectionPool(host='localhost', port=6800): Max retries exceeded with url: /schedule.json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb406f17ad0>: Failed to establish a new connection: [Errno 111] Connection refused',)) Request information: USER: LocalUser(root)

GET: No GET data

POST: No POST data

FILES: No FILES data

COOKIES: No cookie data

META: CONTENT_LENGTH = '53' CONTENT_TYPE = 'application/json; charset=UTF-8' DJANGO_SETTINGS_MODULE = 'portia_server.settings' GATEWAY_INTERFACE = 'CGI/1.1' HOME = '/root' HOSTNAME = 'portia-web-2246925667-7rj4c' HTTP_ACCEPT = '/' HTTP_ACCEPT_ENCODING = 'gzip, deflate' HTTP_ACCEPT_LANGUAGE = 'en-US,en;q=0.9,fa;q=0.8,it;q=0.7' HTTP_CONNECTION = 'close' HTTP_HOST = '35.205.230.83' HTTP_ORIGIN = 'http://35.205.230.83' HTTP_REFERER = 'http://35.205.230.83/' HTTP_USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36' HTTP_X_FORWARDED_FOR = '10.132.0.8' HTTP_X_REAL_IP = '10.132.0.8' HTTP_X_REQUESTED_WITH = 'XMLHttpRequest' KUBERNETES_PORT = 'tcp://10.11.240.1:443' KUBERNETES_PORT_443_TCP = 'tcp://10.11.240.1:443' KUBERNETES_PORT_443_TCP_ADDR = '10.11.240.1' KUBERNETES_PORT_443_TCP_PORT = '443' KUBERNETES_PORT_443_TCP_PROTO = 'tcp' KUBERNETES_SERVICE_HOST = '10.11.240.1' KUBERNETES_SERVICE_PORT = '443' KUBERNETES_SERVICE_PORT_HTTPS = '443' PATH = '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin' PATH_INFO = u'/api/projects/toscrape/spiders/books.toscrape.com/schedule' PWD = '/app/slyd' PYTHONPATH = '/app/portia_server:/app/slyd:/app/slybot' QUERY_STRING = '' REMOTE_ADDR = '127.0.0.1' REMOTE_HOST = '' REQUEST_METHOD = 'POST' RUN_MAIN = 'true' SCRIPT_NAME = u'' SERVER_NAME = 'localhost' SERVER_PORT = '8000' SERVER_PROTOCOL = 'HTTP/1.0' SERVERSOFTWARE = 'WSGIServer/0.1 Python/2.7.6' SHLVL = '1' TZ = 'UTC' = '/app/portia_server/manage.py' wsgi.errors = <open file '', mode 'w' at 0x7fb4227021e0> wsgi.file_wrapper = '' wsgi.input = <socket._fileobject object at 0x7fb415438b50> wsgi.multiprocess = False wsgi.multithread = True wsgi.run_once = False wsgi.url_scheme = 'http' wsgi.version =

Settings: Using settings module portia_server.settings ABSOLUTE_URL_OVERRIDES = {} ADMINS = [] ALLOWED_HOSTS = ['*'] APPEND_SLASH = True AUTHENTICATION_BACKENDS = [u'django.contrib.auth.backends.ModelBackend'] AUTH_PASSWORD_VALIDATORS = u'****' AUTH_USER_MODEL = u'auth.User' BASE_DIR = '/app/portia_server' BLACKLIST_URLS = set([]) CACHES = {u'default': {u'BACKEND': u'django.core.cache.backends.locmem.LocMemCache'}} CACHE_MIDDLEWARE_ALIAS = u'default' CACHE_MIDDLEWARE_KEY_PREFIX = u'****' CACHE_MIDDLEWARE_SECONDS = 600 CAPABILITIES = {'deploy_projects': False, 'delete_projects': True, 'create_projects': True, 'rename_templates': True, 'rename_projects': True, 'version_control': False, 'rename_spiders': True} CSRF_COOKIE_AGE = 31449600 CSRF_COOKIE_DOMAIN = None CSRF_COOKIE_HTTPONLY = False CSRF_COOKIE_NAME = u'csrftoken' CSRF_COOKIE_PATH = u'/' CSRF_COOKIE_SECURE = False CSRF_FAILURE_VIEW = u'django.views.csrf.csrf_failure' CSRF_HEADER_NAME = u'HTTP_X_CSRFTOKEN' CSRF_TRUSTED_ORIGINS = [] CUSTOM = {} DATABASES = {'default': {'ENGINE': 'django.db.backends.sqlite3', 'AUTOCOMMIT': True, 'ATOMIC_REQUESTS': False, 'NAME': '/app/portia_server/db.sqlite3', 'CONN_MAX_AGE': 0, 'TIME_ZONE': None, 'OPTIONS': {}, 'HOST': '', 'USER': '', 'TEST': {'COLLATION': None, 'CHARSET': None, 'NAME': None, 'MIRROR': None}, 'PASSWORD': u'****', 'PORT': ''}} DATABASE_ROUTERS = [] DATA_UPLOAD_MAX_MEMORY_SIZE = 2621440 DATA_UPLOAD_MAX_NUMBER_FIELDS = 1000 DATETIME_FORMAT = u'N j, Y, P' DATETIME_INPUT_FORMATS = [u'%Y-%m-%d %H:%M:%S', u'%Y-%m-%d %H:%M:%S.%f', u'%Y-%m-%d %H:%M', u'%Y-%m-%d', u'%m/%d/%Y %H:%M:%S', u'%m/%d/%Y %H:%M:%S.%f', u'%m/%d/%Y %H:%M', u'%m/%d/%Y', u'%m/%d/%y %H:%M:%S', u'%m/%d/%y %H:%M:%S.%f', u'%m/%d/%y %H:%M', u'%m/%d/%y'] DATE_FORMAT = u'N j, Y' DATE_INPUT_FORMATS = [u'%Y-%m-%d', u'%m/%d/%Y', u'%m/%d/%y', u'%b %d %Y', u'%b %d, %Y', u'%d %b %Y', u'%d %b, %Y', u'%B %d %Y', u'%B %d, %Y', u'%d %B %Y', u'%d %B, %Y'] DEBUG = True DEBUG_PROPAGATE_EXCEPTIONS = False DECIMAL_SEPARATOR = u'.' DEFAULT_CHARSET = u'utf-8' DEFAULT_CONTENT_TYPE = u'text/html' DEFAULT_EXCEPTION_REPORTER_FILTER = u'django.views.debug.SafeExceptionReporterFilter' DEFAULT_FILE_STORAGE = u'django.core.files.storage.FileSystemStorage' DEFAULT_FROM_EMAIL = u'webmaster@localhost' DEFAULT_INDEX_TABLESPACE = u'' DEFAULT_TABLESPACE = u'' DISALLOWED_USER_AGENTS = [] EMAIL_BACKEND = u'django.core.mail.backends.smtp.EmailBackend' EMAIL_HOST = u'localhost' EMAIL_HOST_PASSWORD = u'****' EMAIL_HOST_USER = u'' EMAIL_PORT = 25 EMAIL_SSL_CERTFILE = None EMAIL_SSL_KEYFILE = u'****' EMAIL_SUBJECT_PREFIX = u'[Django] ' EMAIL_TIMEOUT = None EMAIL_USE_SSL = False EMAIL_USE_TLS = False FILE_CHARSET = u'utf-8' FILE_UPLOAD_DIRECTORY_PERMISSIONS = None FILE_UPLOAD_HANDLERS = [u'django.core.files.uploadhandler.MemoryFileUploadHandler', u'django.core.files.uploadhandler.TemporaryFileUploadHandler'] FILE_UPLOAD_MAX_MEMORY_SIZE = 2621440 FILE_UPLOAD_PERMISSIONS = None FILE_UPLOAD_TEMP_DIR = None FIRST_DAY_OF_WEEK = 0 FIXTURE_DIRS = [] FORCE_SCRIPT_NAME = None FORMAT_MODULE_PATH = None IGNORABLE_404_URLS = [] INSTALLED_APPS = ['db_repo.apps.DbRepoConfig', 'storage.apps.StorageConfig', 'portia_orm.apps.PortiaOrmConfig', 'portia_api.apps.PortiaApiConfig', 'django.contrib.auth', 'django.contrib.contenttypes', 'django.contrib.sessions', 'whitenoise.runserver_nostatic', 'django.contrib.staticfiles'] INTERNAL_IPS = [] LANGUAGES = [(u'af', u'Afrikaans'), (u'ar', u'Arabic'), (u'ast', u'Asturian'), (u'az', u'Azerbaijani'), (u'bg', u'Bulgarian'), (u'be', u'Belarusian'), (u'bn', u'Bengali'), (u'br', u'Breton'), (u'bs', u'Bosnian'), (u'ca', u'Catalan'), (u'cs', u'Czech'), (u'cy', u'Welsh'), (u'da', u'Danish'), (u'de', u'German'), (u'dsb', u'Lower Sorbian'), (u'el', u'Greek'), (u'en', u'English'), (u'en-au', u'Australian English'), (u'en-gb', u'British English'), (u'eo', u'Esperanto'), (u'es', u'Spanish'), (u'es-ar', u'Argentinian Spanish'), (u'es-co', u'Colombian Spanish'), (u'es-mx', u'Mexican Spanish'), (u'es-ni', u'Nicaraguan Spanish'), (u'es-ve', u'Venezuelan Spanish'), (u'et', u'Estonian'), (u'eu', u'Basque'), (u'fa', u'Persian'), (u'fi', u'Finnish'), (u'fr', u'French'), (u'fy', u'Frisian'), (u'ga', u'Irish'), (u'gd', u'Scottish Gaelic'), (u'gl', u'Galician'), (u'he', u'Hebrew'), (u'hi', u'Hindi'), (u'hr', u'Croatian'), (u'hsb', u'Upper Sorbian'), (u'hu', u'Hungarian'), (u'ia', u'Interlingua'), (u'id', u'Indonesian'), (u'io', u'Ido'), (u'is', u'Icelandic'), (u'it', u'Italian'), (u'ja', u'Japanese'), (u'ka', u'Georgian'), (u'kk', u'Kazakh'), (u'km', u'Khmer'), (u'kn', u'Kannada'), (u'ko', u'Korean'), (u'lb', u'Luxembourgish'), (u'lt', u'Lithuanian'), (u'lv', u'Latvian'), (u'mk', u'Macedonian'), (u'ml', u'Malayalam'), (u'mn', u'Mongolian'), (u'mr', u'Marathi'), (u'my', u'Burmese'), (u'nb', u'Norwegian Bokm\xe5l'), (u'ne', u'Nepali'), (u'nl', u'Dutch'), (u'nn', u'Norwegian Nynorsk'), (u'os', u'Ossetic'), (u'pa', u'Punjabi'), (u'pl', u'Polish'), (u'pt', u'Portuguese'), (u'pt-br', u'Brazilian Portuguese'), (u'ro', u'Romanian'), (u'ru', u'Russian'), (u'sk', u'Slovak'), (u'sl', u'Slovenian'), (u'sq', u'Albanian'), (u'sr', u'Serbian'), (u'sr-latn', u'Serbian Latin'), (u'sv', u'Swedish'), (u'sw', u'Swahili'), (u'ta', u'Tamil'), (u'te', u'Telugu'), (u'th', u'Thai'), (u'tr', u'Turkish'), (u'tt', u'Tatar'), (u'udm', u'Udmurt'), (u'uk', u'Ukrainian'), (u'ur', u'Urdu'), (u'vi', u'Vietnamese'), (u'zh-hans', u'Simplified Chinese'), (u'zh-hant', u'Traditional Chinese')] LANGUAGES_BIDI = [u'he', u'ar', u'fa', u'ur'] LANGUAGE_CODE = 'en-us' LANGUAGE_COOKIE_AGE = None LANGUAGE_COOKIE_DOMAIN = None LANGUAGE_COOKIE_NAME = u'django_language' LANGUAGE_COOKIE_PATH = u'/' LOCALE_PATHS = [] LOGGING = {} LOGGING_CONFIG = u'logging.config.dictConfig' LOGIN_REDIRECT_URL = u'/accounts/profile/' LOGIN_URL = u'/accounts/login/' LOGOUT_REDIRECT_URL = None MANAGERS = [] MEDIA_ROOT = '/app/data/projects' MEDIA_URL = u'' MESSAGE_STORAGE = u'django.contrib.messages.storage.fallback.FallbackStorage' MIDDLEWARE = ['django.middleware.security.SecurityMiddleware', 'whitenoise.middleware.WhiteNoiseMiddleware', 'django.contrib.sessions.middleware.SessionMiddleware', 'django.contrib.auth.middleware.AuthenticationMiddleware', 'django.middleware.clickjacking.XFrameOptionsMiddleware', 'portia_orm.middleware.ORMDataStoreMiddleware'] MIDDLEWARE_CLASSES = [u'django.middleware.common.CommonMiddleware', u'django.middleware.csrf.CsrfViewMiddleware'] MIGRATION_MODULES = {} MONTH_DAY_FORMAT = u'F j' NUMBER_GROUPING = 0 PASSWORD_HASHERS = u'****' PASSWORD_RESET_TIMEOUT_DAYS = u'****' PORTIA_STORAGE_BACKEND = 'storage.backends.FsStorage' PREPEND_WWW = False REST_FRAMEWORK = {'DEFAULT_AUTHENTICATION_CLASSES': ('portia_server.backends.LocalAuthentication',), 'URL_FORMAT_OVERRIDE': None, 'EXCEPTION_HANDLER': 'portia_api.jsonapi.exceptions.jsonapi_exception_handler'} ROOT_URLCONF = 'portia_server.urls' SCHEDULE_URL = 'http://localhost:6800/schedule.json' SECRET_KEY = u'****' SECURE_BROWSER_XSS_FILTER = False SECURE_CONTENT_TYPE_NOSNIFF = False SECURE_HSTS_INCLUDE_SUBDOMAINS = False SECURE_HSTS_SECONDS = 0 SECURE_PROXY_SSL_HEADER = None SECURE_REDIRECT_EXEMPT = [] SECURE_SSL_HOST = None SECURE_SSL_REDIRECT = False SERVER_EMAIL = u'root@localhost' SESSION_CACHE_ALIAS = u'default' SESSION_COOKIE_AGE = 1209600 SESSION_COOKIE_DOMAIN = None SESSION_COOKIE_HTTPONLY = True SESSION_COOKIE_NAME = u'sessionid' SESSION_COOKIE_PATH = u'/' SESSION_COOKIE_SECURE = False SESSION_ENGINE = u'django.contrib.sessions.backends.db' SESSION_EXPIRE_AT_BROWSER_CLOSE = False SESSION_FILE_PATH = None SESSION_SAVE_EVERY_REQUEST = False SESSION_SERIALIZER = u'django.contrib.sessions.serializers.JSONSerializer' SETTINGS_MODULE = 'portia_server.settings' SHORT_DATETIME_FORMAT = u'm/d/Y P' SHORT_DATE_FORMAT = u'm/d/Y' SIGNING_BACKEND = u'django.core.signing.TimestampSigner' SILENCED_SYSTEM_CHECKS = [] STATICFILES_DIRS = [] STATICFILES_FINDERS = [u'django.contrib.staticfiles.finders.FileSystemFinder', u'django.contrib.staticfiles.finders.AppDirectoriesFinder'] STATICFILES_STORAGE = 'whitenoise.storage.CompressedManifestStaticFilesStorage' STATIC_ROOT = '/app/portiaui/dist' STATIC_URL = '/' TEMPLATES = [] TEST_NON_SERIALIZED_APPS = [] TEST_RUNNER = u'django.test.runner.DiscoverRunner' THOUSAND_SEPARATOR = u',' TIME_FORMAT = u'P' TIME_INPUT_FORMATS = [u'%H:%M:%S', u'%H:%M:%S.%f', u'%H:%M'] TIME_ZONE = 'UTC' USE_ETAGS = False USE_I18N = True USE_L10N = True USE_THOUSAND_SEPARATOR = False USE_TZ = True USE_X_FORWARDED_HOST = False USE_X_FORWARDED_PORT = False WSGI_APPLICATION = 'portia_server.wsgi.application' X_FRAME_OPTIONS = u'SAMEORIGIN' YEAR_MONTH_FORMAT = u'F Y'

You're seeing this error because you have DEBUG = True in your Django settings file. Change that to False, and Django will display a standard page generated by the handler for this status code. `

zhengrp commented 3 years ago

docker 环境部署,运行爬虫会有这个问题