Open TheDromundKaas opened 9 years ago
This is fixed in #656.
Thank you for the issue report. Sadly, this project is dead, and a new team has taken over at https://github.com/linkcheck/linkchecker; for more details, please see #708. Please also close this issue and, if the problem still persists, report it afresh on the new repo: https://github.com/linkcheck/linkchecker/issues
root@j185599:~# linkchecker https://www.haneu.de -Dall
DEBUG 2015-10-22 12:19:46,672 MainThread Python 2.7.9 (default, Mar 1 2015, 12:57:24) [GCC 4.9.2] on linux2
DEBUG 2015-10-22 12:19:46,672 MainThread reading configuration from ['/root/.linkchecker/linkcheckerrc']
WARNING 2015-10-22 12:19:46,684 MainThread Running as root user; dropping privileges by changing user to nobody.
INFO 2015-10-22 12:19:46,685 MainThread Checking intern URLs only; use --check-extern to check extern URLs.
DEBUG 2015-10-22 12:19:46,691 MainThread configuration: [('aborttimeout', 300), ('allowedschemes', []), ('authentication', []), ('blacklist', {}), ('checkextern', False), ('cookiefile', None), ('csv', {}), ('debugmemory', False), ('dot', {}), ('enabledplugins', []), ('externlinks', []), ('fileoutput', []), ('gml', {}), ('gxml', {}), ('html', {}), ('ignorewarnings', []), ('internlinks', []), ('localwebroot', None), ('logger', 'TextLogger'), ('loginextrafields', {}), ('loginpasswordfield', 'password'), ('loginurl', None), ('loginuserfield', 'login'), ('maxfilesizedownload', 5242880), ('maxfilesizeparse', 1048576), ('maxhttpredirects', 10), ('maxnumurls', None), ('maxrequestspersecond', 10), ('maxrunseconds', None), ('nntpserver', None), ('none', {}), ('output', 'text'), ('pluginfolders', []), ('proxy', {}), ('quiet', False), ('recursionlevel', -1), ('sitemap', {}), ('sql', {}), ('sslverify', True), ('status', True), ('status_wait_seconds', 5), ('text', {}), ('threads', 10), ('timeout', 60), ('trace', False), ('useragent', u'Mozilla/5.0 (compatible; LinkChecker/9.3; +http://wummel.github.io/linkchecker/)'), ('verbose', False), ('warnings', True), ('xml', {})]
DEBUG 2015-10-22 12:19:46,693 MainThread HttpUrl handles url https://www.haneu.de
DEBUG 2015-10-22 12:19:46,693 MainThread checking syntax
DEBUG 2015-10-22 12:19:46,694 MainThread Add intern pattern u'^https?://(www.|)haneu.de'
DEBUG 2015-10-22 12:19:46,695 MainThread Link pattern u'^https?://(www.|)haneu.de' strict=False
DEBUG 2015-10-22 12:19:46,695 MainThread queueing https://www.haneu.de
LinkChecker 9.3 Copyright (C) 2000-2014 Bastian Kleineidam
LinkChecker comes with ABSOLUTELY NO WARRANTY!
This is free software, and you are welcome to redistribute it under certain conditions. Look at the file `LICENSE' within this distribution.
Get the newest version at http://wummel.github.io/linkchecker/
Write comments and bugs to https://github.com/wummel/linkchecker/issues
Support this project at http://wummel.github.io/linkchecker/donations.html
Start checking at 2015-10-22 12:19:46+002
DEBUG 2015-10-22 12:19:46,724 CheckThread-https://www.haneu.de Checking https link base_url=u'https://www.haneu.de' parent_url=None base_ref=None recursion_level=0 url_connection=None line=0 column=0 page=0 name=u'' anchor=u'' cache_url=https://www.haneu.de
DEBUG 2015-10-22 12:19:46,728 CheckThread-https://www.haneu.de checking connection
DEBUG 2015-10-22 12:19:46,952 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' parse lines
DEBUG 2015-10-22 12:19:46,953 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 7: allow or disallow directives without any user-agent line
DEBUG 2015-10-22 12:19:46,953 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 9: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,953 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 10: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,953 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 11: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,954 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 12: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,954 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 13: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,954 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 14: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,954 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 15: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,954 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 18: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,955 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 19: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,955 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 20: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,955 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 21: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,955 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 22: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,955 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 24: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,956 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 25: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,956 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 26: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,956 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 27: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,956 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 28: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,956 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 29: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,956 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 30: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,957 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 31: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,957 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 32: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,957 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 33: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,957 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 34: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,957 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 37: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,957 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 38: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,958 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 39: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,958 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 40: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,958 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 41: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,958 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 42: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,958 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 43: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,958 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 44: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,958 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' line 50: missing user-agent directive before this line
DEBUG 2015-10-22 12:19:46,959 CheckThread-https://www.haneu.de Parsed rules:
User-agent: Googlebot-Image
Allow: /
DEBUG 2015-10-22 12:19:46,959 CheckThread-https://www.haneu.de u'https://www.haneu.de/robots.txt' check allowance for: user agent: u'Mozilla/5.0 (compatible; LinkChecker/9.3; +http://wummel.github.io/linkchecker/)' url: u'https://www.haneu.de' ...
DEBUG 2015-10-22 12:19:46,959 CheckThread-https://www.haneu.de ... agent not found, allow.
DEBUG 2015-10-22 12:19:46,959 CheckThread-https://www.haneu.de Prepare request with {'method': 'GET', 'url': u'https://www.haneu.de', 'headers': {}}
DEBUG 2015-10-22 12:19:46,960 CheckThread-https://www.haneu.de Send request with {'allow_redirects': False, 'timeout': 60, 'verify': True, 'stream': True}
DEBUG 2015-10-22 12:19:47,007 CheckThread-https://www.haneu.de task_done https://www.haneu.de
****** Oops, I did it again. *****
You have found an internal error in LinkChecker. Please write a bug report at https://github.com/wummel/linkchecker/issues and include the following information:
When using the commandline client:
Not disclosing some of the information above due to privacy reasons is ok. I will try to help you nonetheless, but you have to give me something I can work with ;) .
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/linkcheck/director/checker.py", line 104, in check_url
line: self.check_url_data(url_data)
locals:
self = <Checker(CheckThread-https://www.haneu.de, started 140294285195008)>
self.check_url_data = <bound method Checker.check_url_data of <Checker(CheckThread-https://www.haneu.de, started 140294285195008)>>
url_data = <https link, base_url=u'https://www.haneu.de', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://www.haneu.de>
File "/usr/lib/python2.7/dist-packages/linkcheck/director/checker.py", line 120, in check_url_data
line: check_url(url_data, self.logger)
locals:
check_url = <function check_url at 0x7f98d0fc7b90>
url_data = <https link, base_url=u'https://www.haneu.de', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://www.haneu.de>
self = <Checker(CheckThread-https://www.haneu.de, started 140294285195008)>
self.logger = <linkcheck.director.logger.Logger object at 0x7f98d08dce90>
File "/usr/lib/python2.7/dist-packages/linkcheck/director/checker.py", line 52, in check_url
line: url_data.check()
locals:
url_data = <https link, base_url=u'https://www.haneu.de', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://www.haneu.de>
url_data.check = <bound method HttpUrl.check of <https link, base_url=u'https://www.haneu.de', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://www.haneu.de>>
File "/usr/lib/python2.7/dist-packages/linkcheck/checker/urlbase.py", line 424, in check
line: self.local_check()
locals:
self = <https link, base_url=u'https://www.haneu.de', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://www.haneu.de>
self.local_check = <bound method HttpUrl.local_check of <https link, base_url=u'https://www.haneu.de', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://www.haneu.de>>
File "/usr/lib/python2.7/dist-packages/linkcheck/checker/urlbase.py", line 442, in local_check
line: self.check_connection()
locals:
self = <https link, base_url=u'https://www.haneu.de', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://www.haneu.de>
self.check_connection = <bound method HttpUrl.check_connection of <https link, base_url=u'https://www.haneu.de', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://www.haneu.de>>
File "/usr/lib/python2.7/dist-packages/linkcheck/checker/httpurl.py", line 135, in check_connection
line: self.send_request(request)
locals:
self = <https link, base_url=u'https://www.haneu.de', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://www.haneu.de>
self.send_request = <bound method HttpUrl.send_request of <https link, base_url=u'https://www.haneu.de', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://www.haneu.de>>
request = <PreparedRequest [GET]>
File "/usr/lib/python2.7/dist-packages/linkcheck/checker/httpurl.py", line 165, in send_request
line: self._send_request(request, **kwargs)
locals:
self = <https link, base_url=u'https://www.haneu.de', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://www.haneu.de>
self._send_request = <bound method HttpUrl._send_request of <https link, base_url=u'https://www.haneu.de', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://www.haneu.de>>
request = <PreparedRequest [GET]>
kwargs = {'allow_redirects': False, 'timeout': 60, 'verify': True, 'stream': True}
File "/usr/lib/python2.7/dist-packages/linkcheck/checker/httpurl.py", line 172, in _send_request
line: self._add_ssl_info()
locals:
self = <https link, base_url=u'https://www.haneu.de', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://www.haneu.de>
self._add_ssl_info = <bound method HttpUrl._add_ssl_info of <https link, base_url=u'https://www.haneu.de', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://www.haneu.de>>
File "/usr/lib/python2.7/dist-packages/linkcheck/checker/httpurl.py", line 199, in _add_ssl_info
line: self.ssl_cert = httputil.x509_to_dict(cert)
locals:
self = <https link, base_url=u'https://www.haneu.de', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://www.haneu.de>
self.ssl_cert = None
httputil = <module 'linkcheck.httputil' from '/usr/lib/python2.7/dist-packages/linkcheck/httputil.pyc'>
httputil.x509_to_dict = <function x509_to_dict at 0x7f98d183ee60>
cert = <OpenSSL.crypto.X509 object at 0x7f98ce869950>
File "/usr/lib/python2.7/dist-packages/linkcheck/httputil.py", line 35, in x509_to_dict
line: from requests.packages.urllib3.contrib.pyopenssl import get_subj_alt_name
locals:
requests =
requests.packages =
requests.packages.urllib3 =
requests.packages.urllib3.contrib =
requests.packages.urllib3.contrib.pyopenssl =
get_subj_alt_name =
ImportError: cannot import name get_subj_alt_name
System info:
LinkChecker 9.3
Released on: 16.7.2014
Python 2.7.9 (default, Mar 1 2015, 12:57:24)
[GCC 4.9.2] on linux2
Requests: 2.4.3
Modules: Sqlite
Local time: 2015-10-22 12:19:47+002
sys.argv: ['/usr/bin/linkchecker', 'https://www.haneu.de', '-Dall']
LANG = 'en_US.UTF-8'
Default locale: ('en', 'UTF-8')
******** LinkChecker internal error, over and out ********
WARNING 2015-10-22 12:19:47,017 CheckThread-https://www.haneu.de internal error occurred
Statistics:
Downloaded: 0B.
No statistics available since no URLs were checked.
That's it. 0 links in 0 URLs checked. 0 warnings found. 0 errors found.
There was 1 internal error.
Stopped checking at 2015-10-22 12:19:47+002 (0.35 seconds)
root@j185599:~#
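For anyone landing here with the same traceback: the crash comes from the unconditional import of get_subj_alt_name in linkcheck/httputil.py, which fails with the installed Requests 2.4.3. Below is a minimal sketch of how such an optional import could be guarded so that a missing name does not become an internal error. It is an illustration only, not the change made in #656; load_optional and find_get_subj_alt_name are hypothetical names that do not exist in LinkChecker.

```python
import importlib


def load_optional(module_name, attr_name):
    """Return module_name.attr_name, or None if it cannot be found.

    Hypothetical helper for illustration; not part of LinkChecker.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module (or one of its own dependencies) is not installed.
        return None
    # The module exists, but the name may be absent in this version,
    # which is the "cannot import name" case from the traceback.
    return getattr(module, attr_name, None)


def find_get_subj_alt_name():
    """Look up the helper that the traceback fails to import."""
    return load_optional(
        "requests.packages.urllib3.contrib.pyopenssl",
        "get_subj_alt_name",
    )
```

A caller would then check the result for None and skip the subjectAltName details of the certificate instead of aborting the whole check.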