wummel / linkchecker

check links in web documents or full websites
http://wummel.github.io/linkchecker/
GNU General Public License v2.0

Opening bug report as requested #626

Open j75 opened 8 years ago

j75 commented 8 years ago

DEBUG 2015-11-26 10:28:47,763 MainThread Python 2.7.6 (default, Jun 22 2015, 17:58:13) [GCC 4.8.2] on linux2
DEBUG 2015-11-26 10:28:47,764 MainThread reading configuration from ['/home/mion/.linkchecker/linkcheckerrc']
INFO 2015-11-26 10:28:47,773 MainThread Checking intern URLs only; use --check-extern to check extern URLs.
DEBUG 2015-11-26 10:28:47,782 MainThread configuration: [('aborttimeout', 300), ('allowedschemes', []), ('authentication', []), ('blacklist', {}), ('checkextern', False), ('cookiefile', None), ('csv', {}), ('debugmemory', False), ('dot', {'encoding': 'utf-8'}), ('enabledplugins', []), ('externlinks', []), ('fileoutput', []), ('gml', {}), ('gxml', {'encoding': 'utf-8'}), ('html', {}), ('ignorewarnings', []), ('internlinks', []), ('localwebroot', None), ('logger', 'TextLogger'), ('loginextrafields', {}), ('loginpasswordfield', 'password'), ('loginurl', None), ('loginuserfield', 'login'), ('maxfilesizedownload', 5242880), ('maxfilesizeparse', 1048576), ('maxhttpredirects', 10), ('maxnumurls', None), ('maxrequestspersecond', 10), ('maxrunseconds', None), ('nntpserver', None), ('none', {}), ('output', 'text'), ('pluginfolders', []), ('proxy', {}), ('quiet', False), ('recursionlevel', -1), ('sitemap', {}), ('sql', {}), ('sslverify', True), ('status', True), ('status_wait_seconds', 5), ('text', {}), ('threads', 10), ('timeout', 60), ('trace', False), ('useragent', u'Mozilla/5.0 (compatible; LinkChecker/9.3; +http://wummel.github.io/linkchecker/)'), ('verbose', False), ('warnings', True), ('xml', {'encoding': 'utf-8'})]
DEBUG 2015-11-26 10:28:47,782 MainThread HttpUrl handles url https://cubie.famion.eu/%7Emarian/index.html
DEBUG 2015-11-26 10:28:47,783 MainThread checking syntax
DEBUG 2015-11-26 10:28:47,783 MainThread Add intern pattern u'^https?://(www.|)cubie.famion.eu\/\%7Emarian'
DEBUG 2015-11-26 10:28:47,783 MainThread Link pattern u'^https?://(www.|)cubie.famion.eu\/\%7Emarian' strict=False
DEBUG 2015-11-26 10:28:47,783 MainThread queueing https://cubie.famion.eu/~marian/index.html

LinkChecker 9.3 Copyright (C) 2000-2014 Bastian Kleineidam
LinkChecker comes with ABSOLUTELY NO WARRANTY!
This is free software, and you are welcome to redistribute it under certain conditions. Look at the file `LICENSE' within this distribution.
Get the newest version at http://wummel.github.io/linkchecker/
Write comments and bugs to https://github.com/wummel/linkchecker/issues
Support this project at http://wummel.github.io/linkchecker/donations.html

Start checking at 2015-11-26 10:28:47+002
DEBUG 2015-11-26 10:28:47,789 CheckThread-https://cubie.famion.eu/~marian/index.html Checking https link base_url=u'https://cubie.famion.eu/%7Emarian/index.html' parent_url=None base_ref=None recursion_level=0 url_connection=None line=0 column=0 page=0 name=u'' anchor=u'' cache_url=https://cubie.famion.eu/~marian/index.html
DEBUG 2015-11-26 10:28:47,790 CheckThread-https://cubie.famion.eu/~marian/index.html checking connection
DEBUG 2015-11-26 10:28:48,004 CheckThread-https://cubie.famion.eu/~marian/index.html u'https://cubie.famion.eu/robots.txt' parse lines
DEBUG 2015-11-26 10:28:48,004 CheckThread-https://cubie.famion.eu/~marian/index.html Parsed rules:
User-agent: W3C-checklink
Allow: /

User-agent: htdig
Disallow: /cgi-bin/

User-agent: Mozilla/4.0 (compatible: FDSE robot)
Disallow: /cgi-bin/

User-agent: wget4-xapian-omega
Disallow: /cgi-bin/

User-agent: Mozilla/5.0 (compatible; LinkChecker/9.3; +http://wummel.github.io/linkchecker/
Disallow: /cgi-bin/

User-agent: *
Disallow: /
DEBUG 2015-11-26 10:28:48,004 CheckThread-https://cubie.famion.eu/~marian/index.html u'https://cubie.famion.eu/robots.txt' check allowance for: user agent: u'Mozilla/5.0 (compatible; LinkChecker/9.3; +http://wummel.github.io/linkchecker/)' url: u'https://cubie.famion.eu/~marian/index.html' ...
DEBUG 2015-11-26 10:28:48,005 CheckThread-https://cubie.famion.eu/~marian/index.html /%7Emarian/index.html Disallow: /cgi-bin/ False
DEBUG 2015-11-26 10:28:48,005 CheckThread-https://cubie.famion.eu/~marian/index.html ... no rule lines of ['Mozilla/5.0 (compatible; LinkChecker/9.3; +http://wummel.github.io/linkchecker/'] applied to /%7Emarian/index.html; allowed.
DEBUG 2015-11-26 10:28:48,005 CheckThread-https://cubie.famion.eu/~marian/index.html Prepare request with {'headers': {}, 'method': 'GET', 'url': u'https://cubie.famion.eu/~marian/index.html'}
DEBUG 2015-11-26 10:28:48,006 CheckThread-https://cubie.famion.eu/~marian/index.html Send request with {'verify': True, 'stream': True, 'allow_redirects': False, 'timeout': 60}
DEBUG 2015-11-26 10:28:48,021 CheckThread-https://cubie.famion.eu/~marian/index.html task_done https://cubie.famion.eu/~marian/index.html

****** Oops, I did it again. *****

You have found an internal error in LinkChecker. Please write a bug report at https://github.com/wummel/linkchecker/issues and include the following information:

When using the commandline client:

Not disclosing some of the information above due to privacy reasons is ok. I will try to help you nonetheless, but you have to give me something I can work with ;) .

Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/linkcheck/director/checker.py", line 104, in check_url
  line: self.check_url_data(url_data)
  locals:
    self = <Checker(CheckThread-https://cubie.famion.eu/~marian/index.html, started 139954421192448)>
    self.check_url_data = <bound method Checker.check_url_data of <Checker(CheckThread-https://cubie.famion.eu/~marian/index.html, started 139954421192448)>>
    url_data = <https link, base_url=u'https://cubie.famion.eu/%7Emarian/index.html', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://cubie.famion.eu/~marian/index.html>
File "/usr/lib/python2.7/dist-packages/linkcheck/director/checker.py", line 120, in check_url_data
  line: check_url(url_data, self.logger)
  locals:
    check_url = <function check_url at 0x7f49b156a848>
    url_data = <https link, base_url=u'https://cubie.famion.eu/%7Emarian/index.html', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://cubie.famion.eu/~marian/index.html>
    self = <Checker(CheckThread-https://cubie.famion.eu/~marian/index.html, started 139954421192448)>
    self.logger = <linkcheck.director.logger.Logger object at 0x7f49ae64cc50>
File "/usr/lib/python2.7/dist-packages/linkcheck/director/checker.py", line 52, in check_url
  line: url_data.check()
  locals:
    url_data = <https link, base_url=u'https://cubie.famion.eu/%7Emarian/index.html', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://cubie.famion.eu/~marian/index.html>
    url_data.check = <bound method HttpUrl.check of <https link, base_url=u'https://cubie.famion.eu/%7Emarian/index.html', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://cubie.famion.eu/~marian/index.html>>
File "/usr/lib/python2.7/dist-packages/linkcheck/checker/urlbase.py", line 424, in check
  line: self.local_check()
  locals:
    self = <https link, base_url=u'https://cubie.famion.eu/%7Emarian/index.html', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://cubie.famion.eu/~marian/index.html>
    self.local_check = <bound method HttpUrl.local_check of <https link, base_url=u'https://cubie.famion.eu/%7Emarian/index.html', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://cubie.famion.eu/~marian/index.html>>
File "/usr/lib/python2.7/dist-packages/linkcheck/checker/urlbase.py", line 442, in local_check
  line: self.check_connection()
  locals:
    self = <https link, base_url=u'https://cubie.famion.eu/%7Emarian/index.html', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://cubie.famion.eu/~marian/index.html>
    self.check_connection = <bound method HttpUrl.check_connection of <https link, base_url=u'https://cubie.famion.eu/%7Emarian/index.html', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://cubie.famion.eu/~marian/index.html>>
File "/usr/lib/python2.7/dist-packages/linkcheck/checker/httpurl.py", line 135, in check_connection
  line: self.send_request(request)
  locals:
    self = <https link, base_url=u'https://cubie.famion.eu/%7Emarian/index.html', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://cubie.famion.eu/~marian/index.html>
    self.send_request = <bound method HttpUrl.send_request of <https link, base_url=u'https://cubie.famion.eu/%7Emarian/index.html', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://cubie.famion.eu/~marian/index.html>>
    request = <PreparedRequest [GET]>
File "/usr/lib/python2.7/dist-packages/linkcheck/checker/httpurl.py", line 165, in send_request
  line: self._send_request(request, **kwargs)
  locals:
    self = <https link, base_url=u'https://cubie.famion.eu/%7Emarian/index.html', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://cubie.famion.eu/~marian/index.html>
    self._send_request = <bound method HttpUrl._send_request of <https link, base_url=u'https://cubie.famion.eu/%7Emarian/index.html', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://cubie.famion.eu/~marian/index.html>>
    request = <PreparedRequest [GET]>
    kwargs = {'verify': True, 'timeout': 60, 'allow_redirects': False, 'stream': True}
File "/usr/lib/python2.7/dist-packages/linkcheck/checker/httpurl.py", line 172, in _send_request
  line: self._add_ssl_info()
  locals:
    self = <https link, base_url=u'https://cubie.famion.eu/%7Emarian/index.html', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://cubie.famion.eu/~marian/index.html>
    self._add_ssl_info = <bound method HttpUrl._add_ssl_info of <https link, base_url=u'https://cubie.famion.eu/%7Emarian/index.html', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://cubie.famion.eu/~marian/index.html>>
File "/usr/lib/python2.7/dist-packages/linkcheck/checker/httpurl.py", line 199, in _add_ssl_info
  line: self.ssl_cert = httputil.x509_to_dict(cert)
  locals:
    self = <https link, base_url=u'https://cubie.famion.eu/%7Emarian/index.html', parent_url=None, base_ref=None, recursion_level=0, url_connection=None, line=0, column=0, page=0, name=u'', anchor=u'', cache_url=https://cubie.famion.eu/~marian/index.html>
    self.ssl_cert = None
    httputil = <module 'linkcheck.httputil' from '/usr/lib/python2.7/dist-packages/linkcheck/httputil.pyc'>
    httputil.x509_to_dict = <function x509_to_dict at 0x7f49b1dc3b18>
    cert = <OpenSSL.crypto.X509 object at 0x7f49b4c5bbd0>
File "/usr/lib/python2.7/dist-packages/linkcheck/httputil.py", line 35, in x509_to_dict
  line: from requests.packages.urllib3.contrib.pyopenssl import get_subj_alt_name
  locals:
    requests =
    requests.packages =
    requests.packages.urllib3 =
    requests.packages.urllib3.contrib =
    requests.packages.urllib3.contrib.pyopenssl =
    get_subj_alt_name =
ImportError: No module named packages.urllib3.contrib.pyopenssl

System info:
LinkChecker 9.3
Released on: 16.7.2014
Python 2.7.6 (default, Jun 22 2015, 17:58:13) [GCC 4.8.2] on linux2
Requests: 2.2.1
Qt: 4.8.6 / PyQt: 4.10.4
Modules: QScintilla, Sqlite, Gconf
Local time: 2015-11-26 10:28:48+002
sys.argv: ['/usr/bin/linkchecker', '-Dall', 'https://cubie.famion.eu/%7Emarian/index.html']
LANGUAGE = 'en_GB.UTF-8'
LANG = 'fr_FR.UTF-8'
Default locale: ('fr', 'UTF-8')

** LinkChecker internal error, over and out **
WARNING 2015-11-26 10:28:48,055 CheckThread-https://cubie.famion.eu/~marian/index.html internal error occurred

Statistics:
Downloaded: 0B.
No statistics available since no URLs were checked.

That's it. 0 links in 0 URLs checked. 0 warnings found. 0 errors found.
There was 1 internal error.
Stopped checking at 2015-11-26 10:28:48+002 (0.28 seconds)
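
The traceback above ends in an ImportError for `requests.packages.urllib3.contrib.pyopenssl` on a system reporting Requests 2.2.1. As a rough illustration only (this is not LinkChecker's code, and not necessarily how the problem was later fixed), an optional import like this can be guarded so that a missing module degrades gracefully instead of raising an internal error. The helper names `first_importable` and `subject_alt_names`, and the second entry in `CANDIDATES`, are assumptions made for this sketch; only the first module path appears in the traceback.

```python
import importlib


def first_importable(candidates):
    """Return the first attribute importable from (module_path, attr_name) pairs.

    Returns None when no candidate can be imported.
    """
    for module_path, attr_name in candidates:
        try:
            module = importlib.import_module(module_path)
        except Exception:
            # Broad on purpose: optional dependencies can fail to import
            # in several ways (missing module, broken sub-dependency).
            continue
        attr = getattr(module, attr_name, None)
        if attr is not None:
            return attr
    return None


# Assumed search order; only the first path is taken from the traceback.
CANDIDATES = [
    ("requests.packages.urllib3.contrib.pyopenssl", "get_subj_alt_name"),
    ("urllib3.contrib.pyopenssl", "get_subj_alt_name"),
]

get_subj_alt_name = first_importable(CANDIDATES)


def subject_alt_names(cert, helper=get_subj_alt_name):
    """Return the subjectAltName entries of `cert`, or [] if no helper exists."""
    if helper is None:
        return []
    return list(helper(cert))
```

With a guard of this kind, a check on a system without the pyopenssl contrib module would simply report no subjectAltName entries rather than aborting the whole run.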

anarcat commented 8 years ago

this is fixed in #656

dpalic commented 7 years ago

Thank you for the issue report. Sadly, this project is dead, and a new team has taken over at https://github.com/linkcheck/linkchecker. For more details, please see #708. Also, please close this issue and report it afresh on the new repo, https://github.com/linkcheck/linkchecker/issues, if it still persists.