devinit / DIwebsite-redesign

New DI website 2019

Current broken link checker is broken #1125

Open SimonMurphyDI opened 2 years ago

SimonMurphyDI commented 2 years ago

I'm submitting a ...


[x] Bug report 
[ ] Regression (behaviour that used to work and stopped working in a new release)

Describe the Issue

Alex M has mentioned that the current link checker we use for the DI site is not functional. https://devinitorg.slack.com/archives/CCG8Y0YJG/p1644411796850849?thread_ts=1644337467.066159&cid=CCG8Y0YJG

To Reproduce

  1. Go to 'settings' in the CMS
  2. Click on 'Link checker'
  3. The recent scans find no broken links. Alex said we need an updated version of this.

A working link checker would be wonderful!

edwinmp commented 2 years ago

@davidebukali this is still breaking on live, so I'm moving it to the backlog for the next sprint - error below

Internal Server Error: /admin/link-checker/scan/

NotAllowed at /admin/link-checker/scan/
Connection.open: (530) NOT_ALLOWED - vhost myvhost not found

Request Method: GET
Request URL: https://devinit.org/admin/link-checker/scan/
Django Version: 3.2.13
Python Executable: /usr/bin/python3.7
Python Version: 3.7.3
Python Path: ['/code', '/code', '/code', '/usr/bin', '/usr/lib/python37.zip', '/usr/lib/python3.7', '/usr/lib/python3.7/lib-dynload', '/usr/lib/python3.7/site-packages']
Server time: Wed, 11 May 2022 11:31:35 +0000
Installed Applications:
['di_website.footnotes',
'di_website.home',
'di_website.users',
'di_website.search',
'di_website.ourteam',
'di_website.common',
'di_website.vacancies',
'di_website.blog',
'di_website.news',
'di_website.events',
'di_website.place',
'di_website.contactus',
'di_website.about',
'di_website.general',
'di_website.project',
'di_website.whatwedo',
'di_website.publications',
'di_website.downloads',
'di_website.workforus',
'di_website.datasection',
'di_website.api',
'di_website.spotlight',
'di_website.visualisation',
'di_website.dashboard',
'wagtail.contrib.forms',
'wagtail.contrib.redirects',
'wagtail.contrib.settings',
'wagtail.contrib.styleguide',
'wagtail.contrib.table_block',
'wagtail.contrib.search_promotions',
'wagtail.contrib.routable_page',
'wagtail.embeds',
'wagtail.sites',
'wagtail.users',
'wagtail.snippets',
'wagtail.documents',
'wagtail.images',
'wagtail.search',
'wagtail.admin',
'wagtail.core',
'wagtaillinkchecker',
'wagtailgeowidget',
'wagtailmedia',
'modelcluster',
'taggit',
'wagtailfontawesome',
'widget_tweaks',
'wagtailmetadata',
'django_google_optimize',
'django.contrib.sitemaps',
'django.contrib.admin',
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.messages',
'whitenoise.runserver_nostatic',
'django.contrib.staticfiles']
Installed Middleware:
['whitenoise.middleware.WhiteNoiseMiddleware',
'django.contrib.sessions.middleware.SessionMiddleware',
'django.middleware.common.CommonMiddleware',
'django.middleware.csrf.CsrfViewMiddleware',
'django.contrib.auth.middleware.AuthenticationMiddleware',
'django.contrib.messages.middleware.MessageMiddleware',
'django.middleware.clickjacking.XFrameOptionsMiddleware',
'django.middleware.security.SecurityMiddleware',
'wagtail.contrib.redirects.middleware.RedirectMiddleware',
'django_google_optimize.middleware.google_optimize',
'di_website.custom_middleware.NullInjectionMiddleware']

Traceback (most recent call last):
 File "/usr/lib/python3.7/site-packages/django/core/handlers/exception.py", line 47, in inner
   response = get_response(request)
 File "/usr/lib/python3.7/site-packages/django/core/handlers/base.py", line 181, in _get_response
   response = wrapped_callback(request, *callback_args, **callback_kwargs)
 File "/usr/lib/python3.7/site-packages/django/views/decorators/cache.py", line 44, in _wrapped_view_func
   response = view_func(request, *args, **kwargs)
 File "/usr/lib/python3.7/site-packages/wagtail/admin/urls/__init__.py", line 125, in wrapper
   return view_func(request, *args, **kwargs)
 File "/usr/lib/python3.7/site-packages/wagtail/admin/auth.py", line 174, in decorated_view
   response = view_func(request, *args, **kwargs)
 File "/usr/lib/python3.7/site-packages/wagtaillinkchecker/views.py", line 113, in run_scan
   celery_status = get_celery_worker_status()
 File "/usr/lib/python3.7/site-packages/wagtaillinkchecker/scanner.py", line 16, in get_celery_worker_status
   d = insp.stats()
 File "/usr/lib/python3.7/site-packages/celery/app/control.py", line 128, in stats
   return self._request('stats')
 File "/usr/lib/python3.7/site-packages/celery/app/control.py", line 106, in _request
   pattern=self.pattern, matcher=self.matcher,
 File "/usr/lib/python3.7/site-packages/celery/app/control.py", line 480, in broadcast
   limit, callback, channel=channel,
 File "/usr/lib/python3.7/site-packages/kombu/pidbox.py", line 333, in _broadcast
   chan = channel or self.connection.default_channel
 File "/usr/lib/python3.7/site-packages/kombu/connection.py", line 892, in default_channel
   self._ensure_connection(**conn_opts)
 File "/usr/lib/python3.7/site-packages/kombu/connection.py", line 445, in _ensure_connection
   callback, timeout=timeout
 File "/usr/lib/python3.7/site-packages/kombu/utils/functional.py", line 344, in retry_over_time
   return fun(*args, **kwargs)
 File "/usr/lib/python3.7/site-packages/kombu/connection.py", line 874, in _connection_factory
   self._connection = self._establish_connection()
 File "/usr/lib/python3.7/site-packages/kombu/connection.py", line 809, in _establish_connection
   conn = self.transport.establish_connection()
 File "/usr/lib/python3.7/site-packages/kombu/transport/pyamqp.py", line 130, in establish_connection
   conn.connect()
 File "/usr/lib/python3.7/site-packages/amqp/connection.py", line 320, in connect
   self.drain_events(timeout=self.connect_timeout)
 File "/usr/lib/python3.7/site-packages/amqp/connection.py", line 508, in drain_events
   while not self.blocking_read(timeout):
 File "/usr/lib/python3.7/site-packages/amqp/connection.py", line 514, in blocking_read
   return self.on_inbound_frame(frame)
 File "/usr/lib/python3.7/site-packages/amqp/method_framing.py", line 55, in on_frame
   callback(channel, method_sig, buf, None)
 File "/usr/lib/python3.7/site-packages/amqp/connection.py", line 521, in on_inbound_method
   method_sig, payload, content,
 File "/usr/lib/python3.7/site-packages/amqp/abstract_channel.py", line 145, in dispatch_method
   listener(*args)
 File "/usr/lib/python3.7/site-packages/amqp/connection.py", line 651, in _on_close
   (class_id, method_id), ConnectionError)

Exception Type: NotAllowed at /admin/link-checker/scan/
Exception Value: Connection.open: (530) NOT_ALLOWED - vhost myvhost not found
Request information:
USER: edwin.magezi@devinit.org

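
The exception value suggests that the broker rejected the virtual host named in the Celery broker URL: the URL appears to point at a vhost called `myvhost` that does not exist on the broker. A minimal sketch of how the vhost can be read out of such a URL is below. The function name `vhost_from_broker_url` and the example URLs are hypothetical, and the parsing rule (path without its leading slash, empty path meaning the default vhost `/`) is an assumption rather than the exact behaviour of the libraries in the traceback.

```python
from urllib.parse import unquote, urlsplit


def vhost_from_broker_url(broker_url: str) -> str:
    """Return the virtual host named in an AMQP broker URL.

    Assumption: the vhost is the URL path without its leading slash,
    percent-decoded, and an empty path means the default vhost "/".
    """
    path = urlsplit(broker_url).path
    vhost = unquote(path[1:]) if path.startswith("/") else unquote(path)
    return vhost or "/"


# Hypothetical broker URLs, for illustration only.
print(vhost_from_broker_url("amqp://user:secret@rabbitmq:5672/myvhost"))
print(vhost_from_broker_url("amqp://user:secret@rabbitmq:5672"))
```

Comparing the returned name with the vhosts that actually exist on the broker is one way to confirm whether the setting or the broker needs changing.
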
SimonMurphyDI commented 2 years ago

@davidebukali suggest we also look at adding this to the GNR site when it's ready

edwinmp commented 2 years ago

@SimonMurphyDI this seems to be fixed on both staging and live ... could you please verify that it functions as expected.

SimonMurphyDI commented 1 year ago

@edwinmp is the link checker on the site now functioning? Just tried to run it and found 0 links were crawled. I know there might not be budget to fix it, but if there's an external site we can use to check the DI site, that would be great


edwinmp commented 1 year ago

The link checker isn't being maintained and hasn't been updated in a long time. Since we're quite attached to it, I suggest we contribute to it and get it updated. Otherwise it'll be abandoned. The fact that no one has raised issues on it in a while tells me that not many people use it.

SimonMurphyDI commented 1 year ago

Thanks @edwinmp

Agreed: it's worth either doing this or finding an external programme that does the same job. I think we would use it regularly if we could. Happy to have a hunt for other software/sites, but didn't want to do that if you had plans to revamp the existing one.

Best, Simon
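
As a stopgap while the options above are weighed up, a single page can be checked from outside the CMS with a short script. The following is a rough sketch using only the Python standard library, not a replacement for the Wagtail link checker or a full crawler. All names here (`LinkCollector`, `extract_links`, `check_link`, `find_broken_links`) are hypothetical, and it assumes the target servers answer `HEAD` requests, which some do not.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urljoin, urlsplit
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect absolute http(s) URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        url = urljoin(self.base_url, href)
        # Skip mailto:, tel: and similar schemes, and skip duplicates.
        if urlsplit(url).scheme in ("http", "https") and url not in self.links:
            self.links.append(url)


def extract_links(html, base_url):
    """Return the unique http(s) links found in an HTML string."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def check_link(url, timeout=10):
    """Return the HTTP status code for url, or None if the request failed."""
    request = Request(url, method="HEAD", headers={"User-Agent": "link-check-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return None


def find_broken_links(page_url):
    """Fetch one page and return {url: status} for links that look broken."""
    with urlopen(page_url, timeout=10) as response:
        html = response.read().decode("utf-8", errors="replace")
    results = {url: check_link(url) for url in extract_links(html, page_url)}
    return {url: s for url, s in results.items() if s is None or s >= 400}
```

Calling `find_broken_links("https://devinit.org/")` would check the links on the home page only; a server that rejects `HEAD` may be reported as broken even when the link works in a browser.
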

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

SimonMurphyDI commented 1 year ago

Hi @edwinmp @akmiller01 It looks like the free version of this software provides a link checker. Are you OK if I download it? https://ahrefs.com/webmaster-tools

akmiller01 commented 1 year ago

Hi @SimonMurphyDI, as long as it's an external tool, and not a dependency we'd need to install in the stack of the website, it looks like a good service to me. They'll likely ask you to add a TXT record to the DNS for the website to verify ownership. Let me know over Slack if they ask for something like that and I can set it up for you.

edwinmp commented 1 year ago

> Hi @edwinmp @akmiller01 It looks like the free version of this software provides a link checker. Are you OK if I download it? https://ahrefs.com/webmaster-tools

@SimonMurphyDI how is this option working out for you? If it's good and you're keeping it, please feel free to close this ticket (though I'd still love to hear how the tool is working).