redhat-openstack / easyfix

Broken links on rdoproject.org #24

Open rbowen opened 6 years ago

rbowen commented 6 years ago

Duck reports the results of his webcheck here: https://github.com/redhat-openstack/website/issues/1074

Please fix the broken links. Note that many of these things may no longer be on the website, and the references should be removed, while others have been moved/renamed.

rbowen commented 6 years ago

https://github.com/redhat-openstack/website/pull/1112 begins to address this issue.

mary-grace commented 6 years ago

work continued with https://github.com/redhat-openstack/website/pull/1132

mary-grace commented 6 years ago

This URL is correct: http://www.lupaworld.com/article-223802-1.html. Perhaps the site was down when the report was run?

http://www.lupaworld.com/article-223802-1.html
[u'error reading HTTP response: timed out']
['https://www.rdoproject.org/blog/page=29/', 'https://www.rdoproject.org/blog/2013/05/rdo-in-the-news/']

mary-grace commented 6 years ago

I can't reproduce this issue:

https://www.rdoproject.org/blog/tag/.html
[u'403: Forbidden']
['https://www.rdoproject.org/blog/2016/02/rdo-blogs-week-of-february-22-2016/', 'https://www.rdoproject.org/blog/author/rleander/', 'https://www.rdoproject.org/blog/author/rbowen/', 'https://www.rdoproject.org/blog/page=14/', 'https://www.rdoproject.org/blog/2016/02/rdo-manager-is-now-tripleo/']

https://www.rdoproject.org/blog/tag/.html is a permission-restricted page that is generated automatically and linked from the footer of the blog post pages. Perhaps it was a timing issue?

mary-grace commented 6 years ago

@rbowen - what happened to the wiki? Many of these links point back to pages that used to live there, but I gather the wiki no longer exists. Do those pages still exist somewhere?

duck-rh commented 6 years ago

@mary-grace if you look at the first URL with the problem (https://www.rdoproject.org/blog/2016/02/rdo-blogs-week-of-february-22-2016/), you will find a relative URL in the list of tags: <li><a href="../../../tag/.html"></a></li>. So I guess the webchecker resolves the relative link into an absolute one before giving the results. In any case, it should resolve to https://www.rdoproject.org/blog/tag/ instead, which gives the full list of tags. This may be a bug in Middleman, though.

Also, I can't see the tags displayed, so there is probably a bug there, since tags are supposed to make it easy to browse blog posts on similar topics.
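
To illustrate the resolution step described above, here is a minimal sketch using Python's standard `urllib.parse.urljoin`. It only assumes that the webchecker resolves relative links the way a browser would; the page URL and the href are taken from the report, and nothing here is the webchecker's actual code.

```python
from urllib.parse import urljoin

# Page on which the bad link was found (taken from the report above).
page = "https://www.rdoproject.org/blog/2016/02/rdo-blogs-week-of-february-22-2016/"

# href as it appears in the page's list of tags (empty tag name).
generated_href = "../../../tag/.html"

# What the href would need to be to reach the full list of tags.
expected_href = "../../../tag/"

resolved_generated = urljoin(page, generated_href)
resolved_expected = urljoin(page, expected_href)

print(resolved_generated)  # the URL flagged in the report
print(resolved_expected)   # the tag index
```

The first result matches the URL flagged with a 403, which is consistent with the link being generated with an empty tag name rather than the checker mangling it.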

rbowen commented 6 years ago

All of the content from the original wiki was imported into Middleman 3 or 4 years ago, and the wiki was decommissioned. At the time, there was forwarding/redirection in place to preserve all of the old wiki URLs, but that has decayed over time, and a lot of that content has vanished as it became irrelevant from one version of the project to another.

There's a URL forwarding file at ... well, I can't find it. I'll keep looking.

However, at this point, we should probably eradicate any reference to the wiki and the old wiki URLs, as they just add complexity.

duck-rh commented 6 years ago

I think we removed these old redirections (many of which probably no longer pointed to anything that still existed) when we cleaned up the headers and other little things this year.

mary-grace commented 6 years ago

thanks for the help @duck-rh & @rbowen. I'll keep plugging away at these. The updated list definitely helps!

duck-rh commented 6 years ago

Tell me if you need another update.

We can experiment with other tools, possibly checking for other kinds of errors. It would be nice to automate the check and, why not, add it as a PR check (so errors are filtered out before they reach production). Regular checks would still be needed, as URLs might become invalid over time.
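
As a rough idea of what an automated check could look like, here is a minimal sketch in Python using only the standard library. The function names (`fetch_status`, `check_links`, `exit_code`) are hypothetical and this is not the tool used for the report above. It assumes the list of URLs has already been extracted from the built site, and it uses HEAD requests, which some servers reject.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return None if the URL loads, otherwise a short error string."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return None
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (404, 403, ...).
        return f"{exc.code}: {exc.reason}"
    except OSError as exc:
        # DNS failures, refused connections, timeouts, and so on.
        return f"error: {exc}"


def check_links(urls, fetch=fetch_status):
    """Map each failing URL to its error; `fetch` is injectable for testing."""
    broken = {}
    for url in urls:
        error = fetch(url)
        if error is not None:
            broken[url] = error
    return broken


def exit_code(broken):
    """Non-zero when anything is broken, so a PR check would fail."""
    return 1 if broken else 0
```

A CI job could call `check_links` on the extracted URLs and exit with `exit_code(...)`. Timeouts such as the lupaworld.com one above are probably worth retrying before they fail a PR, since they may just mean the remote site was down.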

mary-grace commented 6 years ago

@duck-rh can you look into this error for me?

https://www.rdoproject.org/blog/2015/11/openstack-fuzztest/images/blog/restfuzz_graph.jpg
[u'404: Not Found']/blog/2015/11/openstack-fuzztest
['https://www.rdoproject.org/']

The code in the blogpost is:

  See [this image](/images/blog/restfuzz_graph.jpg) for a dependency graph of the network API.

It works just fine on my local machine, but in production the current path gets prepended to my relative link. I can switch it to an absolute link, but I would like to figure out this bug as well.
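
For what it's worth, the URL in the report is exactly what you get when the leading slash is missing from the link. The sketch below uses Python's standard `urllib.parse.urljoin` to show the difference. Whether the production build actually strips the slash is an assumption, not something confirmed here, and the trailing slash on the page URL is assumed as well.

```python
from urllib.parse import urljoin

# Blog post page (trailing slash assumed).
page = "https://www.rdoproject.org/blog/2015/11/openstack-fuzztest/"

# Link as written in the blog post: root-relative (leading slash).
root_relative = urljoin(page, "/images/blog/restfuzz_graph.jpg")

# Same link without the leading slash: resolved against the current path.
path_relative = urljoin(page, "images/blog/restfuzz_graph.jpg")

print(root_relative)
print(path_relative)  # matches the 404 URL from the report
```

If the HTML generated in production contains the second form, that would point at the build configuration rather than at the blog post source.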

mary-grace commented 6 years ago

fixed 2 more links: https://github.com/redhat-openstack/website/commit/584748e97ee53b4360ee74afcc37b00793e43707

duck-rh commented 6 years ago

@mary-grace since the blog has moved to WP, I think you should ignore all the related bad links; we'll do another check run later to catch WP problems (and in a new BR, because that's really totally different tech now).

mary-grace commented 6 years ago

@duck-rh totally agreed. All of the blog links were pushed through on 1/16 with misc's help, so I'm hopeful that all of those changes applied to the WP dump (fingers crossed). Although now that I think about that, I think Jason might have done that export before the changes were pushed 😣 In any case, the only links I'm working on now deal with the /documentation section of the site, which I'm reorganizing with Petr & Rich's help. That should be done by early next week, at which point I'd like to run another report to see what's still missing. Thanks so much for all of your help!