Simpsonpt / AppSecEzine

AppSec Ezine Public Repository.
1.09k stars, 96 forks

Fixed dead links #2

Closed (Balhau closed 2 years ago)

Balhau commented 3 years ago

~One dead link. Should we remove it?~

Used the Wayback Machine to recover the links.

m1el commented 3 years ago

Is there a way to crawl all the links and check if they're dead?

Balhau commented 3 years ago

> Is there a way to crawl all the links and check if they're dead?

That's a nice idea, though I'm not sure it is trivial. We get different types of errors: DNS errors, and cases where the server responds but says the content is no longer available. That last type is trickier to automate, since the response varies a lot from link to link. Some respond with an HTTP 404; others simply redirect the page to other content. So it's a bit tricky, in my opinion, but the idea is good anyway. Maybe we can think of something.
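To make the error types above concrete, here is a minimal sketch of how a checker might sort them. It is not something used in this repo: the label names (`dns-error`, `dead`, `redirected`, and so on) and the helper names are hypothetical, and it assumes plain `http(s)` links checked with the Python standard library. The redirect case is only flagged, not decided, since a redirect may or may not mean the content is gone.

```python
import socket
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def _moved(original_url, final_url):
    """True if host or path changed (scheme and trailing slash are ignored)."""
    a, b = urlsplit(original_url), urlsplit(final_url)
    return (a.netloc.lower(), a.path.rstrip("/")) != (b.netloc.lower(), b.path.rstrip("/"))


def classify(original_url, status=None, final_url=None, error=None):
    """Map the outcome of one request to a coarse, hypothetical label."""
    if error == "dns":
        return "dns-error"
    if error is not None:
        return "unreachable"
    if status in (404, 410):
        return "dead"  # the easy win
    if status is not None and status >= 400:
        return "http-error"
    if final_url and _moved(original_url, final_url):
        return "redirected"  # needs a human look
    return "ok"


def check_link(url, timeout=10):
    """Fetch one URL and classify the result; performs a real network request."""
    request = urllib.request.Request(url, headers={"User-Agent": "link-check-sketch"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify(url, status=response.status, final_url=response.geturl())
    except urllib.error.HTTPError as exc:  # must come before URLError (its parent class)
        return classify(url, status=exc.code)
    except urllib.error.URLError as exc:
        kind = "dns" if isinstance(exc.reason, socket.gaierror) else "other"
        return classify(url, error=kind)
    except OSError:  # timeouts, connection resets
        return classify(url, error="other")
```

Keeping `classify` separate from the network call means the rules can be tuned and tested without hitting any server. Pages that answer 200 with a "content not available" message would still come back as `ok`, which is exactly the hard case described above.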

Simpsonpt commented 3 years ago

Hey, first of all, thank you for trying to improve the project! :)

I have some notes on fixing dead links. It's true that some links already point to the Wayback Machine or similar, but that is because they were no longer available at the time the issue was released.

As pointed out in the last comment, the variety of ways a dead link can show up makes automation a little tricky, although there are some easy wins like status 404 and similar. That said, I think the repo should keep the original/released link, even if it is broken nowadays, rather than the cached version.*

Why? It will preserve what was released and make it easy to search for the content in Wayback Machine, Google Cache (cache: dork), or any other service with the same goal if needed.

This improvement should happen in side projects, like the 5th-year celebration "book" or the bot that feeds the 0XOPOSEC Twitter account, where we can parse the link from an issue, check whether it is still valid and, if not, pick a "restore" strategy.
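As a rough illustration of what a "restore" strategy could look like in such a side project, the sketch below asks the Wayback Machine availability endpoint (`https://archive.org/wayback/available`) for the snapshot closest to a given date. The function names, the returned dictionary shape, and the `manual-search` fallback label are all hypothetical, and the `timestamp` format (digits such as `YYYYMMDD`) is an assumption to verify against the API documentation. The original link is always kept next to the restored one, in line with the point about preserving what was released.

```python
import json
import urllib.request
from urllib.parse import urlencode

WAYBACK_API = "https://archive.org/wayback/available"


def availability_query(url, timestamp=None):
    """Build the availability query; timestamp could be the issue's release date."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_API + "?" + urlencode(params)


def closest_snapshot(payload):
    """Return the closest archived URL from a decoded API response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def fetch_json(query_url, timeout=10):
    """Perform the real HTTP request and decode the JSON body."""
    with urllib.request.urlopen(query_url, timeout=timeout) as response:
        return json.load(response)


def restore(url, timestamp=None, fetch=fetch_json):
    """Pick a restore strategy for a dead link, never discarding the original."""
    snapshot = closest_snapshot(fetch(availability_query(url, timestamp)))
    if snapshot:
        return {"original": url, "restored": snapshot, "strategy": "wayback"}
    return {"original": url, "restored": None, "strategy": "manual-search"}
```

The `fetch` parameter is injectable so the strategy logic can be exercised with canned responses; a bot or "book" generator would call `restore(link, timestamp=release_date)` only for links that a checker already flagged as dead.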

*As an exception, I update old links when I notice that a URL has changed but the content is still there, only at a different address (//URL/content -> //URL/new/content), or when a link disappears during the week of the issue and I still want to share it.