anka-213 / webcomic_reader

Webcomic Reader userscript at
https://openuserjs.org/scripts/anka-213/Webcomic_Reader
MIT License

Heads up: Cleaning up dead sites #138

Closed SoraHjort closed 4 years ago

SoraHjort commented 4 years ago

Just wanted to put this out to let you know that I'm going through the script to clean up dead and redundant sites. I'm currently halfway through, checking roughly 100 per work day. Got to entry #400, with ~110 dead sites.

I'm also marking down broken and working sites, as well as those that have new domain names, are redirecting, have merged, need special notes, and so on. I'll have a full rundown when completed.

I wanted to put this out there so that if in the next two or so weeks you see a pull from me with a massive decrease in character count, this is what's going on. I'll have the full list available for anyone who wants to double-check it. Fair warning, though: a lot of the dead domains have been squatted on by malicious sites.

Had one try to lock up my browser while yelling "YOUR COMPUTER IS INFECTED". ¬_¬ *sigh*

I'll probably also fix a few sites in the process, but not that many, as there are at least 130 broken ones counted so far. That means roughly 90 of those checked so far are working with the script.

So yeah, massive character count decrease is incoming.

SoraHjort commented 4 years ago

Ok, here is the list of sites that are dead, broken, and live (as well as other notes). It is categorized, and (mostly) alphabetized. https://github.com/SoraHjort/The-note-you-are-trying-to-make-is-too-big-for-pastebin-repo/blob/master/webcomic%20reader%20clean%20up%20list.txt

199 sites will be culled, most of which are dead. The others are dead subdomains, redirects to other domains, or comics that moved to new hosting on sites like comicgenesis, or they simply don't host archives anymore.

Rough Rundown:
Total Sites: 614
Dead: 176
Broken: 221
Working: 153
Moved: 38
No longer hosting: 4
Other: 22 (see various notes)
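For anyone double-checking the rundown, here is a minimal sketch that tallies the categories and works out each one's share of the total. The counts are copied from the rundown above; the `tally` helper and the variable names are made up for illustration and are not part of the userscript.

```python
# Counts copied from the rundown above.
counts = {
    "Dead": 176,
    "Broken": 221,
    "Working": 153,
    "Moved": 38,
    "No longer hosting": 4,
    "Other": 22,
}
REPORTED_TOTAL = 614  # "Total Sites" as reported in the rundown


def tally(counts):
    """Return the summed total and each category's share in percent."""
    total = sum(counts.values())
    shares = {name: round(100 * n / total, 1) for name, n in counts.items()}
    return total, shares


total, shares = tally(counts)
print(f"total: {total} (reported: {REPORTED_TOTAL})")
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The categories add up to the reported total, so nothing was double counted or left out of the rundown.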

Now that the list is done, next comes the actual clean up.

SoraHjort commented 4 years ago

Alright, that should take care of that for now, outside of some checks set for a month or so down the line (sites supposedly going through redesigns, or that recently went down). There are also some in the special notes section and whatnot.

My thoughts are... well... so much of what was cut out was manga sites. They were a large portion (not the majority) of the dead sites, but they may have been the majority of the code that was pruned, due to the various methods needed to run the detection.

Honestly, they were such a large part that if I were asked to add a new one, I'd decline, since who knows whether they'll be around in a year's time. Granted, you could say the same for any new comic that crops up. Lord knows over the years I've seen countless comics start up on Keenspace (and Comicgenesis) only to die within pages.

But with Manga sites, many are in a grey area when it comes to legality. This means they can go down due to C&Ds and what not. So my recommendation there is when someone asks to add in a new such site, only consider it if it is either:

A. Several years old.
B. At least a year old, and also not legally grey, such as sites that sell issues or are subscription-based.
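As a rough sketch, the two criteria above could be written down as a simple check. Everything here is hypothetical: `SiteRequest`, `worth_considering`, and especially the 3-year cutoff standing in for "several years" are assumptions for illustration, not anything the script or the maintainers define.

```python
from dataclasses import dataclass

# Assumption: "several years" is read as 3 or more. Adjust to taste.
SEVERAL_YEARS = 3


@dataclass
class SiteRequest:
    """Hypothetical description of a site someone asks to have added."""
    name: str
    age_years: float
    legally_grey: bool


def worth_considering(req, several_years=SEVERAL_YEARS):
    """Apply criteria A and B from the recommendation above."""
    # A. The site is several years old.
    if req.age_years >= several_years:
        return True
    # B. The site is at least a year old and not legally grey.
    return req.age_years >= 1 and not req.legally_grey


requests = [
    SiteRequest("old-scanlation-site", age_years=5, legally_grey=True),
    SiteRequest("new-scanlation-site", age_years=1.5, legally_grey=True),
    SiteRequest("subscription-site", age_years=1.5, legally_grey=False),
]
for req in requests:
    print(req.name, worth_considering(req))
```

The site names are placeholders. The point is only that a young, legally grey site fails both checks, while age or clear legal footing can each get a site past the bar.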

So yeah, that should do for the clean up for now. Nearly 2200 deleted lines. Nearly 80KB. I'll worry about fixing some of the sites at a later date.