sanskrit-lexicon / COLOGNE

Development of http://www.sanskrit-lexicon.uni-koeln.de/

Orphan Pages (pwindex.html & pwgindex.html) #282

Open gasyoun opened 5 years ago

gasyoun commented 5 years ago

While reading https://de.wikipedia.org/wiki/Otto_von_B%C3%B6htlingk I noticed that at the very bottom it links to https://www.sanskrit-lexicon.uni-koeln.de/pwgindex.html, and that even the logo there does not link to the homepage. I could update the Wikipedia links to https://www.sanskrit-lexicon.uni-koeln.de/scans/PWGScan/2013/web/index.php, or maybe it's time to kill such orphan pages? I've seen many people using old versions of the Cologne dictionaries for no good reason.

drdhaval2785 commented 3 years ago

@gasyoun How can we find out how many such dead pages exist?

gasyoun commented 3 years ago

> how many such dead pages exist?

Technically it's a broken link. There is no easy way to find what's broken on Wikipedia; there are too many links. In this case I noticed it manually. But there is software I use, like Xenu, that can help with parts of this rather big task. Out of the 29616 URLs at Cologne, 1381 URLs (4.45%) were not found. Jim, please take a look at https://github.com/sanskrit-lexicon/CORRECTIONS/blob/master/Broken%20link%20report%202020.htm

Samples:

https://www.alanwood.net/unicode/fonts.html
error code: 6 (invalid file handle), linked from page(s):
    https://www.sanskrit-lexicon.uni-koeln.de/scans/PWScan/disp2/help.html
https://www.sanskrit-lexicon.uni-koeln.de/scans/MWScan/MWScan/mw0152-ArtveyI.
error code: 404 (not found), linked from page(s):
    https://www.sanskrit-lexicon.uni-koeln.de/cgi-bin/serveimg.pl?file=/scans/MWScan/MWScan/mw0152-ArtveyI.
http://sphinx-doc.org/
redirected to: http://www.sphinx-doc.org/
status code: 302 (object temporarily moved)
linked from page(s):
    https://www.sanskrit-lexicon.uni-koeln.de/scans/csldev/csldoc/build/contrib.html
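
For a quick spot check without Xenu, the status codes in the samples above (404, 302) could be reproduced with a few lines of Python. This is only a rough sketch, not how the report was generated: the names `fetch_status` and `check_links` are made up for illustration, it assumes the server answers HEAD requests, and it uses only the standard library.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# (status code, redirect target or None)
Result = Tuple[int, Optional[str]]


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so 301/302 stay visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch_status(url: str, timeout: float = 10.0) -> Result:
    """Send one HEAD request and report the status without following redirects."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status, None
    except urllib.error.HTTPError as err:
        # Unfollowed 3xx responses and all 4xx/5xx responses arrive here.
        return err.code, err.headers.get("Location")
    except OSError:
        # 0 means no HTTP response at all (DNS failure, timeout, refused connection).
        return 0, None


def check_links(
    urls: Iterable[str],
    fetch: Callable[[str], Result] = fetch_status,
) -> Dict[str, List[Tuple[str, int, Optional[str]]]]:
    """Sort URLs into ok / redirect / broken buckets by status code."""
    report: Dict[str, List[Tuple[str, int, Optional[str]]]] = {
        "ok": [],
        "redirect": [],
        "broken": [],
    }
    for url in urls:
        code, target = fetch(url)
        if 200 <= code < 300:
            report["ok"].append((url, code, None))
        elif 300 <= code < 400:
            report["redirect"].append((url, code, target))
        else:
            report["broken"].append((url, code, None))
    return report
```

Calling `check_links(["http://sphinx-doc.org/"])` would hit the network, so the result depends on the server at the time of the run. The `fetch` parameter is there so the grouping logic can be tried with canned responses first.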

Nice time to kill the www

http://www.sanskrit-lexicon.uni-koeln.de/CDSL.pdf
redirected to: https://www.sanskrit-lexicon.uni-koeln.de/CDSL.pdf
status code: 301 (object permanently moved)
linked from page(s):
    https://www.sanskrit-lexicon.uni-koeln.de/scans/csldev/csldoc/build/intro.html
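
One possible way to act on entries like this one: for 301 (permanent) redirects, the `href` in the linking page can simply be replaced by the redirect target, while 302 (temporary) redirects are left alone. The sketch below assumes that policy, which nobody in this thread has agreed on yet. `Redirect` and `rewrite_links` are hypothetical names, and the regex only handles quoted `href` attributes.

```python
import re
from typing import Dict, Iterable, NamedTuple


class Redirect(NamedTuple):
    """One 'redirected to' entry from a link report (hypothetical structure)."""

    source: str
    target: str
    status: int


# Matches href="..." or href='...' and captures the prefix, the quote, and the URL.
_HREF = re.compile(r"""(href\s*=\s*)(["'])(.*?)\2""", re.IGNORECASE)


def permanent_map(redirects: Iterable[Redirect]) -> Dict[str, str]:
    """Keep only 301 redirects; temporary ones (302) may change again."""
    return {r.source: r.target for r in redirects if r.status == 301}


def rewrite_links(html: str, redirects: Iterable[Redirect]) -> str:
    """Replace href values that exactly match a permanently moved URL."""
    mapping = permanent_map(redirects)

    def swap(match: "re.Match[str]") -> str:
        prefix, quote, url = match.group(1), match.group(2), match.group(3)
        return f"{prefix}{quote}{mapping.get(url, url)}{quote}"

    return _HREF.sub(swap, html)


# The two redirect samples from the report excerpt in this thread.
SAMPLES = [
    Redirect(
        "http://www.sanskrit-lexicon.uni-koeln.de/CDSL.pdf",
        "https://www.sanskrit-lexicon.uni-koeln.de/CDSL.pdf",
        301,
    ),
    Redirect("http://sphinx-doc.org/", "http://www.sphinx-doc.org/", 302),
]
```

With `SAMPLES`, a link to the `http://` CDSL.pdf address would be upgraded to `https://`, and the sphinx-doc.org link would stay as it is because its redirect is only temporary.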