Closed kevinreiss closed 11 years ago
A number of bad links have gotten into the resource description fields in the databases system. The GSA crawler is following them and generating 404s:
128.112.132.117, 128.112.132.117 - - [17/Jul/2013:13:45:55 -0400] "GET /services/depository-access HTTP/1.1" 404 - "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117, 128.112.132.117 - - [17/Jul/2013:13:45:55 -0400] "GET /resource/title/%5Chttp://www.ntis.gov/pdf/dbguid.pdf%5C HTTP/1.1" 404 6799 "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117, 128.112.132.117 - - [17/Jul/2013:13:45:56 -0400] "GET /databases/subject/%5Chttp://quod.lib.umich.edu/u/umhistmath/%5C HTTP/1.1" 404 6798 "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117, 128.112.132.117 - - [17/Jul/2013:13:45:56 -0400] "GET /resource/%5Chttp://www.designinform.co.uk/database%5C HTTP/1.1" 404 6797 "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117, 128.112.132.117 - - [17/Jul/2013:13:45:58 -0400] "GET /services/reserves/orrs HTTP/1.1" 404 - "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117, 128.112.132.117 - - [17/Jul/2013:13:45:58 -0400] "GET /resource/%5Cmailto:sbrooke@princeton.edu%5C HTTP/1.1" 404 6799 "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117, 128.112.132.117 - - [17/Jul/2013:13:45:59 -0400] "GET /resource/%5Cmailto:thines@princeton.edu%5C HTTP/1.1" 404 6799 "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117, 128.112.132.117 - - [17/Jul/2013:13:45:59 -0400] "GET /resource/%5Chttp://quod.lib.umich.edu/u/umhistmath/%5C HTTP/1.1" 404 6798 "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117 - - [17/Jul/2013:13:46:00 -0400] "GET /user/404/contact HTTP/1.0" 200 7530 "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117, 128.112.132.117 - - [17/Jul/2013:13:46:01 -0400] "GET /resource/title/%5Cmailto:sbrooke@princeton.edu%5C HTTP/1.1" 404 6797 "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117, 128.112.132.117 - - [17/Jul/2013:13:46:02 -0400] "GET /databases/subject/%5Chttp://ebooks.library.cornell.edu/cgi/t/text/text-idx?page=simple&c=math%5C HTTP/1.1" 404 6797 "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117, 128.112.132.117 - - [17/Jul/2013:13:46:03 -0400] "GET /resource/%5Chttp://epp.eurostat.ec.europa.eu/portal/page/portal/region_cities/introduction%5C HTTP/1.1" 404 6798 "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117, 128.112.132.117 - - [17/Jul/2013:13:46:04 -0400] "GET /%23 HTTP/1.1" 404 6797 "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117, 128.112.132.117 - - [17/Jul/2013:13:46:04 -0400] "GET /resource/%5Chttps://openknowledge.worldbank.org/%5C HTTP/1.1" 404 6797 "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117, 128.112.132.117 - - [17/Jul/2013:13:46:05 -0400] "GET /databases/subject/%5Cmailto:thines@princeton.edu%5C HTTP/1.1" 404 6798 "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117, 128.112.132.117 - - [17/Jul/2013:13:46:06 -0400] "GET /services/reserves/loan-periods-fines HTTP/1.1" 404 - "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117, 128.112.132.117 - - [17/Jul/2013:13:46:08 -0400] "GET /resource/%5Chttp://ebooks.library.cornell.edu/cgi/t/text/text-idx?page=simple&c=math%5C HTTP/1.1" 404 6799 "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117, 128.112.132.117 - - [17/Jul/2013:13:52:01 -0400] "GET /resource/%5Chttp://www.ntis.gov/pdf/dbguid.pdf%5C HTTP/1.1" 404 6799 "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117, 128.112.132.117 - - [17/Jul/2013:13:52:48 -0400] "GET /resource/%5Chttp://www.chinamaxx.net%5C HTTP/1.1" 404 6797 "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
128.112.132.117 - - [17/Jul/2013:13:53:22 -0400] "GET /user/999/contact HTTP/1.0" 404 6798 "-" "gsa-crawler (Enterprise; T3-QBQ9FY7X3YSB2; google@princeton.edu)"
Fix
Bad link data had been imported. I have been cleaning this up daily based on the 404 logs and staff reports. I believe it has all been resolved.
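For anyone repeating this kind of cleanup, here is a rough sketch of how the 404 log could be scanned for the affected links. It is not the script that was actually used. It assumes the log is in the Apache combined format shown in this issue and that every bad link shows up as a path ending in a link wrapped in `%5C` (an encoded backslash). The function name `bad_links_from_404s` and the regular expressions are made up for illustration.

```python
import re
from urllib.parse import unquote

# Matches the request and status fields of an access-log entry,
# e.g. "GET /resource/%5C...%5C HTTP/1.1" 404
REQUEST_RE = re.compile(r'"GET (?P<path>\S+) HTTP/1\.[01]" (?P<status>\d{3}) ')

# After decoding, a bad link looks like \<link>\ at the end of the path.
WRAPPED_RE = re.compile(r'\\(.+)\\$')


def bad_links_from_404s(log_lines):
    """Return the sorted, de-duplicated links embedded in 404 request paths.

    Hypothetical helper: only 404 entries whose decoded path ends with a
    backslash-wrapped link are reported; other 404s are ignored.
    """
    found = set()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match is None or match.group("status") != "404":
            continue
        path = unquote(match.group("path"))  # %5C becomes a backslash
        wrapped = WRAPPED_RE.search(path)
        if wrapped:
            found.add(wrapped.group(1))
    return sorted(found)


if __name__ == "__main__":
    import sys

    # Usage (assumed): python find_bad_links.py < access.log
    for link in bad_links_from_404s(sys.stdin):
        print(link)
```

The resulting list can then be matched against the resource description fields to find the records that need fixing. Plain 404s such as `/services/depository-access` are skipped on purpose, since they are missing pages rather than bad imported links.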