benwbrum / fromthepage

FromThePage is a wiki-like application for crowdsourcing transcription of handwritten documents.
http://fromthepage.com
GNU Affero General Public License v3.0

Invalid integer (bad URL) in pagination link (Logs P3) #2067

Open benwbrum opened 4 years ago

benwbrum commented 4 years ago

Logs:


production.log.1-I, [2020-09-17T06:25:25.683066 #7929]  INFO -- : Started GET "/stanforduniversityarchives/jls/condolence-letters-re-death-of-leland-stanford-e-f-includes-adeline-m-easton-margaret-edes-george-f-edmunds-morris-m-estee-w-w-evans-leila-johnson-ewing-w-w-faris-stephen-j-and-sue-v-field-and-george-frere-flint?page=9/" for 85.31.186.210 at 2020-09-17 06:25:25 +0000
production.log.1-I, [2020-09-17T06:25:25.685180 #7929]  INFO -- : Processing by DisplayController#read_work as */*
production.log.1-I, [2020-09-17T06:25:25.685469 #7929]  INFO -- :   Parameters: {"page"=>"9/", "user_slug"=>"stanforduniversityarchives", "collection_id"=>"jls", "work_id"=>"condolence-letters-re-death-of-leland-stanford-e-f-includes-adeline-m-easton-margaret-edes-george-f-edmunds-morris-m-estee-w-w-evans-leila-johnson-ewing-w-w-faris-stephen-j-and-sue-v-field-and-george-frere-flint"}
production.log.1:I, [2020-09-17T06:25:25.716720 #7929]  INFO -- : Completed 500  in 31ms (ActiveRecord: 8.2ms | Allocations: 8455)
production.log.1-F, [2020-09-17T06:25:25.724484 #7929] FATAL -- :   
production.log.1-ArgumentError (invalid value for Integer(): "9/"):
production.log.1-  
production.log.1-app/controllers/display_controller.rb:43:in `read_work'

benwbrum commented 4 years ago

Another:


production.log.1-I, [2020-09-17T06:25:27.136440 #7929]  INFO -- : Started GET "/yaquinalights/1871-1900-yaquina-head-lighthouse-letter-books/1904-keeper-logs?page=2/" for 85.31.186.210 at 2020-09-17 06:25:27 +0000
production.log.1-I, [2020-09-17T06:25:27.139381 #7929]  INFO -- : Processing by DisplayController#read_work as */*
production.log.1-I, [2020-09-17T06:25:27.139701 #7929]  INFO -- :   Parameters: {"page"=>"2/", "user_slug"=>"yaquinalights", "collection_id"=>"1871-1900-yaquina-head-lighthouse-letter-books", "work_id"=>"1904-keeper-logs"}
production.log.1:I, [2020-09-17T06:25:27.171589 #7929]  INFO -- : Completed 500  in 31ms (ActiveRecord: 6.1ms | Allocations: 8443)
production.log.1-F, [2020-09-17T06:25:27.179263 #7929] FATAL -- :   
production.log.1-ArgumentError (invalid value for Integer(): "2/"):
production.log.1-  
production.log.1-app/controllers/display_controller.rb:43:in `read_work'
production.log.1-app/controllers/application_controller.rb:29:in `switch_locale'

benwbrum commented 4 years ago

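This may be a crawler: both requests above came from 85.31.186.210, two seconds apart, each with a trailing slash after the page number.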

benwbrum commented 4 years ago

More evidence:


production.log.1-I, [2020-09-17T06:25:58.828707 #7929]  INFO -- : Started GET "/yaquinalights/1871-1900-yaquina-head-lighthouse-letter-books/1900-01-keeper-logs?page=5/" for 85.31.186.210 at 2020-09-17 06:25:58 +0000
production.log.1-I, [2020-09-17T06:25:58.830697 #7929]  INFO -- : Processing by DisplayController#read_work as */*
production.log.1-I, [2020-09-17T06:25:58.830959 #7929]  INFO -- :   Parameters: {"page"=>"5/", "user_slug"=>"yaquinalights", "collection_id"=>"1871-1900-yaquina-head-lighthouse-letter-books", "work_id"=>"1900-01-keeper-logs"}
production.log.1:I, [2020-09-17T06:25:58.861339 #7929]  INFO -- : Completed 500  in 30ms (ActiveRecord: 6.0ms | Allocations: 8459)
production.log.1-F, [2020-09-17T06:25:58.869404 #7929] FATAL -- :   
production.log.1-ArgumentError (invalid value for Integer(): "5/"):
production.log.1-  
production.log.1-app/controllers/display_controller.rb:43:in `read_work'
production.log.1-app/controllers/application_controller.rb:29:in `switch_locale'
production.log.1-I, [2020-09-17T06:26:00.321026 #7929]  INFO -- : Started GET "/lva/wwi-va-questionnaires?page=5/" for 85.31.186.210 at 2020-09-17 06:26:00 +0000
production.log.1-I, [2020-09-17T06:26:00.323365 #7929]  INFO -- : Processing by CollectionController#show as */*
production.log.1-I, [2020-09-17T06:26:00.323682 #7929]  INFO -- :   Parameters: {"page"=>"5/", "user_slug"=>"lva", "id"=>"wwi-va-questionnaires"}
production.log.1:I, [2020-09-17T06:26:00.345508 #7929]  INFO -- : Completed 500  in 21ms (ActiveRecord: 4.6ms | Allocations: 5620)
production.log.1-F, [2020-09-17T06:26:00.352351 #7929] FATAL -- :   
production.log.1-ArgumentError (invalid value for Integer(): "5/"):
production.log.1-  
production.log.1-app/controllers/collection_controller.rb:83:in `show'
production.log.1-app/controllers/application_controller.rb:29:in `switch_locale'

benwbrum commented 4 years ago

User agent does not seem to be a crawler:

/var/log/apache2/other_vhosts_access.log.6.gz:www.fromthepage.com:443 85.31.186.210 - - [12/Sep/2020:14:58:45 +0000] "GET /khs/civil-war-governors-of-kentucky?page=3/ HTTP/1.1" 404 1698 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:72.0) Gecko/20100101 Firefox/72.0"
/var/log/apache2/other_vhosts_access.log.6.gz:www.fromthepage.com:443 85.31.186.210 - - [12/Sep/2020:14:59:11 +0000] "GET /digitalindy/may-wright-sewall-papers?page=2/ HTTP/1.1" 404 1698 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:72.0) Gecko/20100101 Firefox/72.0"
/var/log/apache2/other_vhosts_access.log.6.gz:www.fromthepage.com:443 85.31.186.210 - - [12/Sep/2020:14:59:19 +0000] "GET /digitalindy/may-wright-sewall-papers?page=8/ HTTP/1.1" 404 1698 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:72.0) Gecko/20100101 Firefox/72.0"
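
One possible mitigation, sketched under assumptions: the traces above show `ArgumentError (invalid value for Integer(): "9/")` raised from `read_work` and `show`, which suggests the raw `page` parameter reaches an integer conversion unchecked. The helper below is hypothetical (the name `sanitize_page_param` and its `default:` keyword are not from the FromThePage codebase) and only illustrates normalizing the value before pagination sees it, so that `?page=9/` resolves to page 9 instead of a 500.

```ruby
# Hypothetical helper, not taken from the FromThePage codebase.
# Returns a positive page number, or `default` when the raw query
# value has no usable leading digits.
def sanitize_page_param(raw, default: 1)
  # Keep only the leading run of digits, so "9/" yields "9".
  digits = raw.to_s[/\A\s*(\d+)/, 1]
  return default if digits.nil?

  page = digits.to_i
  page >= 1 ? page : default
end

sanitize_page_param("9/")   # => 9
sanitize_page_param("abc")  # => 1
sanitize_page_param(nil)    # => 1
sanitize_page_param("0")    # => 1
```

This version is deliberately lenient and recovers the page number from a malformed link. A stricter alternative would be to treat any non-integer value as invalid and respond with a 400 or 404 rather than guessing.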