ViciousPotato / safaribooks

Convert safaribooksonline ebook to epub and Kindle mobi format
350 stars, 78 forks

Cannot crawl from safaribooksonline (HTTP 404): double // in http://xxx//api/v1 #54

Closed by andrewvn2010 5 years ago

andrewvn2010 commented 5 years ago

2018-11-11 11:49:57 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/xxxxxxxxxxxxx/chapter/xxxxxxxxxxxxx_Ch09.xhtml> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=xxxxxxxxxxxxx)
2018-11-11 11:49:57 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/xxxxxxxxxxxxx/chapter/xxxxxxxxxxxxx_Ch09.xhtml>: HTTP status code is not handled or not allowed
2018-11-11 11:49:57 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/xxxxxxxxxxxxx/chapter/xxxxxxxxxxxxx_Ch08.xhtml> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=xxxxxxxxxxxxx)
2018-11-11 11:49:57 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/xxxxxxxxxxxxx/chapter/xxxxxxxxxxxxx_Ch08.xhtml>: HTTP status code is not handled or not allowed
2018-11-11 11:49:57 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/xxxxxxxxxxxxx/chapter/xxxxxxxxxxxxx_Ch07.xhtml> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=xxxxxxxxxxxxx)
2018-11-11 11:49:57 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/xxxxxxxxxxxxx/chapter/xxxxxxxxxxxxx_Ch07.xhtml>: HTTP status code is not handled or not allowed
2018-11-11 11:49:58 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/xxxxxxxxxxxxx/chapter/xxxxxxxxxxxxx_Ch06.xhtml> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=xxxxxxxxxxxxx)
2018-11-11 11:49:58 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/xxxxxxxxxxxxx/chapter/xxxxxxxxxxxxx_Ch06.xhtml>: HTTP status code is not handled or not allowed
2018-11-11 11:49:58 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/xxxxxxxxxxxxx/chapter/xxxxxxxxxxxxx_Ch05.xhtml> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=xxxxxxxxxxxxx)
2018-11-11 11:49:58 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/xxxxxxxxxxxxx/chapter/xxxxxxxxxxxxx_Ch05.xhtml>: HTTP status code is not handled or not allowed
2018-11-11 11:49:58 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/xxxxxxxxxxxxx/chapter/xxxxxxxxxxxxx_Ch04.xhtml> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=xxxxxxxxxxxxx)
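For anyone hitting the same 404s: a doubled slash like the one in the log usually appears when a base URL ending in `/` is concatenated with a path starting in `/`. The sketch below only illustrates that general pattern and two ways to avoid it. The constants `BASE_URL` and `CHAPTER_PATH` and the helpers `naive_join` and `safe_join` are hypothetical names for illustration; they are not taken from the safaribooks spider, whose actual URL-building code is not shown in this issue.

```python
from urllib.parse import urljoin

# Hypothetical values mirroring the shape of the URLs in the log above.
BASE_URL = "https://www.safaribooksonline.com/"
CHAPTER_PATH = "/api/v1/book/xxxxxxxxxxxxx/chapter/xxxxxxxxxxxxx_Ch09.xhtml"


def naive_join(base: str, path: str) -> str:
    """Plain concatenation: keeps both slashes, producing '//api/v1/...'."""
    return base + path


def safe_join(base: str, path: str) -> str:
    """Strip the slash on each side of the seam, then add exactly one back."""
    return base.rstrip("/") + "/" + path.lstrip("/")


if __name__ == "__main__":
    print(naive_join(BASE_URL, CHAPTER_PATH))  # contains '//api/v1'
    print(safe_join(BASE_URL, CHAPTER_PATH))   # single slash before 'api'
    # urljoin treats a path starting with '/' as absolute on the base host,
    # so it also yields a single slash here.
    print(urljoin(BASE_URL, CHAPTER_PATH))
```

Inside a Scrapy callback, `response.urljoin(path)` resolves a path against the response URL in the same way as `urljoin`, so it may be a convenient alternative to manual string concatenation.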