Closed: itrajanovska closed this 1 year ago
Actually, can we check the next-page extraction, please? I just encountered an error:
2023-01-12 19:02:56 [scrapy_splash.middleware] WARNING: Bad request to Splash: {'error': 400, 'type': 'ScriptError', 'description': 'Error happened while executing Lua script', 'info': {'source': '[string "..."]', 'line_number': 20, 'error': 'http404', 'type': 'LUA_ERROR', 'message': 'Lua error: [string "..."]:20: http404'}}
2023-01-12 19:02:56 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.otto.de/technik/smartphone&l=gq&o=116 via http://splash:8050/execute> (referer: None)
2023-01-12 19:02:56 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.otto.de/technik/smartphone&l=gq&o=116>: HTTP status code is not handled or not allowed
2023-01-12 19:02:56 [scrapy.core.engine] INFO: Closing spider (finished)
Thanks for noticing this. It was due to the removed filter. I have handled that now and tested it locally, so it works for both of the URLs below, each handled in a different way:
https://www.otto.de/heimtextilien/bettwaesche/?nachhaltigkeit=alle-nachhaltigen-artikel&l=gq&o=117
https://www.otto.de/technik/smartphone/?l=gq&o=116
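For reference, the failing request in the log above appended `&l=gq&o=116` to a URL that had no `?`, which is what produced the 404. Here is a minimal sketch of one way to append the paging parameters so that both URL shapes come out well-formed. It is not the code from this PR: `with_query_params` is a hypothetical helper, and the parameter names `l` and `o` are simply copied from the URLs above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_query_params(url, params):
    """Return `url` with `params` merged into its query string.

    Hypothetical helper for illustration only. It works whether or not
    the URL already carries a query string (e.g. a sustainability filter).
    """
    parts = urlsplit(url)
    # Keep any existing parameters, then add or overwrite the new ones.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update(params)
    return urlunsplit(parts._replace(query=urlencode(query)))


# Category URL without a filter: a "?" is introduced before the parameters.
plain = with_query_params(
    "https://www.otto.de/technik/smartphone/", {"l": "gq", "o": 116}
)

# Category URL with an existing filter: parameters are joined with "&".
filtered = with_query_params(
    "https://www.otto.de/heimtextilien/bettwaesche/"
    "?nachhaltigkeit=alle-nachhaltigen-artikel",
    {"l": "gq", "o": 117},
)
```

Parsing and re-encoding the query, rather than concatenating strings, avoids having to special-case whether a filter is present.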
Add LAPTOP, TABLET, TV and HEADPHONES for otto; Add new 'UNAVAILABLE' label; Add test for unsustainable products.