Closed — playwolf719 closed this issue 7 years ago
@playwolf719 the /execute endpoint doesn't have special handling for the 'headers' parameter (see http://splash.readthedocs.io/en/stable/api.html#execute). That's probably why the proxy doesn't work: the auth headers never reach the request. You need to handle them in your script, e.g. using splash:set_custom_headers.
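A minimal sketch of that workaround on the Scrapy side (the URL, the header values, and the helper name `build_execute_args` are illustrative, not from the thread): pass the headers in as a script argument and apply them in Lua with `splash:set_custom_headers` before `splash:go`.

```python
# Sketch, assuming the /execute endpoint: since 'headers' is not applied
# automatically, forward it as a script argument and set it inside Lua.
LUA_SOURCE = """
function main(splash)
    -- apply headers manually; /execute does not do this for us
    splash:set_custom_headers(splash.args.headers)
    assert(splash:go(splash.args.url))
    splash:wait(1)
    return {html=splash:html()}
end
"""

def build_execute_args(url, headers):
    """Build the argument dict sent to Splash's /execute endpoint."""
    return {
        "lua_source": LUA_SOURCE,
        "url": url,
        "headers": headers,  # available in Lua as splash.args.headers
    }

args = build_execute_args(
    "http://example.com",
    {"Proxy-Authorization": "Basic ..."},  # placeholder credentials
)
```

With scrapy-splash, such a dict would typically be passed as the `args` of a `SplashRequest` with `endpoint='execute'`.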
By the way, instead of time.sleep(1) it may be better to use the DOWNLOAD_DELAY Scrapy setting: time.sleep blocks the whole process, while DOWNLOAD_DELAY lets Scrapy space out requests.
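A minimal settings.py fragment illustrating the suggestion (the value 1 just mirrors the time.sleep(1) above):

```python
# settings.py fragment (sketch): let Scrapy space out requests instead of
# blocking the process with time.sleep(1).
DOWNLOAD_DELAY = 1  # seconds to wait between requests to the same site
# By default RANDOMIZE_DOWNLOAD_DELAY is True, so the actual delay is
# jittered between 0.5x and 1.5x of DOWNLOAD_DELAY.
```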
@kmike Thx!!! It works. I have to use time.sleep because my proxy auth is time-related, but thanks anyway. I have another question, though: the Splash Docker container is not very reliable, and sometimes it crashes. Do you have any suggestions?
@playwolf719 glad to see it helped!
I think for production it makes sense to run multiple Splash containers behind a load balancer, so that if one container crashes it can be restarted without affecting clients. You can implement it yourself, use https://github.com/TeamHG-Memex/aquarium, or use a hosted Splash instance which takes care of that (like Scrapinghub's). See also: http://splash.readthedocs.io/en/stable/faq.html#how-to-run-splash-in-production
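A rough sketch of the do-it-yourself variant (ports and container count are arbitrary; Aquarium automates this setup):

```shell
# Sketch: run several Splash containers with automatic restarts, so a
# single crashing instance does not stop the crawl.
docker run -d --restart=always -p 8050:8050 scrapinghub/splash
docker run -d --restart=always -p 8051:8050 scrapinghub/splash
# Then put a load balancer (e.g. HAProxy or nginx) in front of
# 127.0.0.1:8050 and 127.0.0.1:8051 and point SPLASH_URL at it.
```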
@kmike Thx again!
@kmike How do I get the content that the Lua script returns?
function main(splash)
  -- restore cookies passed in from the spider
  splash:init_cookies(splash.args.my_cookie)
  assert(splash:go{
    splash.args.url,
    http_method=splash.args.http_method,
    headers=splash.args.headers,
  })
  splash:wait(1)
  -- other values the script could return instead:
  -- return splash:evaljs("document.title")
  -- return splash:evaljs([[
  --   document.querySelector('#sf-item-list-data').innerText;
  -- ]])
  -- return {html=splash:html()}
  local title = splash:evaljs("document.title")
  return {title=title}
  -- return {test=splash.args.my_cookie}
end
@playwolf719 see https://github.com/scrapy-plugins/scrapy-splash#responses: response.data['title']
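To make the mapping concrete: the Lua table `return {title=title}` comes back as a JSON body, and scrapy-splash exposes it as `response.data`. A minimal stand-in sketch of that decoding (the class `FakeSplashResponse` is illustrative, not the real SplashJsonResponse):

```python
import json

class FakeSplashResponse:
    """Toy stand-in for scrapy-splash's JSON response (sketch only)."""

    def __init__(self, body: bytes):
        self._body = body

    @property
    def data(self):
        # scrapy-splash decodes the JSON body of an /execute response;
        # a plain json.loads is enough for this sketch
        return json.loads(self._body)

# What Splash would send back for the script above:
resp = FakeSplashResponse(b'{"title": "Example Domain"}')
title = resp.data["title"]
```

In a real spider callback you would simply read `response.data['title']`.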
In the first picture, I make the request with a dynamic proxy and the endpoint is /execute.
In the second picture, I make the request with a dynamic proxy and the endpoint is render.html.
And I'm pretty sure my proxy is OK. So why doesn't the Lua script work when using a dynamic proxy? Hope you can help me. @kmike
This is my code