scrapinghub / splash

Lightweight, scriptable browser as a service with an HTTP API
BSD 3-Clause "New" or "Revised" License
4.04k stars, 508 forks

A few URLs scraped through Splash instances return a "Your Browser Is No Longer Supported" error. #1122

Open dhamo-vairavel opened 3 years ago

dhamo-vairavel commented 3 years ago

Hi All,

A few of the URLs scraped through Splash instances (using SplashRequest) return an image with the message "Your Browser Is No Longer Supported" (see the attached screenshot for more information).

Python version: 3.7

Python packages used:

List of sample URLs:

https://www.ashmoreavenida.com/impact/
https://www.bfycap.com/
https://www.lightship.capital/portfolio

Lua script used to reproduce the error:

    function main(splash, args)
      assert(splash:go(args.url))
      assert(splash:wait(0.5))
      return {
        html = splash:html(),
        png = splash:png(),
        har = splash:har(),
      }
    end
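For anyone trying to reproduce this without Scrapy, here is a minimal Python sketch of how the script above could be posted to Splash's `execute` endpoint. It assumes a Splash instance at `http://localhost:8050`; the helper name `build_execute_request` and the default base URL are illustrative, not something from the original report. The sketch only builds the request, so nothing is sent until you pass it to `urllib.request.urlopen`.

```python
import json
from urllib.request import Request

# The Lua script from the report, unchanged.
LUA_SOURCE = """
function main(splash, args)
  assert(splash:go(args.url))
  assert(splash:wait(0.5))
  return {
    html = splash:html(),
    png = splash:png(),
    har = splash:har(),
  }
end
"""


def build_execute_request(target_url, splash_base="http://localhost:8050"):
    """Build (but do not send) a POST request for Splash's execute endpoint.

    splash_base is an assumption; point it at your own Splash instance.
    """
    # Extra keys in the JSON body (here "url") are exposed to the script as args.*
    body = json.dumps({"lua_source": LUA_SOURCE, "url": target_url}).encode("utf-8")
    return Request(
        f"{splash_base}/execute",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_execute_request("https://www.bfycap.com/")
    print(req.get_method(), req.full_url)
```

To actually run it, something like `urllib.request.urlopen(req, timeout=90)` should return the JSON result, with `png` base64-encoded.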

Screenshot: www ashmoreavenida com

dhamo-vairavel commented 3 years ago

Please let me know if you need more information on the issue or on how to reproduce the error.

davidkong0987 commented 2 years ago

Did you find a solution?

davidkong0987 commented 2 years ago

Using engine=chromium solves this problem for all three of your examples.

Note that this only works for the render.html endpoint [https://github.com/scrapinghub/splash/issues/964].

For example: http://0.0.0.0:8050/render.html?engine=chromium&wait=0.5&images=1&expand=1&timeout=90.0&url=https%3A%2F%2Fwww.bfycap.com%2F&lua_source=function+main%28splash%2C+args%29%0D%0A++assert%28splash%3Ago%28args.url%29%29%0D%0A++assert%28splash%3Await%280.5%29%29%0D%0A++return+%7B%0D%0A++++html+%3D+splash%3Ahtml%28%29%2C%0D%0A++++png+%3D+splash%3Apng%28%29%2C%0D%0A++++har+%3D+splash%3Ahar%28%29%2C%0D%0A++%7D%0D%0Aend
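A rough Python sketch of how a URL like the one above could be assembled rather than hand-encoded. It assumes the same host and port as the example; `SPLASH_BASE` and `build_render_html_url` are illustrative names, not part of Splash. The `lua_source` parameter from the example URL is left out to keep the sketch short.

```python
from urllib.parse import urlencode

# Assumption: same host and port as the example URL above.
SPLASH_BASE = "http://0.0.0.0:8050"


def build_render_html_url(target_url, base=SPLASH_BASE, wait=0.5, timeout=90.0):
    """Return a render.html URL that asks Splash for the Chromium engine."""
    params = {
        "engine": "chromium",  # the setting suggested in this comment
        "wait": wait,
        "images": 1,
        "expand": 1,
        "timeout": timeout,
        "url": target_url,     # urlencode percent-encodes this value
    }
    return f"{base}/render.html?{urlencode(params)}"


if __name__ == "__main__":
    for page in (
        "https://www.ashmoreavenida.com/impact/",
        "https://www.bfycap.com/",
        "https://www.lightship.capital/portfolio",
    ):
        print(build_render_html_url(page))
```

The printed URLs can be opened in a browser or fetched with any HTTP client; whether `engine=chromium` is accepted depends on the Splash build being used.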

ezDataCode commented 2 years ago

I am also encountering a website that returns the image "Your Browser Is No Longer Supported". Setting engine=chromium is an interesting approach, but it is limited to the render.html endpoint. Does anyone have a solution for the other endpoints (e.g. render.json, execute)?

vionwinnie commented 1 year ago

following this

liuhui244671426 commented 1 year ago

I am also encountering this with the website https://www.mech.hku.hk/people