Hi there!
Kudos to you guys for making some amazing software!
Up until recently, I've been able to successfully parse this website, 'https://www.eversource.com/security/account/login' (amongst many others).
Unfortunately, I believe the website maintainers recently changed something on the back end, and the site no longer renders correctly in Splash.
The expected result is a typical login screen that asks for User and Password. Instead, it essentially only shows the navbar and the footer.
I've reviewed and tried all of the suggestions made in https://splash.readthedocs.io/en/stable/faq.html#website-is-not-rendered-correctly. Most notably:
- Waiting to ensure the site renders completely using 'splash:wait'
- Specifying different user agents using 'splash:set_user_agent'
- Disabling private mode (using '--disable-private-mode' or 'splash.private_mode_enabled = false')
I normally run Splash (on Ubuntu Linux 20) with the command:
`sudo docker run -p 8050:8050 --memory=1G --restart=always scrapinghub/splash --disable-private-mode --max-timeout 3600 --maxrss 1024 -v3`
Currently, I'm running the following versions:
[-] Splash version: 3.5
[-] Qt 5.14.1, PyQt 5.14.2, WebKit 602.1, Chromium 77.0.3865.129, sip 4.19.22, Twisted 19.7.0, Lua 5.2
[-] Python 3.6.9 (default, Jul 17 2020, 12:50:27) [GCC 8.4.0]
The easiest way to reproduce this issue is to run the following script in the Splash UI (http://localhost:8050):
```lua
function main(splash, args)
  splash.resource_timeout = 0
  splash.private_mode_enabled = false
  splash:set_user_agent('Mozilla/5.0 (Windows NT 6.1; rv:51.0) Gecko/20100101 Firefox/51.0')

  local login_url = 'https://www.eversource.com/security/account/login'
  assert(splash:go(login_url))
  assert(splash:wait(10))

  return {
    html = splash:html(),
    png = splash:png(),
    har = splash:har(),
  }
end
```
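For anyone who would rather reproduce this outside the UI, here is a minimal sketch that posts a Lua script to Splash's `/execute` HTTP endpoint. It assumes Splash is listening on http://localhost:8050, as in the docker command above. The function names and the 90-second default timeout are my own illustrative choices, not anything Splash prescribes.

```python
import json
import urllib.request

# Assumption: Splash is reachable here (matches the docker port mapping above).
SPLASH_URL = "http://localhost:8050"


def build_execute_request(lua_source, splash_url=SPLASH_URL, timeout=90):
    """Build (but do not send) a JSON POST request for Splash's /execute endpoint."""
    payload = {"lua_source": lua_source, "timeout": timeout}
    return urllib.request.Request(
        url=splash_url.rstrip("/") + "/execute",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_script(lua_source, splash_url=SPLASH_URL, timeout=90):
    """Send the script to Splash and return the decoded JSON result."""
    request = build_execute_request(lua_source, splash_url, timeout)
    # Give the HTTP client slightly longer than Splash's own script timeout.
    with urllib.request.urlopen(request, timeout=timeout + 10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_script(...)` with the Lua script above as a string should return a dict with `html`, `png` (base64-encoded) and `har` keys, so the broken rendering can be inspected or diffed without the UI.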
The only clues I have seen are a few errors in the verbose output of splash when run with '-v3'. Specifically, I see the following:
[render] JsConsole(https://www.eversource.com/content/UserControls/PrimaryNavNew/PrimaryNavNew.ascx.js:69): TypeError: item of items is not a function. (In 'item of items', 'item of items' is undefined)
[render] JsConsole(https://www.eversource.com/content/WebsiteTemplates/NU/js/AppD/jsagent/adrum/adrum.js:27): TypeError: |this| is not a object
[render] JsConsole(https://cdn.eversource.com/prod/ms-login/2022.2.2.13/static/js/main.bundle.js:2): TypeError: |this| is not a object
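In case it helps with triage, here is a small sketch for pulling these console errors out of the '-v3' output and counting them per script. It assumes the log lines follow the `[render] JsConsole(<url>:<line>): <message>` shape shown above. The helper names are hypothetical and not part of Splash.

```python
import re
from collections import Counter

# Assumed line shape: [render] JsConsole(<script URL>:<line>): <message>
JS_CONSOLE_RE = re.compile(
    r"\[render\] JsConsole\((?P<url>.+?):(?P<line>\d+)\): (?P<message>.*)"
)


def parse_js_console(log_text):
    """Return a list of (url, line, message) tuples found in Splash log output."""
    entries = []
    for raw in log_text.splitlines():
        match = JS_CONSOLE_RE.search(raw)
        if match:
            entries.append(
                (match["url"], int(match["line"]), match["message"].strip())
            )
    return entries


def count_by_script(entries):
    """Count how many console errors each script URL produced."""
    return Counter(url for url, _line, _message in entries)
```

Feeding it the saved docker log should show at a glance which scripts fail under Splash's WebKit and how often.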
Note that I'm able to access this page (and see the login page) using a normal browser (I've used both Safari and Firefox).
My main question: is there something I can do to get this page to render again, or is the WebKit version bundled with Splash simply incompatible with it?
I currently have a webapp where I'm using Scrapy combined with Splash to parse a number of utility sites. If Splash is no longer capable of rendering websites that use modern JavaScript, then I may need to move to some other solution. This is a bummer to me, because so far I've been happy with the performance and capabilities.
Thanks in advance for any assistance you could provide.
P.S. If there's any other supporting information that I could give, please let me know!