scrapinghub / splash

Lightweight, scriptable browser as a service with an HTTP API
BSD 3-Clause "New" or "Revised" License

Redirect not followed when the target URL is set via window.location.href #847

Open Mideen opened 5 years ago

Mideen commented 5 years ago

Hi team, when I fetch the page source of 'http://www.suedfargesa.com', I don't get the expected result.

The page's script tag contains "window.location.href='http://suedfargesa.com/web/'", so the browser should be redirected to 'http://suedfargesa.com/web/'. But it isn't.

I am sharing a screenshot of the response. The second redirected URL request has been blocked.

(Attached image: screenshot from 2018-12-18 21_12_47)

Please help me to resolve this issue.

Mideen commented 5 years ago

Hi team, even if the redirect URL is given in a meta tag such as '<meta http-equiv="refresh" content="0;URL=/cgi-sys/defaultwebpage.cgi">', the page is not redirected.

A few example URLs: http://www.aussiemumnetwork.com, http://www.nextlevelprinters.com

A meta tag can be used to redirect a page; see https://mediatemple.net/community/products/dv/204645160/how-do-i-redirect-my-site-using-a-meta-tag

Please have a look at this. Thanks in advance.
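To make the meta-refresh case concrete, here is a minimal sketch of how the redirect target could be read out of such a tag using only the Python standard library. The names `MetaRefreshParser` and `meta_refresh_target` are hypothetical helpers, and `http://example.com/` is a placeholder base URL; this is not how Splash handles the tag internally.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaRefreshParser(HTMLParser):
    """Records the delay and target of the first meta refresh tag that has a URL."""

    def __init__(self):
        super().__init__()
        self.delay = None
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attr_map = dict(attrs)
        if (attr_map.get("http-equiv") or "").lower() != "refresh":
            return
        # content looks like "0;URL=/some/path": delay in seconds, then the target.
        delay_text, _, rest = (attr_map.get("content") or "").partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            rest = rest[4:].strip().strip("'\"")
        try:
            delay = float(delay_text.strip() or 0)
        except ValueError:
            return
        if rest:
            self.delay, self.target = delay, rest


def meta_refresh_target(html, base_url):
    """Return (delay_seconds, absolute_url), or None if there is no meta refresh URL."""
    parser = MetaRefreshParser()
    parser.feed(html)
    if parser.target is None:
        return None
    return parser.delay, urljoin(base_url, parser.target)


if __name__ == "__main__":
    page = '<meta http-equiv="refresh" content="0;URL=/cgi-sys/defaultwebpage.cgi">'
    print(meta_refresh_target(page, "http://example.com/"))
```

A check like this could help tell apart pages that redirect via a meta tag from pages that redirect via JavaScript, since only the latter need script execution.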

Mideen commented 5 years ago

@scrapinghub-ci @TEAM can anyone help me to solve this?

Mideen commented 5 years ago

Hi @lopuhin, have you seen any redirection issue like this before?

lopuhin commented 5 years ago

@Mideen no I haven't seen such issues, and I'm pretty sure that Splash does support such redirects in general.

Let me clarify one thing. In your first message, you say:

> The second redirected URL request has been blocked.

So it appears redirect is happening but the browser can't load this page? What happens if you try to load that page directly?

Mideen commented 5 years ago

Hi @lopuhin

Thanks for your response ..!

> So it appears redirect is happening but the browser can't load this page? What happens if you try to load that page directly?

Yes, the browser can't load the second page. The second request was abandoned by the Splash browser, and I don't know why.

If I load the redirected page directly, it renders perfectly.

Is there any other configuration or workaround to fix this issue? I increased the wait time and checked again, but that did not fix it.

lopuhin commented 5 years ago

@Mideen I checked http://suedfargesa.com and indeed with 0.5 wait time I got a similar error. Increasing wait time to 3.5 seconds helped to render the page correctly. Please see this FAQ item: https://splash.readthedocs.io/en/stable/faq.html#website-is-not-rendered-correctly
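For reference, a minimal sketch of what a render request with the longer wait might look like. It assumes Splash is reachable at `http://localhost:8050` (the usual default port, but an assumption about this setup) and uses the `render.html` endpoint with its `url` and `wait` arguments; `splash_render_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def splash_render_url(target_url, wait=0.5, splash_base="http://localhost:8050"):
    """Build a render.html request URL with an explicit wait (in seconds)."""
    query = urlencode({"url": target_url, "wait": wait})
    return f"{splash_base}/render.html?{query}"


if __name__ == "__main__":
    # With wait=0.5 the JavaScript redirect may not have finished loading yet;
    # 3.5 seconds was enough in the check described above.
    print(splash_render_url("http://suedfargesa.com", wait=3.5))
```

Fetching the printed URL with any HTTP client should return the rendered HTML once the wait has elapsed.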

Mideen commented 5 years ago

Hi @lopuhin

Yes, if I set the wait time to more than 3.5 seconds, it is rendered perfectly.

But in shared code, if I set the wait to 3.5 seconds, it will affect all my other URLs: every other URL will also wait 3.5 seconds after rendering. Am I correct?

Is there any other solution/workaround to solve this issue?
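One possible approach, not confirmed by anyone in this thread: `wait` is an argument of each individual render request, so shared code can keep a short default and use a longer value only for hosts known to redirect slowly. The sketch below assumes a hand-maintained table; `SLOW_HOST_WAITS`, `DEFAULT_WAIT`, `wait_for`, and `splash_params` are hypothetical names.

```python
from urllib.parse import urlsplit

# Assumed values: a short default, and longer waits for known slow redirectors.
DEFAULT_WAIT = 0.5
SLOW_HOST_WAITS = {
    "suedfargesa.com": 3.5,
}


def wait_for(url):
    """Pick the wait (seconds) for a URL based on its host, ignoring a leading www."""
    host = (urlsplit(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return SLOW_HOST_WAITS.get(host, DEFAULT_WAIT)


def splash_params(url):
    """Query parameters for one Splash render request to the given URL."""
    return {"url": url, "wait": wait_for(url)}


if __name__ == "__main__":
    for u in ("http://www.suedfargesa.com", "http://www.nextlevelprinters.com"):
        print(splash_params(u))
```

With this arrangement only the listed hosts pay the 3.5 second cost; every other URL keeps the default wait.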