mohankreddy / crawler4j

Automatically exported from code.google.com/p/crawler4j

Crawler does not find window.location URLs #86

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Hello,

I have an issue with Crawler4j.
Many of my HTML pages are linked to each other only through an onClick event:

<input type="button" class="Button" 
onClick="window.location='../pagefr/menu_0402.html'" value="Go" />

Crawler4j doesn't find these links and that's a real issue for me because I 
cannot change the type of link...

Thank you

Original issue reported on code.google.com by sandro...@gmail.com on 12 Oct 2011 at 9:37
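
One possible workaround for the onClick pattern in the report above is to scan the raw HTML for `window.location` assignments and resolve them against the page URL. The following is a minimal sketch, not part of crawler4j: the class name `OnClickLinkExtractor` and its `extract` method are hypothetical, and the regular expression only covers the quoting style shown in the report (a double-quoted attribute with a single-quoted target). It does not execute JavaScript.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a crawler4j class.
final class OnClickLinkExtractor {
    // Matches onclick="...window.location='target'..." (also window.location.href).
    // Assumption: the attribute is double-quoted and the target is single-quoted.
    private static final Pattern ONCLICK_LOCATION = Pattern.compile(
        "onclick\\s*=\\s*\"[^\"]*?window\\.location(?:\\.href)?\\s*=\\s*'([^']+)'",
        Pattern.CASE_INSENSITIVE);

    // Returns absolute URLs for every onclick redirect found in the HTML.
    static List<String> extract(String html, String pageUrl) {
        URI base = URI.create(pageUrl);
        List<String> links = new ArrayList<>();
        Matcher m = ONCLICK_LOCATION.matcher(html);
        while (m.find()) {
            // Resolve relative targets such as ../pagefr/menu_0402.html
            links.add(base.resolve(m.group(1)).toString());
        }
        return links;
    }
}
```

The returned URLs could then be queued for crawling by whatever mechanism the crawler setup provides. Note that `URI.create` and `URI.resolve` throw `IllegalArgumentException` for targets that are not valid URIs, so real pages may need extra error handling.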

GoogleCodeExporter commented 9 years ago
I ran into a related issue: the crawler is not able to follow JavaScript 
redirections like:

<html>
<head></head>
<body>
<script language="JavaScript">
    window.location = "http://example.com";
</script>
</body>
</html>

It would be nice if the crawler could follow such redirections.

Original comment by lubienet...@gmail.com on 29 May 2013 at 9:15
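
A script-based redirect like the one in the comment above could be detected without a JavaScript engine by looking for a literal `window.location` assignment inside `<script>` blocks. This is only a sketch under that assumption: `ScriptRedirectFinder` and `firstRedirect` are hypothetical names, not crawler4j API, and the approach misses redirects whose target is computed at runtime.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a crawler4j class.
final class ScriptRedirectFinder {
    // Captures the body of each <script> element.
    private static final Pattern SCRIPT_BLOCK = Pattern.compile(
        "<script\\b[^>]*>(.*?)</script>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    // Captures a string literal assigned to window.location or window.location.href.
    private static final Pattern LOCATION_ASSIGNMENT = Pattern.compile(
        "window\\.location(?:\\.href)?\\s*=\\s*([\"'])([^\"']+)\\1");

    // Returns the first literal redirect target found in any script block.
    static Optional<String> firstRedirect(String html) {
        Matcher script = SCRIPT_BLOCK.matcher(html);
        while (script.find()) {
            Matcher assignment = LOCATION_ASSIGNMENT.matcher(script.group(1));
            if (assignment.find()) {
                return Optional.of(assignment.group(2));
            }
        }
        return Optional.empty();
    }
}
```

Returning an `Optional` keeps the "no redirect found" case explicit, so the caller can decide whether to treat the page as a normal document or to follow the target.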

GoogleCodeExporter commented 9 years ago
Please supply an example URL so I can check this.

Original comment by avrah...@gmail.com on 11 Aug 2014 at 12:54

GoogleCodeExporter commented 9 years ago

Original comment by avrah...@gmail.com on 18 Aug 2014 at 3:09

GoogleCodeExporter commented 9 years ago

Original comment by avrah...@gmail.com on 18 Aug 2014 at 3:10

GoogleCodeExporter commented 9 years ago
I think this should be configurable, because you might not want the crawler to 
follow anything other than href attributes.

Original comment by jasonbro...@gmail.com on 27 Nov 2014 at 2:03