I am also facing a problem where crawler4j does not recognize all links
on a page.
For example, if a page contains 5 links, only 3 of them are recognized
and fetched. The other two are not recognized at all.
What steps will reproduce the problem?
1.
2.
3.
What is the expected output? What do you see instead?
All the links on a page should be recognized so that they can be fetched.
What version of the product are you using?
4.1
Please provide any additional information below.
The only difference I found is that the links which are not recognized
contain angle brackets in the href value.
Example:
<a title="some text"
href="http://www.example.com/abc/xyz-<near>-abc-xyz/abc_xyz" >some text</a>
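One possible explanation, offered only as a guess: a strict URL parser rejects raw `<` and `>` characters, so a link like the one above could be dropped before it is ever scheduled. The sketch below uses only the JDK's `java.net.URI` to show that behavior and a percent-encoding workaround. It does not call crawler4j, and whether crawler4j validates links this way is an assumption. The class name `AngleBracketHref` and both helper methods are hypothetical names made up for this illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class AngleBracketHref {

    // Hypothetical helper: percent-encode raw angle brackets in an href.
    static String encodeAngleBrackets(String href) {
        return href.replace("<", "%3C").replace(">", "%3E");
    }

    // Hypothetical helper: true if java.net.URI accepts the string as-is.
    // Assumption: this only mimics what a strict parser might do; it is
    // not taken from crawler4j's source.
    static boolean isStrictlyValid(String href) {
        try {
            new URI(href);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String raw = "http://www.example.com/abc/xyz-<near>-abc-xyz/abc_xyz";
        String encoded = encodeAngleBrackets(raw);

        System.out.println(raw + " valid: " + isStrictlyValid(raw));
        System.out.println(encoded + " valid: " + isStrictlyValid(encoded));
    }
}
```

If this is indeed the cause, encoding the brackets as `%3C` and `%3E` before the URL is parsed should let such links through.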
Original issue reported on code.google.com by amarvyaw...@gmail.com on 9 May 2015 at 12:47