dkm05midhra / crawler4j

Automatically exported from code.google.com/p/crawler4j

Add functionality to retrieve links from binary and text only files #311

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Currently, while crawling, the crawler parses all links from every HTML page it fetches, 
then uses those links as seeds.

But when it encounters binary or plain-text (text/plain) files, the links inside them 
aren't parsed, so they are never picked up as seeds.

Upgrade the crawler to parse links from binary and text files.
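
To illustrate the kind of extraction being requested, here is a minimal sketch. It is not the code from the fix and not part of the crawler4j API: the class `PlainTextLinkExtractor` and its `extractLinks` methods are hypothetical names, and the regular expression is a deliberately simple assumption that only finds absolute `http`/`https` URLs made of printable ASCII characters.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not a crawler4j class): finds http(s) URLs in non-HTML content. */
final class PlainTextLinkExtractor {

    // Assumption: a URL is "http://" or "https://" followed by printable ASCII
    // characters other than quotes and angle brackets. Whitespace, control
    // bytes, and non-ASCII bytes therefore end the match.
    private static final Pattern URL_PATTERN =
            Pattern.compile("https?://[\\x21-\\x7E&&[^\"'<>]]+", Pattern.CASE_INSENSITIVE);

    /** Returns the distinct URLs found in the text, in order of first appearance. */
    static Set<String> extractLinks(String text) {
        Set<String> links = new LinkedHashSet<>();
        Matcher matcher = URL_PATTERN.matcher(text);
        while (matcher.find()) {
            // Drop punctuation that usually belongs to the sentence, not the URL.
            String url = matcher.group().replaceAll("[.,;:!?)\\]]+$", "");
            // Skip matches that are nothing but the scheme, e.g. "http://".
            if (url.indexOf("://") + 3 < url.length()) {
                links.add(url);
            }
        }
        return links;
    }

    /** Same as above for raw bytes, e.g. the body of a binary file. */
    static Set<String> extractLinks(byte[] content) {
        // ISO-8859-1 maps every byte to exactly one char, so ASCII URLs
        // embedded in binary data survive the decoding unchanged.
        return extractLinks(new String(content, StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) {
        String sample = "Docs: http://example.com/a, mirror: https://example.org/b?x=1.";
        extractLinks(sample).forEach(System.out::println);
    }
}
```

In a crawler, the returned URLs would then go through the same filtering and scheduling as links parsed from HTML pages. A pattern this simple will miss relative links and may pick up false positives in binary data, so a real implementation would likely need stricter validation.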

Original issue reported on code.google.com by avrah...@gmail.com on 23 Sep 2014 at 11:08

GoogleCodeExporter commented 9 years ago
Fixed in rev: 1ac149397bef  

Original comment by avrah...@gmail.com on 23 Sep 2014 at 11:14