Add functionality to retrieve links from binary and text only files

2848800476 / crawler4j

Automatically exported from code.google.com/p/crawler4j

Add functionality to retrieve links from binary and text only files #311

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago

Currently, while crawling, the crawler parses all links from every HTML page, then uses those links as seeds.

But when it encounters binary or plain-text (text/plain) files, the links they contain are neither parsed nor retrieved as seeds.

Upgrade the crawler to parse links from binary and text files.
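
To illustrate the kind of extraction requested here, the following is a minimal sketch in plain Java. It is not the code from the fix revision and does not use any crawler4j API: the class `PlainTextLinkExtractor` and its `extractLinks` methods are hypothetical names chosen for this example. It assumes that only absolute http(s) URLs are of interest and that a simple regular expression is an acceptable approximation of "a link" in non-HTML content.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of crawler4j.
class PlainTextLinkExtractor {

    // Absolute http(s) URLs made of printable ASCII characters,
    // stopping at whitespace, control bytes, quotes, or angle brackets.
    private static final Pattern URL_PATTERN = Pattern.compile(
            "https?://[\\x21-\\x7E&&[^<>\"']]+", Pattern.CASE_INSENSITIVE);

    // Extracts unique URLs from text, in order of first appearance.
    static List<String> extractLinks(String text) {
        Set<String> links = new LinkedHashSet<>();
        Matcher matcher = URL_PATTERN.matcher(text);
        while (matcher.find()) {
            // Drop trailing punctuation that usually belongs to the sentence.
            String url = matcher.group().replaceAll("[.,;:!?)\\]]+$", "");
            links.add(url);
        }
        return new ArrayList<>(links);
    }

    // Binary content: ISO-8859-1 maps every byte to exactly one char,
    // so ASCII URLs embedded in the data survive the decoding unchanged.
    static List<String> extractLinks(byte[] data) {
        return extractLinks(new String(data, StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) {
        String body = "Mirror: http://example.com/a.txt, docs at https://example.org/docs.";
        System.out.println(extractLinks(body));
    }
}
```

In a crawler, the returned URLs would then be scheduled in the same way as links found in HTML pages. A regular expression will miss relative links and may pick up strings that merely look like URLs, so the results should be treated as candidates rather than guaranteed links.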

Original issue reported on code.google.com by avrah...@gmail.com on 23 Sep 2014 at 11:08

GoogleCodeExporter commented 8 years ago

Fixed in rev: 1ac149397bef

Original comment by avrah...@gmail.com on 23 Sep 2014 at 11:14

Changed state: Fixed