ScottMansfield / widow

Distributed, asynchronous web crawler
GNU Lesser General Public License v2.1

Split links into raw and normalized #1

Closed ScottMansfield closed 9 years ago

ScottMansfield commented 9 years ago

Links are currently indexed only as raw text, which tells half the story. Each link should be indexed in both its raw and normalized forms, so the UI doesn't need to duplicate the normalization logic.
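As a rough illustration of the idea, here is a minimal sketch in Python of storing each link as a raw/normalized pair. It is not taken from widow's code: the function names `normalize_link` and `index_links`, the `raw` and `normalized` field names, and the normalization rules themselves (resolve against the page URL, lowercase the scheme and host, drop default ports and fragments) are all assumptions made for the example.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit

# Default ports that can be dropped without changing the target (assumed rule).
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_link(raw, base_url):
    """Return a normalized absolute form of `raw` (hypothetical rules)."""
    # Resolve relative links against the URL of the page they were found on.
    absolute = urljoin(base_url, raw.strip())
    parts = urlsplit(absolute)

    scheme = parts.scheme.lower()
    # `hostname` is already lowercased and excludes any userinfo or port.
    host = parts.hostname or ""
    netloc = host
    if parts.port is not None and DEFAULT_PORTS.get(scheme) != parts.port:
        netloc = f"{host}:{parts.port}"

    # Give host-based URLs an explicit root path; leave others untouched.
    path = parts.path
    if netloc and not path:
        path = "/"

    # Drop the fragment: it never reaches the server.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def index_links(raw_links, base_url):
    """Build one record per link, keeping both the raw and normalized text."""
    return [
        {"raw": raw, "normalized": normalize_link(raw, base_url)}
        for raw in raw_links
    ]
```

With records shaped like this, a UI could display or filter on the `normalized` field directly and still show the `raw` text exactly as it appeared in the page.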