jesbin / crawler4j

Automatically exported from code.google.com/p/crawler4j

Provide easy access to (absolute) canonical URL #245

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Page A might be at http://example.com/products/123

Page B might be at http://example.com/some-product, with <link rel="canonical" href="http://example.com/products/123">

It would be great to have page.getWebURL().getCanonicalUrl(), returning the absolute canonical URL, so that I can look 
up that URL in a database and avoid saving the same page twice. Thanks!

Original issue reported on code.google.com by neilmcgu...@gmail.com on 22 Nov 2013 at 2:10
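
Until something like this exists in the library, a workaround along the following lines might do. This is only a minimal sketch using the JDK alone: the class `CanonicalUrls` and its `find` method are hypothetical names, not part of crawler4j, and the regex-based tag matching is an assumption that holds for simple markup like the example above, not a full HTML parser. It takes the raw HTML and the URL the page was fetched from, and resolves a relative `href` against that URL so the result is absolute.

```java
import java.net.URI;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of crawler4j.
final class CanonicalUrls {
    // Any <link ...> tag, case-insensitive.
    private static final Pattern LINK_TAG =
            Pattern.compile("<link\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // rel=canonical, with or without quotes around the value.
    private static final Pattern REL_CANONICAL =
            Pattern.compile("(?<![\\w-])rel\\s*=\\s*[\"']?canonical\\b",
                    Pattern.CASE_INSENSITIVE);
    // href value: double-quoted, single-quoted, or unquoted.
    private static final Pattern HREF =
            Pattern.compile("(?<![\\w-])href\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))",
                    Pattern.CASE_INSENSITIVE);

    private CanonicalUrls() {}

    /** Returns the absolute canonical URL declared in html, if any. */
    static Optional<String> find(String html, String pageUrl) {
        Matcher tag = LINK_TAG.matcher(html);
        while (tag.find()) {
            String attrs = tag.group();
            if (!REL_CANONICAL.matcher(attrs).find()) {
                continue; // some other kind of <link>, e.g. a stylesheet
            }
            Matcher href = HREF.matcher(attrs);
            if (!href.find()) {
                continue; // canonical link without an href
            }
            String raw = href.group(1) != null ? href.group(1)
                    : href.group(2) != null ? href.group(2)
                    : href.group(3);
            try {
                // Resolve relative hrefs against the URL the page was fetched from.
                return Optional.of(URI.create(pageUrl).resolve(raw.trim()).toString());
            } catch (IllegalArgumentException e) {
                return Optional.empty(); // malformed page URL or href
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String html = "<html><head>"
                + "<link rel=canonical href=http://example.com/products/123 >"
                + "</head></html>";
        System.out.println(
                find(html, "http://example.com/some-product").orElse("(none)"));
    }
}
```

Inside a crawler's visit callback, the HTML and the fetched URL would come from the page object, and the returned string could serve as the database key in place of the fetched URL, falling back to the fetched URL when no canonical link is present.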

GoogleCodeExporter commented 8 years ago

Original comment by avrah...@gmail.com on 18 Aug 2014 at 3:47