gab-ai-inc / gab-dissenter-extension

Dissenter.com Browser Extension source code
https://dissenter.com
Apache License 2.0
272 stars 43 forks

Use rel=canonical to identify the proper URL #27

Closed. Revisor closed 5 years ago

Revisor commented 5 years ago

Hi, some articles/content may be reachable under multiple URLs. For years, search engines have been encouraging webmasters to deduplicate their content with the rel=canonical tag.

I think it would be useful for Dissenter to look at the rel=canonical tag to identify the proper URL.
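To make the suggestion concrete, here is a minimal sketch of how a page's rel=canonical link could be read and resolved. It is not taken from the extension's source: the function name `canonicalUrlFor` and the `DocLike`/`LinkLike` types are hypothetical, and falling back to the page URL when no usable canonical link exists is an assumption.

```typescript
// Hypothetical minimal types so the sketch works with a browser `document`
// as well as with a stub object outside the browser.
interface LinkLike {
  getAttribute(name: string): string | null;
}
interface DocLike {
  querySelector(selector: string): LinkLike | null;
}

// Returns the canonical URL declared by the page, or `pageUrl` when the
// page declares none or declares something unusable (assumed fallback).
function canonicalUrlFor(doc: DocLike, pageUrl: string): string {
  const link = doc.querySelector('link[rel~="canonical"][href]');
  const href = link?.getAttribute("href")?.trim();
  if (!href) {
    return pageUrl;
  }
  try {
    // Relative canonical hrefs are resolved against the page URL.
    const resolved = new URL(href, pageUrl);
    if (resolved.protocol !== "http:" && resolved.protocol !== "https:") {
      return pageUrl;
    }
    return resolved.href;
  } catch {
    // The href or the page URL could not be parsed.
    return pageUrl;
  }
}
```

In a content script this could be called as `canonicalUrlFor(document, location.href)`, so that several URLs for the same article map to one comment thread whenever the page declares a canonical link.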

mgabdev commented 5 years ago

Hi @Revisor, thank you for the suggestion. We have revised our entire URL parsing system to account for URLs as people actually enter them. The canonical URL tag has turned out not to be a perfect solution for identifying the proper URL: sometimes the canonical URL differs from the given URL, or the webmaster/developer has overlooked it, left it out, or simply copied it from another website, which results in an incorrect URL. We use a variety of implementations for parsing and identifying proper URLs. If you have a more specific issue, please open another issue. Closing this out now. Thank you.
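The failure modes listed above can be illustrated with a small guard that only trusts a canonical URL when it passes a basic plausibility check. This is a sketch under stated assumptions, not Dissenter's actual parsing logic: the names `chooseUrl`, `parseUrl`, and `stripWww` are hypothetical, and comparing hostnames (ignoring a leading `www.`) is just one possible heuristic for spotting a canonical tag copied from another website.

```typescript
// Parses `value` (optionally relative to `base`); returns null on failure.
function parseUrl(value: string, base?: string): URL | null {
  try {
    return new URL(value, base);
  } catch {
    return null;
  }
}

// Assumed heuristic: treat "www.example.com" and "example.com" as the same site.
function stripWww(hostname: string): string {
  return hostname.replace(/^www\./, "");
}

// Prefers the canonical URL only when it exists, parses, and points at the
// same site as the URL the user actually gave; otherwise keeps the given URL.
function chooseUrl(givenUrl: string, canonicalHref: string | null): string {
  const given = parseUrl(givenUrl);
  if (given === null || !canonicalHref) {
    return givenUrl; // nothing to compare against, or no canonical tag at all
  }
  const canonical = parseUrl(canonicalHref, givenUrl);
  if (canonical === null) {
    return givenUrl; // malformed canonical href
  }
  if (stripWww(canonical.hostname) !== stripWww(given.hostname)) {
    return givenUrl; // likely copied from another website
  }
  return canonical.href;
}
```

A check like this catches missing, malformed, and cross-site canonical tags. It cannot catch a same-site canonical that is simply wrong, for example every page pointing at the homepage, which is consistent with the point that the tag alone is not a complete solution.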