Open lukas-vlcek opened 10 years ago
Some documents contain a double // in the middle of their URL. We should consider removing the extra / when displaying the URL of such a document in search results.
For example, we can get a document with a URL like this:
http://www.jboss.org//archetypes/eap/jboss-html5-mobile-archetype-wfk/index.html
Note the double // in "...jboss.org//archety...". It looks strange on the search results page:
...jboss.org//archety...
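A minimal sketch of what display-side cleanup could look like, assuming it is acceptable to collapse repeated slashes in the path only. The helper name `collapse_path_slashes` is hypothetical and not part of the existing code base. The scheme separator, query string and fragment are left untouched.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def collapse_path_slashes(url: str) -> str:
    """Collapse runs of '/' in the path component of a URL.

    Hypothetical helper for display purposes only: the scheme ("http://"),
    host, query string and fragment are returned unchanged.
    """
    parts = urlsplit(url)
    # Replace two or more consecutive slashes in the path with a single one.
    clean_path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit(parts._replace(path=clean_path))


if __name__ == "__main__":
    raw = "http://www.jboss.org//archetypes/eap/jboss-html5-mobile-archetype-wfk/index.html"
    print(collapse_path_slashes(raw))
```

This only changes what is shown to the user. Whether the stored URL should also be normalized is a separate question, since a server is free to treat the two paths as different resources.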
Google shows only a single / (maybe it is just given a different list of URLs to crawl?):
In any case, the bigger issue may be that the document exists under two different URLs (can this be penalized by search engines?):
Should we rather ask the content provider (@pmuir) to have a look at this and fix it directly in the indexer instead?
Also see the relevant StackExchange discussion. It may be a bit dated, but some of its points may still apply.