Closed rtomayko closed 11 years ago
<meta> tags on the manual pages themselves. rel="nofollow" or anything weird like that. I don't get it.
As a rough straw poll, running the URLs through Stack Overflow (which puts nofollow on external links) shows the Git Bisect page gets linked more than the blame man page.
Wildly extrapolate that to blogs and other guides that link to the kernel.org man page for blame instead, and it may just be a situation outside of your control.
It also looks like the lack of better-worded <title> tags, as mentioned in #151, doesn't help things.
The man page's title is
Git
whereas the debugging page that is leapfrogging here has
Git - Debugging with Git
And it was addressed back in #201 as well.
Yeah, #151 (and its duplicate #201) were fixed quite a long time ago. But these, just like many fixes to the AsciiDoc conversion, require the database to be rebuilt in order to take effect. Doing that would indeed resolve about 10 issues reported here; see my meta-issue #241. Sadly, so far nobody with write access to gitscm-next has even acknowledged this problem publicly, let alone fixed it :-(...
Yes, as I believe I've said to Max before, I've been working on this on and off. Basically, to rebuild the database I need to pull down a copy locally, rebuild it and push it back up. I need to figure out an easier way to do this, but for now this is the only way that won't cause regressions in other parts of the site, specifically the book. I actually worked on this for a bit today, but as with other times, I got caught up in other stuff.
There are also some caching issues that are affecting this, including the title issue, I believe. I don't think rebuilding the db will fix that, which is what you're actually worried about here. I haven't had time to dig into the Heroku caching details or how we're doing it wrong.
The titles should be generated properly now. Let's see if that helps things.
@schacon If you told me that before, then I am sorry, but I must have totally missed it in the past months... :-). Anyway, I don't want to sound ungrateful. On the contrary, my intentions are good, and I am very grateful that you have now found some time and that things run a lot smoother now, at least for the time being. Thanks!
And thanks for the explanation. Caching definitely seems to be part of the issue. In my local gitscm instance, I also started to see stale data, and had to delete tmp/cache before restarting rails server to make things work. So it seems to be a general issue with how RoR caches stuff (but then I know very little about RoR, so this might again be quite off).
Sadly, titles for most reference pages are still "git", something that works just fine in my local gitscm instance. Interestingly, though, http://git-scm.com/docs/git-stage/1.8.1 shows a "good" page title, while http://git-scm.com/docs/git-stage does not. Nor does, for example, http://git-scm.com/docs/git-stage/1.7.12
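For anyone who wants to re-check the titles without clicking through each page, here is a minimal sketch using only the Python standard library. The names `TitleParser`, `extract_title` and `fetch_title` are made up for illustration and are not part of the gitscm code; the sketch only reads the first <title> element of whatever HTML it is given.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collects the text of the first <title> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Only the first <title> counts; later ones are ignored.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        return "".join(self._parts).strip()


def extract_title(html):
    """Return the page title found in an HTML string, or "" if there is none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title


def fetch_title(url):
    """Download a page and return its title (needs network access)."""
    with urlopen(url, timeout=10) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return extract_title(resp.read().decode(charset, errors="replace"))


# Example usage against the URLs mentioned above (not run here):
# for url in ("http://git-scm.com/docs/git-stage",
#             "http://git-scm.com/docs/git-stage/1.8.1",
#             "http://git-scm.com/docs/git-stage/1.7.12"):
#     print(url, "->", repr(fetch_title(url)))
```

Note that this only shows what the server returns at the moment of the request, so a cached response would still report the stale title.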
BTW, does anybody know who maintains http://www.kernel.org/pub/software/scm/git/docs/ ? Last update was May 2012. I tried to contact the kernel.org staff to find out, but got no response.
Right now I see the page title "Git" everywhere. So this issue is not resolved. :-(
I don't have a good sense for why this is the case but I'm surprised by the lack of results I get for git-scm.com when searching for manual page names. For example:
The git-scm.com URLs for these match the search and the <h1> also matches. Seems like we should be seeing these given that other git-scm.com domain results are coming back with high rankings.
Is it possible that we're doing something that's causing these pages to get blacklisted in Google? Like rendering different versions to Googlebot somehow, or doing something fancy with JS that'd hide the content from non-browser UAs?
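One rough way to test the "different versions to Googlebot" theory is to request the same URL with a browser-like User-Agent and with Googlebot's published User-Agent string, then compare the responses. The sketch below assumes nothing about how the site is built; the helper names (`fingerprint`, `fetch`, `differs_by_user_agent`) and the browser UA string are illustrative only.

```python
import hashlib
import re
from urllib.request import Request, urlopen

# The browser string is an arbitrary example; the Googlebot string is the
# classic desktop crawler User-Agent that Google documents.
USER_AGENTS = {
    "browser": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
               "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; "
                 "+http://www.google.com/bot.html)",
}


def fingerprint(html):
    """Hash the response body after collapsing runs of whitespace."""
    normalized = re.sub(r"\s+", " ", html).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def fetch(url, user_agent):
    """Download a page while presenting the given User-Agent (needs network)."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def differs_by_user_agent(url, fetcher=fetch):
    """Return True if the body differs between the configured User-Agents."""
    prints = {fingerprint(fetcher(url, ua)) for ua in USER_AGENTS.values()}
    return len(prints) > 1


# Example usage (not run here):
# print(differs_by_user_agent("http://git-scm.com/docs/git-blame"))
```

A True result is only a hint: per-request content such as CSRF tokens or timestamps would also change the hash. It also says nothing about content hidden or injected by JS, since no scripts are executed.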