Open GoogleCodeExporter opened 8 years ago
This is a serious issue and is affecting the crawler results on our site. Are
there any workarounds or fixes for it that we can use?
Thanks for your input.
Original comment by amit.aro...@gmail.com
on 20 May 2011 at 10:29
You can try to apply this patch as a workaround.
src/common/basefilter.cc
Original comment by okaren...@gmail.com
on 30 Sep 2011 at 4:02
Thanks for the patch. I applied it, recompiled the code on 32-bit CentOS 4.2,
and installed the new version of the sitemap generator, but I'm still getting
404 pages added to the sitemap. Any suggestions?
Original comment by rob.ba...@gmail.com
on 15 Jul 2012 at 2:26
Any info on how to apply this patch would be useful. Thanks.
Original comment by zu...@wsg.co
on 14 Mar 2014 at 5:08
The patch is not necessary if you're running the latest version of GSG. In my
case, 404 pages were included in the sitemaps because the server returned a 200
HTTP response code even though the page was a 404 page. Basically, I was setting
the code to 404 and then changing it to 200 later in the code without noticing.
So the first thing you need to do is find out whether your server actually
returns a 404 status code for the page in question. If not, you have to fix that first.
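One way to run that check is a small script that requests the page and prints the status code the server actually sends. This is only a minimal sketch using Python's standard library. The function name `fetch_status` and the example URL are made up for illustration and have nothing to do with GSG itself. Note that `urlopen` follows redirects, so the code reported is the one for the final page.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code the server sends for a GET of `url`.

    Hypothetical helper for illustration; not part of GSG.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # 2xx responses (after any redirects) end up here.
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx; the code is on the exception.
        return err.code


# Example usage (placeholder URL, replace with a page that should be missing):
# print(fetch_status("http://www.example.com/no-such-page"))
```

If this prints 200 for a page that shows a "not found" message, the server is reporting the page as valid, and a crawler has no way to tell it should be skipped.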
Once that was fixed, I followed these steps to regenerate the sitemaps:
1. Stopped GSG daemon.
2. Removed cache folder from GSG installation path.
3. Removed sitemaps from website folder.
4. Started GSG daemon.
After that, it took some time for GSG to regenerate the sitemaps, since they had
all been deleted, but the new sitemaps do not include 404 pages.
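For anyone who wants to script the four steps, here is a rough sketch in Python. Everything specific in it is an assumption: the stop/start commands, the cache folder, the website folder and the `sitemap*.xml*` file pattern all depend on your installation, so they are passed in as parameters rather than guessed. The function name `regenerate_sitemaps` is made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def regenerate_sitemaps(stop_cmd, start_cmd, cache_dir, sitemap_dir,
                        pattern="sitemap*.xml*"):
    """Stop the daemon, clear its cache and old sitemaps, then restart it.

    All arguments are installation-specific assumptions supplied by the caller:
    stop_cmd / start_cmd are argument lists for subprocess.run, cache_dir is
    the GSG cache folder, sitemap_dir is the website folder holding sitemaps.
    Returns the names of the sitemap files that were removed.
    """
    # 1. Stop the GSG daemon (raises CalledProcessError if the command fails).
    subprocess.run(stop_cmd, check=True)

    # 2. Remove the cache folder; a missing folder is silently ignored.
    shutil.rmtree(cache_dir, ignore_errors=True)

    # 3. Remove existing sitemap files matching the assumed pattern.
    removed = []
    for path in sorted(Path(sitemap_dir).glob(pattern)):
        if path.is_file():
            path.unlink()
            removed.append(path.name)

    # 4. Start the daemon again so it rebuilds the sitemaps from scratch.
    subprocess.run(start_cmd, check=True)
    return removed
```

Check the pattern against your own website folder before running anything like this, since it deletes every file that matches.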
Hope this helps.
Original comment by zu...@wsg.co
on 27 Mar 2014 at 1:57
Original issue reported on code.google.com by
pastordanwalker@gmail.com
on 24 Nov 2010 at 5:14