google-code-export / django-page-cms

Automatically exported from code.google.com/p/django-page-cms
BSD 3-Clause "New" or "Revised" License

browsers can access CMS pages, crawlers/wget/curl get 404 page not found #208

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Create a site using 1.1.3
2. Load up some content
3. Try to visit the page with a browser or wget

What is the expected output? What do you see instead?
I would expect that if Chrome/Firefox/IE can pull up the page, it will work 
universally.

Are you using the master version or a released version of this CMS on the
github repository?
No, I'm using 1.1.3 but willing to upgrade if this is a known problem.  I did 
search through the archives but didn't come up with anything that looks like 
this.

If you can write a test that reproduces the problem, there is a better chance
it will be resolved quickly.

I've got a site up on the internet right now displaying precisely this problem.
http://www.usglobalmail.com/ will pull up in a browser but if you go to 
http://validator.w3.org/ and enter the same URL you'll get a 404.  Same for 
wget or curl.  Remarkably, lynx will pull up a page.

I wiresharked a conversation between the dev server on my local machine and 
several different clients: Chrome (which worked), wget (which didn't), and curl 
(which didn't).  I'm happy to provide other information if necessary.

I'm fairly confident that it's the CMS application that's causing the problem, 
not nginx or django for several reasons.
1.  The static media files pull up just fine.  
http://www.usglobalmail.com/media/static/css/main.css pulls up both in a 
browser or via the w3c validator.
2.  The FAQ application (non-CMS) at http://www.usglobalmail.com/faq/ also 
pulls up in a browser or a crawler.
3.  The contact form (non-CMS) at 
http://www.usglobalmail.com/corporate-contact/ also pulls up irrespective of 
the method used.

It might not be the CMS proper, but perhaps something related?  I just don't 
know enough to figure out where the bug is coming from.  Could it be an 
encoding issue?  I'm not sure.

Any help you can give me is greatly appreciated.

Original issue reported on code.google.com by Vade...@gmail.com on 12 Sep 2010 at 10:15

GoogleCodeExporter commented 9 years ago
Hi,

A quick test here, it seems to be associated with the error 404 that
is raised in the default view:

   if lang not in [key for (key, value) in settings.PAGE_LANGUAGES]:
       raise Http404

wget http://76.12.209.254/individual/why-us <-- 404

wget http://76.12.209.254/individual/why-us --header='Accept-Language: en' <--- 200 ok
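
For anyone following along, here is a minimal sketch of why the check quoted above can raise the 404. It runs without Django and assumes PAGE_LANGUAGES lists only 'en'; the helper name language_is_served is made up for illustration and is not part of the CMS.

```python
# Hypothetical stand-in for settings.PAGE_LANGUAGES (assumed English only).
PAGE_LANGUAGES = (("en", "US English"),)


def language_is_served(lang, page_languages=PAGE_LANGUAGES):
    """Mirror the membership test from the view snippet quoted above."""
    return lang in [key for (key, value) in page_languages]


# A client whose language resolves to plain 'en' passes the check.
print(language_is_served("en"))     # True

# A regional code such as 'en-us' is not an exact match, so the
# view would take the `raise Http404` branch.
print(language_is_served("en-us"))  # False
```

The comparison is an exact string match, so any code that is not listed verbatim in PAGE_LANGUAGES fails it.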

Original comment by batiste....@gmail.com on 12 Sep 2010 at 11:08

GoogleCodeExporter commented 9 years ago
Wow!  I'm super impressed by how quickly you got back to me.

That's a great find.  I've got the CMS set up to use only English.  And wget is 
sending 'en-us' rather than just 'en'.  Curl is also sending 'en-us' so that 
seems to be the problem.

I had some problems very early on using 'en-us' in the settings file for the 
languages.  Should I be doing something differently?  Is there a way to map 
'en-us' to just 'en' elegantly?  

Original comment by Vade...@gmail.com on 12 Sep 2010 at 11:55

GoogleCodeExporter commented 9 years ago
Hi,

For the language mapping you have this:

http://packages.python.org/django-page-cms/settings-list.html#page-language-mapping

But there is definitely something wrong happening. I don't think you should get 
this 404 error ever.

If you find out exactly what is the source of the error, please tell me so I 
can try to fix it in the CMS.

Original comment by batiste....@gmail.com on 13 Sep 2010 at 8:59

GoogleCodeExporter commented 9 years ago
Here is a change to the way the CMS deals with languages that might solve your 
issue.

http://github.com/batiste/django-page-cms/commit/16be87d5003a3114baf0ebed746aa85430120c56

But I still believe that there is something weird in your config. If 
PAGE_LANGUAGES != LANGUAGES, you should make sure that the mapping function 
maps all the extra languages to a language listed in the PAGE_LANGUAGES 
setting.
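
A rough sketch of what such a mapping function could look like, assuming the language mapping setting accepts a callable that takes a language code and returns one, as the settings page linked earlier describes. The function name, the fallback to 'en', and the English-only PAGE_LANGUAGES are assumptions for illustration, not the CMS's own code.

```python
# Assumed English-only configuration, for illustration.
PAGE_LANGUAGES = (("en", "US English"),)
FALLBACK_LANGUAGE = "en"  # hypothetical default, mirrors PAGE_DEFAULT_LANGUAGE


def language_mapping(lang):
    """Map a client language code onto one listed in PAGE_LANGUAGES."""
    served = [key for (key, value) in PAGE_LANGUAGES]
    if lang in served:
        return lang
    # Strip the region part: 'en-us' -> 'en', 'en-GB' -> 'en'.
    base = lang.split("-")[0].lower()
    if base in served:
        return base
    # Anything else is sent to the fallback language.
    return FALLBACK_LANGUAGE


# In settings.py this would be wired up with something like:
# PAGE_LANGUAGE_MAPPING = language_mapping

print(language_mapping("en-us"))  # en
print(language_mapping("fr"))     # en
```

With a function like this, every code that can arrive from a client ends up on a language the CMS can actually serve.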

Original comment by batiste....@gmail.com on 13 Sep 2010 at 9:34

GoogleCodeExporter commented 9 years ago
Yes, I think you're right that I'm doing something wrong.  Here's the relevant 
portion of my settings.py file:

PAGE_DEFAULT_LANGUAGE = 'en'

# page languages
PAGE_LANGUAGES = (
    ( 'en', gettext('US English')),
)

Original comment by Vade...@gmail.com on 13 Sep 2010 at 3:58

GoogleCodeExporter commented 9 years ago
Did you manage to fix the issue? Can I close this bug?

Original comment by batiste....@gmail.com on 24 Sep 2010 at 8:12

GoogleCodeExporter commented 9 years ago
I just ran across this bug. The new code has fixed the issue here. Thanks!

Original comment by jonrh...@gmail.com on 5 Oct 2010 at 1:48

GoogleCodeExporter commented 9 years ago

Original comment by batiste....@gmail.com on 9 Oct 2010 at 9:21