st4lk / django-solid-i18n-urls

[DEPRECATED] Use default language for urls without language prefix.
http://www.lexev.org/en/
Other
112 stars 29 forks source link

Fails on slugs starting with a language name #18

Closed mchccc closed 8 years ago

mchccc commented 9 years ago

I'm using this rule to match slugs:

url(r'^(?P<segments>(([\w-]+)\/)+)$', views.page_by_slug, name='page_slug')

and I'm getting 404 on URLs that have a space (converted to a dash) after characters that happen to be a language name, like this:

http://localhost:8000/it-dsfgggg/

If I remove the dash, or even simply move it, the view is correctly found. Is there anything I could do to avoid this? Thanks!

st4lk commented 9 years ago

Hi, thanks for reporting! Can you precise version of solid-i18n, version of django, solid-i18n settings (if you use some)

mchccc commented 9 years ago

I'm using solid-i18n==0.9.1 and Django==1.7.7, settings activated are SOLID_I18N_USE_REDIRECTS = False and SOLID_I18N_DEFAULT_PREFIX_REDIRECT = True. Thank you!

st4lk commented 9 years ago

As i understand, this is a feature of django i18n itself. If we'll use django's default i18n process, i.e. LocaleMiddleware and i18n_patterns we'll still get described behaviour with 404 response. Example: settings.py

LANGUAGE_CODE = 'it'
LANGUAGES = (
    ('it', 'Italian'),
    ('en', 'English'),
)

urls.py:

...
from django.conf.urls.i18n import i18n_patterns
urlpatterns = i18n_patterns('',
    url(r'^(?P<segments>(([\w-]+)\/)+)$', views.page_by_slug, name='page_slug'),
)

Request http://localhost:8000/it-dsfgggg/ will result in 404.

It happens, because language tag can contain both language subtag and region subtag divided by dash. For example, this is normal language tag: en-US. So, when django discover language from path, it takes /it-dsfgggg/ and first tries to find it-dsfgggg supported language tag. When it fails, it simple takes generic language subtag, which is it. After that django check url if it contains found language: /it/. Our url is not starts with this path and we get 404.

What we can do with this?

Well, one way is to forbid django to look for region subtags. For this, modify solid_i18n/middleware.py.

Add more strict function for getting language from path:

import re
language_code_generic_only_prefix_re = re.compile(r'^/([\w]+)(/|$)')

def get_language_from_path_generic_only(path):
    if language_code_generic_only_prefix_re.match(path):
        language_from_path = trans.get_language_from_path(path)
    else:
        language_from_path = None
    return language_from_path

And replace all occurrence of trans.get_language_from_path to get_language_from_path_generic_only in solid_i18n/middleware.py (i see two).

Haven't tested it in many scenarios, but seems like it works. I don't know, should we add this in package, as such case is rare i suppose. But it is worth thinking, maybe it will be useful to add some setting, that will enable this strict path checking.

mchccc commented 9 years ago

Thank you very much for the detailed answer! I'm not used to subtags so I did not think about them, but it makes perfect sense. If the application does not need this level of detailed translation, I think it might make sense to disable them, but on the other hand it is reasonable to avoid having language names in the slug.. We'll be thinking this over, but thanks a lot for your precious help!

DrMeers commented 8 years ago

I just got bitten by this, wondering why /my-slug/ was activating Burmese!

Django's get_language_from_path function accepts a strict argument which defaults to False -- perhaps we should add a setting to allow passing True to this?

DrMeers commented 8 years ago

However ideally we'd also pass strict to get_language_from_request for consistency, but for some reason this option does not exist in Django's source code.