CTPUG / wafer

A wafer-thin web application for running small conferences. Built using Django.
ISC License

Talks with non-Latin titles generate empty slugs #479

Open andrewshadura opened 5 years ago

andrewshadura commented 5 years ago

When I create a talk with a non-Latin title, for example Тестовый доклад, the generated slug is an empty string, which prevents the user submitting the talk from saving it:

NoReverseMatch at /talks/new/

Reverse for 'wafer_talk' with keyword arguments '{'pk': 2, 'slug': ''}' not found. 1 pattern(s) tried: ['talks/(?P<pk>\\d+)(?:-(?P<slug>[\\w-]+))?/$']

You may want to use, for example, the transliterate module to transliterate the title when generating the slug.

Alternatively the pattern can be modified to allow an empty slug.
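To illustrate the second option, here is a minimal sketch using only the standard library `re` module and the pattern from the error message (with the repr's doubled backslashes undone). The helper `talk_path` is hypothetical and not part of wafer. The sketch shows that the slug group already is optional as a whole, so dropping the `-slug` suffix for an empty slug yields a path the existing pattern accepts.

```python
import re

# Pattern copied from the NoReverseMatch message above.
TALK_URL = re.compile(r"talks/(?P<pk>\d+)(?:-(?P<slug>[\w-]+))?/$")


def talk_path(pk, slug):
    """Hypothetical helper: omit the '-slug' suffix when the slug is empty."""
    return f"talks/{pk}-{slug}/" if slug else f"talks/{pk}/"


# A dangling hyphen (empty slug) is rejected, because [\w-]+ needs
# at least one character after the hyphen.
rejected = TALK_URL.match("talks/2-/")

# Leaving the slug out entirely is accepted; the slug group stays unset.
accepted = TALK_URL.match(talk_path(2, ""))
```

Whether the view should be reversed without a slug or the pattern itself should be relaxed is a design choice this sketch does not settle.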

drnlm commented 4 years ago

Crashing on an empty slug is something we should fix.

We should also do something better with Unicode talk titles. I'm not sure what the best approach is, though.

People can add arbitrary Unicode characters to the talk title anyway, so I don't think transliterate is a good solution here, given its design goals.

We could use the allow_unicode option of django.utils.text.slugify and create URLs with Unicode characters. That should work, although the percent-encoded URL isn't going to be particularly readable.
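As a rough illustration of that trade-off, the sketch below uses only the standard library. `slugify_sketch` is a hypothetical stand-in that approximates what Django's `slugify` does; it is not the real function and may differ in edge cases.

```python
import re
import unicodedata
from urllib.parse import quote


def slugify_sketch(value, allow_unicode=False):
    """Approximation of django.utils.text.slugify (assumption, not the real code)."""
    if allow_unicode:
        value = unicodedata.normalize("NFKC", value)
    else:
        # Characters without an ASCII decomposition are dropped.
        value = (
            unicodedata.normalize("NFKD", value)
            .encode("ascii", "ignore")
            .decode("ascii")
        )
    value = re.sub(r"[^\w\s-]", "", value.lower())
    return re.sub(r"[-\s]+", "-", value).strip("-_")


title = "Тестовый доклад"
ascii_slug = slugify_sketch(title)                        # nothing survives ASCII folding
unicode_slug = slugify_sketch(title, allow_unicode=True)  # Cyrillic letters are kept
encoded = quote(unicode_slug)                             # percent-encoded form of the slug
```

The Unicode slug stays readable wherever the URL is displayed decoded, while the percent-encoded form is what appears when it is copied as plain ASCII.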

Something like text-unidecode is an alternative that will produce more ASCII-friendly URLs, although it will also produce character strings that aren't particularly meaningful for many languages.

andrewshadura commented 4 years ago

Why do you think transliteration isn't a good solution? A lot of CMSes do exactly that.

drnlm commented 4 years ago

I mean that the "transliterate" module (https://pypi.org/project/transliterate/) isn't a good solution: it supports a very limited set of languages, requires knowing the language of the title ahead of time to get good results, and doesn't sensibly handle arbitrary non-language Unicode characters.

Nothing prevents someone from creating a talk titled "♠♡♢", for example, and we need to handle that with whatever solution we come up with.
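One way to cover that case, sketched with the standard library only: fall back to a fixed placeholder whenever slug generation produces nothing. Both `ascii_slug` and `safe_slug`, and the placeholder `"talk"`, are hypothetical names chosen for illustration; `ascii_slug` only approximates an ASCII-only slugify.

```python
import re
import unicodedata


def ascii_slug(title):
    """ASCII-only slug; returns '' when no character survives (approximation)."""
    folded = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    cleaned = re.sub(r"[^\w\s-]", "", folded.lower())
    return re.sub(r"[-\s]+", "-", cleaned).strip("-_")


def safe_slug(title, fallback="talk"):
    """Hypothetical policy: never return an empty slug."""
    return ascii_slug(title) or fallback
```

Because the URL pattern also carries the talk's pk, a shared placeholder slug would not make two talks collide.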

andrewshadura commented 4 years ago

Well, you need to know the language to get good results, since languages with the same alphabet often have different transliteration rules. E.g. my first name transliterated from Russian is Andrey, but it's Andrej if transliterated from Belarusian.

But maybe potentially bad transliteration is better than none. I don't know, I'm not sure.

hodgestar commented 4 years ago

What if we added a hook so that each site could override the slug function if needed? There could be a set of default functions based on site language.
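A minimal sketch of such a hook, assuming a hypothetical setting named `WAFER_TALK_SLUG_FUNCTION` that holds a dotted path to a callable. Neither the setting nor the helpers below exist in wafer; in a Django project, `django.utils.module_loading.import_string` could replace `load_callable`.

```python
from importlib import import_module


def load_callable(dotted_path):
    """Resolve 'package.module.name' to the object it names."""
    module_path, _, name = dotted_path.rpartition(".")
    return getattr(import_module(module_path), name)


def default_slug(title):
    """Placeholder default: lowercase words joined by hyphens."""
    return "-".join(title.lower().split())


def get_slug_function(settings):
    """Return the site's override if configured, else the default."""
    # WAFER_TALK_SLUG_FUNCTION is a hypothetical setting name.
    path = settings.get("WAFER_TALK_SLUG_FUNCTION")
    return load_callable(path) if path else default_slug
```

A site would then point the setting at its own function, for example one that transliterates from the site's main language.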

stefanor commented 3 years ago

Talks now have a language attribute. We could transliterate from that language.
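A toy sketch of what dispatching on the talk's language could look like. The tables below are illustrative assumptions that cover only the letters of the name example earlier in this thread; they are not real transliteration standards, and `transliterate_for` is a hypothetical helper that takes the language code as a plain argument.

```python
# Toy per-language tables (assumption): just enough letters for the example.
TABLES = {
    "ru": {"а": "a", "н": "n", "д": "d", "р": "r", "е": "e", "й": "y"},
    "be": {"а": "a", "н": "n", "д": "d", "р": "r", "э": "e", "й": "j"},
}


def transliterate_for(language, text):
    """Map each character through the table for `language`, if there is one."""
    table = TABLES.get(language, {})
    # Characters missing from the table are passed through unchanged.
    return "".join(table.get(ch, ch) for ch in text.lower())


russian = transliterate_for("ru", "Андрей")
belarusian = transliterate_for("be", "Андрэй")
```

Titles in a language without a table pass through unchanged here, so an empty-slug fallback would still be needed after slugifying.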