jazzband / django-taggit

Simple tagging for django
https://django-taggit.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Some unicode tags return empty slug #347

Open tpeaton opened 9 years ago

tpeaton commented 9 years ago

Certain Unicode tag names don't yield a usable slug:

>>> obj.tags.add('(ง’̀’́)ง')
>>> obj.tags.all()
[<Tag: ง’̀’́)ง>]
>>> obj.tags.all()[0].slug
u''

We've solved this internally by doing this:

from shortuuid import uuid

name = data['name'].encode('ascii', 'ignore').decode('utf-8')
if not taggit_slugify(name):
    data['slug'] = 'tag-{}'.format(uuid())

Seem reasonable? If so, I can add it to this method and create a PR.
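
For anyone who wants to try the idea outside a project, here is a minimal standalone sketch of the same fallback. It makes a few assumptions: `ascii_slugify` is a simplified stand-in written for this example (it is not taggit's or Django's slugify), `slug_with_fallback` is a hypothetical helper name, and the standard library's `uuid.uuid4()` is swapped in for `shortuuid.uuid()` so the snippet has no third-party dependencies.

```python
import re
import unicodedata
import uuid


def ascii_slugify(value: str) -> str:
    """Simplified ASCII-only slugify stand-in (an assumption, not library code)."""
    # Decompose accented characters, then drop everything that is not ASCII.
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    # Remove punctuation, then collapse whitespace and hyphens into single hyphens.
    value = re.sub(r"[^\w\s-]", "", value.lower())
    return re.sub(r"[-\s]+", "-", value).strip("-_")


def slug_with_fallback(name: str) -> str:
    """Return a slug for `name`, or a random 'tag-<hex>' slug if nothing survives."""
    slug = ascii_slugify(name)
    if not slug:
        # uuid4().hex stands in for shortuuid.uuid() from the snippet above.
        slug = "tag-{}".format(uuid.uuid4().hex)
    return slug


print(slug_with_fallback("Hello World"))  # hello-world
print(slug_with_fallback("(ง’̀’́)ง"))      # tag- followed by 32 random hex characters
```

The trade-off is that the fallback slug is random, so it carries no hint of the tag name and differs each time it is generated.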

frewsxcv commented 9 years ago

Do you have 'unidecode' installed?

tpeaton commented 9 years ago

I do not.

frewsxcv commented 9 years ago

So if you install that, it might fix this issue. We should probably add a note about it in the docs somewhere.

Relevant PR: https://github.com/alex/django-taggit/pull/315
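
A rough sketch of why installing the package alone could change the result, assuming unidecode is treated as an optional dependency with a pass-through fallback. The names `HAVE_UNIDECODE` and `to_ascii_if_possible` are hypothetical and only illustrate the pattern; this is not taggit's actual code.

```python
try:
    # Third-party package: pip install unidecode
    from unidecode import unidecode
    HAVE_UNIDECODE = True
except ImportError:
    HAVE_UNIDECODE = False

    def unidecode(text):
        # Fallback when the package is missing: return the text unchanged,
        # so non-ASCII characters reach the slugify step untouched.
        return text


def to_ascii_if_possible(tag_name):
    """Transliterate to ASCII when unidecode is installed, else pass through."""
    return unidecode(tag_name)


print("unidecode installed:", HAVE_UNIDECODE)
print(to_ascii_if_possible("django"))  # plain ASCII input comes back unchanged
```

With this pattern, a missing package never raises an error. The transliteration step just silently does nothing, which would match the symptom of an empty slug with no warning.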

tpeaton commented 9 years ago

That seems to work fine, thanks!

uksmartsolutions commented 2 years ago

Yes, installing "unidecode" fixed the issue (pip install unidecode).

AliIslamov commented 1 year ago

I have installed unidecode, but my Cyrillic tags still have empty slugs.

rtpg commented 1 year ago

@AliIslamov could you provide a snippet of how you are creating your tags, along with an assertion that the slug is indeed the empty string?

In particular, could you try things like calling tag.slugify(tag_string) to confirm that the slugify method is returning an empty string?
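
Something along these lines would be enough as a first check. It is only a sketch: `names_with_empty_slug` is a hypothetical helper, and it assumes nothing more than objects exposing `name` and `slug` attributes, so `SimpleNamespace` objects stand in for real tags here.

```python
from types import SimpleNamespace


def names_with_empty_slug(tags):
    """Return the names of all tags whose slug is empty or missing."""
    return [tag.name for tag in tags if not tag.slug]


# Stand-in objects for illustration; in a Django shell you would pass a
# queryset of tags instead (e.g. Tag.objects.all() with Tag from taggit.models).
sample_tags = [
    SimpleNamespace(name="django", slug="django"),
    SimpleNamespace(name="тег", slug=""),
]

print(names_with_empty_slug(sample_tags))  # ['тег']
```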

rtpg commented 1 year ago

I would like to add more tests or something here to work through this issue, but I am having a hard time working out which part of the system is failing.

AliIslamov commented 1 year ago

> @AliIslamov could you provide a snippet of how you are creating your tags, along with an assertion that the slug is indeed the empty string?
>
> In particular, could you try things like calling tag.slugify(tag_string) to confirm that the slugify method is returning an empty string?

I have already solved my problem.

1) Added this to settings.py:

TAGGIT_STRIP_UNICODE_WHEN_SLUGIFYING = True

2) Installed unidecode:

pip install unidecode

3) Added this import to my application's models.py:

from unidecode import unidecode

And now everything works well 👍
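
To illustrate why a transliteration step matters for Cyrillic names, here is a rough sketch of the slug modes involved. Everything in it is an assumption made for illustration: `simple_slugify` is a simplified stand-in loosely modeled on an ASCII-only versus Unicode-aware slugify, not taggit's actual code, and `toy_transliterate` is a hypothetical three-letter mapping standing in for what unidecode does for the whole alphabet.

```python
import re
import unicodedata


def simple_slugify(value, allow_unicode=False):
    """Simplified slugify stand-in (an assumption, not library code)."""
    if allow_unicode:
        value = unicodedata.normalize("NFKC", value)
    else:
        # ASCII-only mode: any character without an ASCII form is dropped.
        value = unicodedata.normalize("NFKD", value)
        value = value.encode("ascii", "ignore").decode("ascii")
    value = re.sub(r"[^\w\s-]", "", value.lower())
    return re.sub(r"[-\s]+", "-", value).strip("-_")


# Hypothetical mini transliteration table; unidecode covers far more.
TOY_TABLE = str.maketrans({"т": "t", "е": "e", "г": "g"})


def toy_transliterate(text):
    return text.lower().translate(TOY_TABLE)


print(repr(simple_slugify("тег")))                      # '' (all letters dropped)
print(repr(simple_slugify("тег", allow_unicode=True)))  # 'тег'
print(repr(simple_slugify(toy_transliterate("тег"))))   # 'teg'
```

In this sketch, an ASCII-only slugify with no transliteration in front of it is the one case that produces an empty slug.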

AliIslamov commented 1 year ago

> I would like to add more tests or something here to work through this issue but am having a hard time conceptualizing at what part of the system this is failing

We can conclude that with the default settings, where Unicode slugs are enabled, Cyrillic tags get an empty slug.

The fix is to force Unicode-to-ASCII conversion (TAGGIT_STRIP_UNICODE_WHEN_SLUGIFYING = True), install the unidecode module, and import it in the models.py file that contains the TaggableManager() field.

In my opinion, it makes sense to add these instructions to the documentation.