flebel / django-tagging

Automatically exported from code.google.com/p/django-tagging

Enhancement: Tags with more than one word #43

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I love django-tagging, but I had the need for tags with multiple words. For
example, if I wanted to tag an item as "stained glass", but didn't want it to
break into two tags, I would have to use "stained-glass" or "stainedglass",
neither of which I liked very much.

With just two tiny changes in utils.py, I was able to add the ability to
designate a multiple word string as a single tag using quotes, without (as far
as I can tell) affecting the previous functionality. So now, the user can type
'window "stained glass" blue' (without single quotes) and the generated tags
will be "window", "stained glass", and "blue".

Here's the changed section of my utils.py:
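import re

from django.conf import settings

# Match a double-quoted multi-word tag, or a single unquoted word.
find_tag_re = re.compile(r'"[-\w\s]+"|[-\w]+', re.U)

def get_tag_name_list(tag_names):
    """
    Find tag names in the given string and return them as a list.
    """
    if not isinstance(tag_names, unicode) and tag_names is not None:
        tag_names = tag_names.decode(settings.DEFAULT_CHARSET)
    # Collapse runs of whitespace so quoted tags keep single spaces.
    # This now runs for unicode input too, and tolerates None.
    tag_names = re.sub(r'\s+', ' ', tag_names or '')
    results = find_tag_re.findall(tag_names)
    return [item.encode(settings.DEFAULT_CHARSET).replace('"', '') for item in results]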

Note that I added the re.sub line to get rid of extra spaces that might crop up 
from this.
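
As a rough illustration of what the pattern does, here is a minimal Python 3 sketch of the same idea. It leaves out the charset handling and the Django settings, so it is an approximation of the patch for experimenting, not the code in django-tagging itself.

```python
import re

# Same pattern as in the patch: a double-quoted phrase, or a single word.
find_tag_re = re.compile(r'"[-\w\s]+"|[-\w]+', re.U)

def get_tag_name_list(tag_names):
    """Python 3 approximation of the patched function (no charset handling)."""
    # Collapse runs of whitespace so quoted tags keep single spaces.
    tag_names = re.sub(r'\s+', ' ', tag_names or '')
    # Strip the surrounding quotes from quoted matches.
    return [item.replace('"', '') for item in find_tag_re.findall(tag_names)]

print(get_tag_name_list('window "stained   glass" blue'))
# ['window', 'stained glass', 'blue']
```

An unbalanced quote simply falls back to the old behaviour: '"stained glass' yields "stained" and "glass" as two tags.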

Hope this is useful to someone else.

Branton

Original issue reported on code.google.com by sansm...@gmail.com on 2 Jun 2007 at 9:52

GoogleCodeExporter commented 9 years ago
> Hope this is useful to someone else.

Of course, thanks for sharing.

Original comment by jefferso...@googlemail.com on 3 Jun 2007 at 11:11

GoogleCodeExporter commented 9 years ago
I hope this patch will be merged.

I also use clean_data() in my forms to apply some formatting to the tag string.

Original comment by samuel.a...@gmail.com on 4 Jun 2007 at 10:24

GoogleCodeExporter commented 9 years ago
The reason this hasn't been merged, and that I haven't marked the other ticket
related to this feature as started, is that this isn't the only change that
needs to be made for this feature, and there are also some important design
decisions to be made. Off the top of my head:

- Validation, both in tagging.validators and in tagging.forms.TagField, will
need to be updated.
- The Tag model will require an update, as with this patch alone you wouldn't
be able to edit the new multi-word tags in the admin application. Its save
method would also have to remove any quotes present in valid tags for the admin.
- tagging.fields.TagField would need to be updated, as given a tag input
'test "some tag"', it would return 'test some tag' when you access tags using
it. How should output for multi-word tags be represented? Should we provide
utilities for working with multi-word tags, something like
django.utils.text.smart_split?
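
One possible answer to the output question in the last point, sketched under assumptions: wrap any tag that contains whitespace in double quotes when building the editable string, so it can be parsed back into the same tags. The helper name tags_to_edit_string is made up for this illustration and is not an existing django-tagging function.

```python
def tags_to_edit_string(tag_names):
    """Hypothetical helper: join tag names into a string that can be re-parsed."""
    parts = []
    for name in tag_names:
        # Wrap any name containing whitespace in double quotes so it
        # survives a round trip through a quote-aware parser.
        if any(ch.isspace() for ch in name):
            parts.append('"%s"' % name)
        else:
            parts.append(name)
    return ' '.join(parts)

print(tags_to_edit_string(['test', 'some tag']))
# test "some tag"
```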

We also need to consider whether or not to do any normalisation - e.g. in
Flickr I can create a "test test" tag, which will for all intents and purposes
behave exactly the same as a "testtest" tag (and I won't be able to add a
"testtest" tag in addition to it), apart from how it's displayed on the screen.
Also, should the type of normalisation used be configurable?

Dealing with removing quotes and normalisation at the Tag.save() level instead 
of in
the get_tag_name_list function (which is deliberately very forgiving compared to
other tag processing methods for ease of input) should let us deal with all 
this in
one place.
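
To make the "one place" idea concrete, here is a minimal sketch of a clean-up function that a save method could call. The name clean_tag_name and its normalise flag are hypothetical, and lower-casing merely stands in for whatever normalisation ends up being chosen.

```python
import re

def clean_tag_name(name, normalise=False):
    """Hypothetical clean-up applied once, just before a tag is saved."""
    # Drop any double quotes left over from quoted input.
    name = name.replace('"', '')
    # Collapse internal whitespace and trim the ends.
    name = re.sub(r'\s+', ' ', name).strip()
    if normalise:
        # Optional, configurable step: here simply lower-casing.
        name = name.lower()
    return name

print(clean_tag_name('"Stained   Glass"'))                  # Stained Glass
print(clean_tag_name('"Stained   Glass"', normalise=True))  # stained glass
```

Keeping this out of the parsing function means the parser can stay forgiving, while every route to a saved tag (admin, forms, code) gets the same treatment.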

I would also if possible like to make this feature itself toggleable using a 
setting.

Original comment by jonathan.buchanan on 4 Jun 2007 at 10:43

GoogleCodeExporter commented 9 years ago
Jonathan,

Thanks for your comment. Like I said, I really wasn't sure if this would affect
anything else, as I clearly haven't read through all of the code of
django-tagging. Maybe in a week or so I'll have more time to check it out more.

Original comment by sansm...@gmail.com on 5 Jun 2007 at 1:12

GoogleCodeExporter commented 9 years ago
Good idea for a separator setting.

If someone needs multi-word tags, they could just use the comma separator.

Original comment by samuel.a...@gmail.com on 6 Jun 2007 at 12:17

GoogleCodeExporter commented 9 years ago
This is how I deal with multi-word tags.

I only set and get the tags through the model instance, and I tell users to use
commas to separate tags.

import re

from django.db import models
from tagging.models import Tag

class MyModel(models.Model):
    # some fields ...
    tag_list = models.CharField(maxlength=255)

    def save(self):
        # Normalise each tag (trim, lower-case, cap at 40 characters,
        # collapse whitespace), then remove duplicates ( python 2.4+ )
        # and order tags
        cleaned = [re.sub(r'\s+', ' ', tag.strip().lower()[:40])
                   for tag in self.tag_list.split(',')]
        self.tag_list = ', '.join(sorted(set(cleaned)))
        super(MyModel, self).save()
        self.tags = self.tag_list

    def _get_tags(self):
        return Tag.objects.get_for_object(self)
    def _set_tags(self, tag_list):
        Tag.objects.update_tags(self, self._slugify_tags(tag_list))
    tags = property(_get_tags, _set_tags)

    # used inside templates
    # {% for tag in object.url_tags %}
    # <a href="/tag/{{ tag.slug }}/">{{ tag.name }}</a>
    def _url_tags(self):
        tags = []
        for tag in self.tag_list.split(','):
            # tag_list is joined with ', ', so trim the leading space
            tag = tag.strip()
            tags.append({'name': tag, 'slug': self._slugify_tags(tag)})
        return tags
    url_tags = property(_url_tags)

    # convert multi word spaces to hyphens
    # in my intl version, i also remove accents from the words here
    def _slugify_tags(self, tag_string):
        return re.sub(r'(?<!,)\s+', '-', tag_string.strip())

Original comment by samuel.a...@gmail.com on 8 Jun 2007 at 3:27

GoogleCodeExporter commented 9 years ago
Jonathan, what you are describing is tag synonyms. You will find on Flickr that
you can have a "testtest" tag if you put it on a different photo. It will be a
synonym of "test test" (and "t-e-s-t-t-e-s-t"); Flickr just doesn't allow more
than one of them on a single photo.

Multi-word tags can be implemented without implementing synonyms simply by 
making
them reliable to create and edit.  This would make "testtest", "test test" and
"t-e-s-t-t-e-s-t" all separate tags.

Tagging systems support synonyms to make things more findable while allowing 
users
freedom in how they express their tags.  This complicates things quite a bit.  
The
idea is to preserve how users enter tags, while unifying them in tag queries.  

Synonyms for Valentine's Day that users on Flickr have entered include
"valentines-day", "valentine's day" and "valentinesday". The slug is
"valentinesday" and the non-user-specific display name is also "valentinesday"
(there is no reason the slug and display name have to be the same). See
http://www.flickr.com/photos/tags/valentinesday/ and
http://www.flickr.com/photos/anniebby/1030070931/ (which I found through the
first link).
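
A minimal sketch of how such synonyms could be unified for queries while keeping what users typed. The rule used here (lower-case, then drop everything that is not a letter or digit) is an assumption chosen to match the examples above, not Flickr's actual algorithm, and the function names are made up.

```python
import re
from collections import defaultdict

def synonym_key(name):
    """Hypothetical query key shared by all synonyms of a tag."""
    # Lower-case, then drop everything that is not a letter or digit.
    return re.sub(r'[\W_]+', '', name.lower())

def group_synonyms(names):
    """Group the tags as entered by users under their shared key."""
    groups = defaultdict(list)
    for name in names:
        groups[synonym_key(name)].append(name)
    return dict(groups)

print(synonym_key("valentine's day"))   # valentinesday
print(synonym_key("t-e-s-t-t-e-s-t"))   # testtest
```

The original strings stay in the groups for display, and only the key is used for lookups.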

Original comment by craig....@gmail.com on 7 Aug 2007 at 10:01

GoogleCodeExporter commented 9 years ago
I am also very interested in multiple-word tags and would really like them
implemented in this library.

It just seems so very unnatural to me to have to think of how to match multiple
words properly together using quotes, dashes and underscores when all the world
is happily using commas to enumerate things - all the world except geeks, that
is. So I'll wait for some multi-word tagging implementation before I use this
on my site.

I am sorry if I seem a bit annoyed about this. I didn't want to insult anyone;
it is just sad to keep seeing these ",',-,_ debates, and it feels like aliens
are invading us, taking our beautiful language away from us. Shakespeare,
Dickens, Wordsworth and others must be having a hard time using the tagging
software.

Original comment by peter.k...@gmail.com on 16 Aug 2007 at 5:22

GoogleCodeExporter commented 9 years ago
I would actually prefer having more than one input field, one for each tag
(like Google's labels), or only allowing the user to add one tag at a time
(with 'enter' sending an AJAX request and letting the user enter another one).
Then there is no need to find some weird separator character.

How about letting get_tag_name_list simply return tag_names if it is already a
list? This way callers have full flexibility (although they have to normalize
the tags on their own if they need to) but can still use the nice, simple
update_tags method.
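
The suggestion amounts to a pass-through for lists. Here is a minimal Python 3 sketch, using a simplified word-only pattern in place of the real one in utils.py:

```python
import re

# Simplified stand-in for the pattern in utils.py: single words only.
find_tag_re = re.compile(r'[-\w]+', re.U)

def get_tag_name_list(tag_names):
    # Lists and tuples are passed through as-is: the caller has already
    # split (and, if needed, normalized) the tags.
    if isinstance(tag_names, (list, tuple)):
        return list(tag_names)
    return find_tag_re.findall(tag_names or '')

print(get_tag_name_list(['stained glass', 'blue']))  # ['stained glass', 'blue']
print(get_tag_name_list('stained glass blue'))       # ['stained', 'glass', 'blue']
```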

Original comment by herbert....@gmail.com on 21 Aug 2007 at 9:25

GoogleCodeExporter commented 9 years ago
Just curious about the time frame of something like this making it into the
code. I'm sure I can hack together something that allows multi-word tags, but I
would love to be able to stick with the trunk of django-tagging for easy
updatability.

Original comment by drack...@gmail.com on 26 Oct 2007 at 9:54

GoogleCodeExporter commented 9 years ago
Just a minor nit on the above comments - any separator 'setting' should be at 
the
field level, right?

Global settings will bite you when the time comes to integrate different 
applications
in the same django site (real world case: admin app vs existing user-visible 
app).
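
A sketch of what field-level configuration could look like, in plain Python rather than as a real Django field. The TagParser class and its delimiter argument are hypothetical and only illustrate that two apps on one site can use different separators without sharing a global setting.

```python
class TagParser:
    """Hypothetical per-field parser; not part of django-tagging."""

    def __init__(self, delimiter=' '):
        # Each field carries its own delimiter instead of reading a
        # site-wide setting.
        self.delimiter = delimiter

    def parse(self, text):
        parts = (part.strip() for part in (text or '').split(self.delimiter))
        # Skip the empty pieces left by repeated delimiters.
        return [part for part in parts if part]

admin_tags = TagParser(delimiter=' ')   # e.g. the admin app
site_tags = TagParser(delimiter=',')    # e.g. the user-visible app

print(admin_tags.parse('stained  glass blue'))  # ['stained', 'glass', 'blue']
print(site_tags.parse('stained glass, blue'))   # ['stained glass', 'blue']
```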

Original comment by tobu...@gmail.com on 26 Oct 2007 at 10:07

GoogleCodeExporter commented 9 years ago

Original comment by jonathan.buchanan on 12 Jan 2008 at 1:14

GoogleCodeExporter commented 9 years ago
Added in revision 114 - nice and simple to start with. Double quotes and commas 
work
as you'd (hopefully) expect and anything goes for tag contents (for now).
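
For readers who want a feel for that behaviour, here is a minimal Python 3 sketch of a parser that accepts both double quotes and commas. It is not the code from revision 114. The rules are assumptions: quoted phrases are always single tags, and the rest is split on commas if any are present, otherwise on whitespace. The sorted, de-duplicated result is also just a choice made for this sketch.

```python
import re

quoted_re = re.compile(r'"([^"]+)"')

def parse_tags(text):
    """Hypothetical parser: quotes group words, commas or spaces separate."""
    text = text or ''
    # Pull out double-quoted phrases first; they are always single tags.
    tags = [phrase.strip() for phrase in quoted_re.findall(text)]
    # Blank out the quoted phrases, then split what is left.
    rest = quoted_re.sub(' ', text)
    pieces = rest.split(',') if ',' in rest else rest.split()
    for piece in pieces:
        # Drop stray quotes and collapse whitespace inside each piece.
        tags.append(re.sub(r'\s+', ' ', piece.replace('"', '')).strip())
    # Drop empties and duplicates; sort for a stable result.
    return sorted(set(tag for tag in tags if tag))

print(parse_tags('window "stained glass" blue'))  # ['blue', 'stained glass', 'window']
print(parse_tags('stained glass, blue'))          # ['blue', 'stained glass']
```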

Original comment by jonathan.buchanan on 12 Jan 2008 at 2:20