Exception is thrown when using non-ascii challenge/response

ojengwa / django-simple-captcha

Automatically exported from code.google.com/p/django-simple-captcha

MIT License


Exception is thrown when using non-ascii challenge/response #1

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
1. Create your own challenge generator
2. Return a non-ascii challenge/response pair

What is the expected output? What do you see instead?
Expected: A captcha with some non-ascii characters.
See: Exception: 'ascii' codec can't encode characters in position 0-7: ordinal not in range(128)

What version of the product are you using? On what operating system?
trunk

Please provide any additional information below.
A small patch would fix it and hopefully won't break anything else :)

Original issue reported on code.google.com by miz...@gmail.com on 6 Feb 2009 at 3:01

Attachments:

django-simple-captcha-utf8.patch

GoogleCodeExporter commented 9 years ago

I don't think this solves the problem. I'm using a Brazilian dictionary to make the challenge words and I don't want any non-alphanumeric characters in the word, because the symbols can be confused with the noise.

I think the solution is to change the word_challenge function so that it discards these non-ASCII characters or redraws the word until it is purely alphanumeric. What about the code below:

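import random
from django.conf import settings

def valid_word_challenge():
    # Read the dictionary once, then draw until a purely alphanumeric word comes up.
    fd = open(settings.CAPTCHA_WORDS_DICTIONARY, 'rb')
    words = fd.readlines()
    fd.close()
    while True:
        # randint() is inclusive on both ends, so the upper bound is len(words) - 1
        word = words[random.randint(0, len(words) - 1)].strip()
        if word.isalnum():
            return word

def word_challenge():
    # Challenge is shown in upper case; the expected response is lower case.
    word = valid_word_challenge()
    return word.upper(), word.lower()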

Original comment by miche...@gmail.com on 1 Apr 2009 at 8:27

GoogleCodeExporter commented 9 years ago

I don't think yours is the same issue, although I see what you mean :) I'd suggest you create a custom generator that reads a word list and strips whatever you feel needs to be stripped. In that case you don't even need my patch.
Latin isn't the only alphabet around; that's why we should use unicode and utf-8.
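
For illustration, a custom generator along those lines could look roughly like the sketch below. It is written for Python 3, and the names strip_accents and make_word_challenge are hypothetical, not part of django-simple-captcha. It assumes that "stripping" means dropping accents, which is only one possible choice.

```python
import random
import unicodedata

def strip_accents(word):
    # Decompose accented letters into base letter + combining mark,
    # then drop the combining marks.
    decomposed = unicodedata.normalize('NFKD', word)
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))

def make_word_challenge(words, rng=random):
    # Hypothetical factory: returns a zero-argument challenge function
    # that draws from the given word list.
    def challenge():
        word = strip_accents(rng.choice(words))
        return word.upper(), word.lower()
    return challenge
```

With a one-word list such as ['coração'], the generated challenge function returns the pair ('CORACAO', 'coracao').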

Original comment by miz...@gmail.com on 1 Apr 2009 at 9:22

GoogleCodeExporter commented 9 years ago

Hi mizish,

You are right! After commenting on your patch, I see that my solution is to have my custom dictionary with only ASCII words or, as you suggested, a custom generator.

Sorry about that ;)

Original comment by miche...@gmail.com on 2 Apr 2009 at 10:49

GoogleCodeExporter commented 9 years ago

Sorry for the long overdue reply on this.

I just checked and couldn't reproduce this issue, using the following challenge:

import random

def unicode_challenge():
    # Build a 4-character challenge from accented (non-ASCII) vowels
    chars, ret = u'äàáëéèïíîöóòüúù', u''
    for i in range(4):
        ret += random.choice(chars)
    return ret.upper(), ret

Care to test that, and maybe post your own challenge function?

Also, where exactly is the exception thrown?

Original comment by mbonetti on 27 Apr 2009 at 2:11

GoogleCodeExporter commented 9 years ago

Well, I'm back to the project where I use django-simple-captcha.

Exception is also thrown with the function you provided.

  File "../captcha/models.py", line 24, in save
    self.hashkey = sha.new(str(self.challenge) + str(self.response)).hexdigest()
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)

You are probably not seeing this exception because your Python and/or system setup is different (I use Python 2.4 on RHEL5) and you have utf-8 as the default codec. Instead of str() you should use django.utils.encoding.smart_str(). This is exactly what the Django documentation suggests.
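
To make the idea concrete, here is a minimal sketch of hashing after an explicit UTF-8 encode. It is written for Python 3 and uses hashlib in place of the old sha module; the explicit .encode('utf-8') stands in for what smart_str() did on Python 2. The function name make_hashkey is hypothetical and is not the code in captcha/models.py.

```python
import hashlib

def make_hashkey(challenge, response):
    # Encode explicitly instead of relying on the interpreter's default codec,
    # so non-ASCII text hashes the same way on every system.
    data = (challenge + response).encode('utf-8')
    return hashlib.sha1(data).hexdigest()
```

Called with a non-ASCII pair such as (u'ÄÀÁË', u'äàáë'), this returns a 40-character hex digest instead of raising UnicodeEncodeError.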

Original comment by miz...@gmail.com on 24 Sep 2009 at 8:28

GoogleCodeExporter commented 9 years ago

Okay, this should be fixed as of r41.

Original comment by mbonetti on 8 Dec 2009 at 2:55

Changed state: Fixed