jceb / vim-orgmode

Text outlining and task management for Vim based on Emacs' Org-Mode
http://www.vim.org/scripts/script.php?script_id=3642

Problem with dates and UTF-8 strings #230

Closed Draiken closed 8 years ago

Draiken commented 8 years ago

Hi there!

I started using the plugin and soon got an error when attempting to get a week agenda view. The Vim calendar dates are generated in my locale (pt_BR in this case), so the string representation of Saturday is "Sáb".

This caused errors in the highlighting (the text wasn't recognized as a date), and those dates weren't picked up in the agenda view.

My first attempt was to just change the string to "Sat", for example, but that didn't help. Upon further investigation I found the problem: when strftime is used in the OrgDate classes, it does not handle UTF-8 strings in date representations. I'm not a Python expert, but from googling around this appears to be a known issue in some standard libraries.

I have managed to build a failing test and a solution specifically for my agenda view, but I suspect this problem would crop up anywhere strftime is used.

Here is the error trace:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/felipe/.vim/bundle/vim-orgmode/ftplugin/orgmode/plugins/Agenda.py", line 143, in list_next_week
    cls.list_next_week_for(agenda_documents)
  File "/home/felipe/.vim/bundle/vim-orgmode/ftplugin/orgmode/plugins/Agenda.py", line 175, in list_next_week_for
    if unicode(h.active_date)[1:11] != unicode(last_date)[1:11]:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 13: ordinal not in range(128)
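
The byte 0xc3 in the trace is the first byte of "á" in UTF-8, and position 13 is where that character would sit in an active timestamp such as "<2016-05-07 Sáb>". A minimal sketch that reproduces the same decode failure outside the plugin; the date is illustrative and not taken from the report, and the byte string only stands in for what strftime would presumably return under a pt_BR UTF-8 locale in Python 2:

```python
# -*- coding: utf-8 -*-
# Illustrative timestamp (assumed, not from the original report): UTF-8 bytes
# of the kind strftime('<%Y-%m-%d %a>') would yield under a pt_BR locale.
raw = u"<2016-05-07 Sáb>".encode("utf-8")

try:
    # Decoding UTF-8 bytes with the ASCII codec is what the implicit
    # unicode() conversion amounts to when the default encoding is ASCII.
    raw.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(error)
```

Decoding the same bytes with "utf-8" instead of "ascii" succeeds, which is why the explicit encode/decode round trip avoids the error.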

Here is the failing test: https://gist.github.com/Draiken/6c8cfe330d2280d11fd68260c83b7cd2 . Of course, this requires the UTF-8 locale to be installed on the base system, which is not great.

The solution I tried, in the OrgDate class, was to encode the format string as UTF-8 before passing it to strftime and then decode the result back to unicode. This made the test pass:

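def __unicode__(self):
    u"""
    Return a string representation.
    """
    # Python 2: pass strftime a UTF-8 byte string and decode its result,
    # so a localized day name such as "Sáb" comes back as unicode.
    if self.active:
        return self.strftime(u'<%Y-%m-%d %a>'.encode(u'utf-8')).decode(u'utf-8')
    else:
        return self.strftime(u'[%Y-%m-%d %a]'.encode(u'utf-8')).decode(u'utf-8')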
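
The same wrapping can be sketched as a standalone helper that also runs on Python 3, where strftime already accepts and returns text. This is only a sketch under assumptions: `strftime_text` is a hypothetical name that does not exist in vim-orgmode, and the Python 2 branch assumes the locale produces UTF-8 output.

```python
# -*- coding: utf-8 -*-
import datetime
import sys


def strftime_text(date, fmt):
    """Format `date` with the text format `fmt` and return text.

    Hypothetical helper for illustration; not part of vim-orgmode.
    """
    if sys.version_info[0] < 3:
        # Python 2: strftime works on byte strings, so encode the format
        # and decode the result (assumes the locale output is UTF-8).
        return date.strftime(fmt.encode('utf-8')).decode('utf-8')
    # Python 3: strftime takes and returns text directly.
    return date.strftime(fmt)


stamp = strftime_text(datetime.date(2016, 5, 7), u'<%Y-%m-%d %a>')
print(stamp)
```

The day-name part of the output depends on the active locale, so only the numeric date portion is stable across systems.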

I'm neither a Python programmer nor familiar with the orgmode codebase, but I'd be glad to build a pull request if you can point me in the right direction.

I see two solutions:

Thanks!

Ron89 commented 8 years ago

Thanks for taking the time to debug this. Since many languages use non-ASCII letters, I believe it is better to use UTF-8 strings uniformly in our plugin. If strftime is known to cause issues with decoded (unicode) strings, I think we should wrap it as you did: feed it an encoded string, then decode the returned string.

If your solution already works, mind sending in a pull request?

Draiken commented 8 years ago

I'll create separate pull requests for a few fixes I did:

Ron89 commented 8 years ago

Pull request merged.