tell-k / vim-autopep8

autopep8 plugin for Vim
http://www.vim.org/scripts/script.php?script_id=4614
MIT License

"'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)" #24

Open EloiZ opened 8 years ago

EloiZ commented 8 years ago

Hi, when I run the vim-autopep8 plugin on a Python script that contains a non-ASCII character, my whole file gets replaced by a traceback. When I run autopep8 -i myfile.py directly, it works flawlessly. Any idea where the bug comes from? Thanks a lot.

Python script (note the non-ASCII character é that triggers the bug):

#!/usr/bin/env python2
# -*- coding: utf-8 -*-
#ééé

if __name__=="__main__":
    print ""

The file gets replaced by:

Traceback (most recent call last):
  File "/usr/bin/autopep8", line 9, in <module>
    load_entry_point('autopep8==0.9.1', 'console_scripts', 'autopep8')()
  File "/usr/lib/python2.7/dist-packages/autopep8.py", line 2309, in main
    options))
  File "/usr/lib/python2.7/dist-packages/autopep8.py", line 1849, in fix_string
    return fix_lines(sio.readlines(), options=options)
  File "/usr/lib/python2.7/dist-packages/autopep8.py", line 1854, in fix_lines
    tmp_source = ''.join(normalize_line_endings(source_lines))
  File "/usr/lib/python2.7/dist-packages/autopep8.py", line 1820, in normalize_line_endings
    newline = find_newline(lines)
  File "/usr/lib/python2.7/dist-packages/autopep8.py", line 955, in find_newline
    if s.endswith(CRLF):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)
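
The byte and position in that message can be reproduced without autopep8. The following is a minimal sketch in Python 3, assuming the file is saved as UTF-8 (as its coding line declares) and looking only at the `#ééé` line: `é` is stored as the two bytes 0xc3 0xa9, so an ASCII decoder fails at index 1, right after the `#`.

```python
# The comment line from the script above, encoded the way a UTF-8 file stores it.
line = "#ééé\n"
raw = line.encode("utf-8")

# Byte 0 is '#', byte 1 is the first byte of the two-byte sequence for 'é'.
print(raw[:3])

error = None
try:
    # Decoding with the ASCII codec is what the traceback above reports.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc
    print(exc)
```

The same bytes decode cleanly with `raw.decode("utf-8")`, so the failure comes from the codec chosen for decoding, not from the file contents.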

tell-k commented 7 years ago

@EloiZ Sorry for my late reply.

I couldn't reproduce the error in my environment. Please try the latest version of autopep8; the error may already have been fixed.

pseyfert commented 6 years ago

I ran into this issue after upgrading from 1.0.7 to 1.1.1. Investigating …

EDIT: file said 1.0.6, but turns out it corresponds to git tag 1.0.7

tell-k commented 6 years ago

@pseyfert Could you share code that reproduces the error, along with details of your environment?

pseyfert commented 6 years ago

So my understanding is, this is due to the following change from 1.0.7 to 1.1.0:

In 1.0.7 the buffer gets written to a temporary file on which autopep8 is run.

In 1.1.0 autopep8 is used in command-line mode; there, (my version of) autopep8 encounters encoding problems with non-ASCII characters.

So apparently autopep8 handles input encoding differently in command line mode and in file mode.
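
One plausible reason for such a difference, offered as an assumption and not as something confirmed in this thread, is that a tool reading a file has the raw bytes and can honour the `# -*- coding: ... -*-` line, whereas text arriving on stdin has to be decoded using whatever the environment suggests. The sketch below uses Python 3's standard `tokenize.detect_encoding` to show cookie detection on raw bytes; `encoding_from_cookie` and the second sample are hypothetical names and data made up for illustration.

```python
import io
import tokenize


def encoding_from_cookie(raw_source):
    """Return the encoding a PEP 263 coding line declares in raw source bytes."""
    encoding, _lines_read = tokenize.detect_encoding(io.BytesIO(raw_source).readline)
    return encoding


# Header of the script from the original report, as UTF-8 bytes.
reported_script = (
    b"#!/usr/bin/env python2\n"
    b"# -*- coding: utf-8 -*-\n"
    b"#\xc3\xa9\xc3\xa9\xc3\xa9\n"
)

# Hypothetical second sample with a different declaration, to show that the
# coding line is actually read and not just defaulted.
latin1_script = b"# -*- coding: latin-1 -*-\n#\xe9\n"

print(encoding_from_cookie(reported_script))
print(encoding_from_cookie(latin1_script))
```

When the same source is piped in as text, no such declaration is consulted before the bytes are decoded, so the result depends on the locale or on PYTHONIOENCODING.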

Reverting this in vim-autopep8 (at least the quick revert I did) will break the support for ranges, though I suspect the affected autopep8 versions anyhow don't support --range.

This is with autopep8 version 0.9.1 (python2). I tested with this file (We used 90 character line width, which autopep8 will wrap to 80 characters. I think the © symbol in the second line is enough to trigger the encoding failure.)

tell-k commented 6 years ago

@pseyfert

I was able to confirm the error. If you update autopep8 to v1.3.4, the error no longer occurs. v0.9.1 is too old; you should not use it.

pseyfert commented 6 years ago

Thanks for the quick feedback. I agree that upgrading autopep8 is reasonable; v0.9.1 is simply what Debian stable ships, and I was wondering why things still worked with vim-autopep8 v1.0.7.

Anyway, I tried with autopep8 v1.3.4 and the error somewhat persists.

I could actually reproduce it outside of vim with:

cat libfulltext/bin/get_fulltext.py | autopep8 -

(and the same with cat libfulltext/bin/get_fulltext.py | python3 $(which autopep8) - )

Do you have some global config that sets the python/autopep8 encoding to utf-8? (I'm wondering why it works for you but not for me.)

Looking at https://github.com/hhatto/autopep8/issues/148 I was successful with

cat libfulltext/bin/get_fulltext.py | PYTHONIOENCODING=utf-8 autopep8 -

Given the out-of-vim reproducer, this is probably an autopep8 issue, but I'm wondering if vim-autopep8 can/could/should call autopep8 like that with encoding.
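
To illustrate what calling autopep8 "like that" could look like from a wrapper, here is a minimal Python 3 sketch that pipes source through a child process with PYTHONIOENCODING set in the child's environment. It is not the plugin's actual code (the plugin is written in Vim script); `run_filter` and `format_with_autopep8` are hypothetical helper names. The only autopep8 usage assumed is the `autopep8 -` stdin form shown above.

```python
import os
import subprocess
import sys


def run_filter(command, source, io_encoding="utf-8"):
    """Pipe `source` through `command`, forcing the child's stdio encoding."""
    # Copy the current environment and add PYTHONIOENCODING for the child only.
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        command,
        input=source.encode(io_encoding),
        stdout=subprocess.PIPE,
        env=env,
        check=True,
    )
    return result.stdout.decode(io_encoding)


def format_with_autopep8(source):
    """Hypothetical wrapper: the same call as the shell pipeline above."""
    return run_filter(["autopep8", "-"], source)


# Stand-in child that echoes stdin to stdout, so the sketch runs even where
# autopep8 is not installed.
echo_command = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]
echoed = run_filter(echo_command, "#ééé\n")
print(echoed == "#ééé\n")
```

Setting the variable only in the child's environment leaves the caller's own settings untouched, which is the main reason to prefer it over exporting PYTHONIOENCODING globally.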

pseyfert commented 6 years ago

c28fc9aa6e33290b5e166cf44103c81b8a6ec2b3 fixes the behavior for me in a debian:experimental docker container (python2, autopep8 1.3.4). With autopep8 0.9.1 on stable, the issue persists (PYTHONIOENCODING apparently ignored), but I'm out of ideas for the old autopep8 version for now.

tell-k commented 6 years ago

@pseyfert

Do you have some global config that sets the python/autopep8 encoding to utf-8?

No, I don't have any config for autopep8. Perhaps your locale setting is not right? Please show me your locale information. I want to reproduce it.

$ locale
$ export | grep PYTHONIOENCODING

pseyfert commented 6 years ago

I'm using this Dockerfile at the moment:

FROM debian:experimental
RUN apt-get update && apt-get -y upgrade && apt install -y git vim-autopep8
RUN git clone https://github.com/andrenarchy/libfulltext.git
CMD echo "locale:"; locale; echo "locale -a:"; locale -a; echo "environment:"; export | grep PYTHON; echo "autopep8:"; cat libfulltext/bin/get_fulltext.py  | autopep8 -

and get as output:

locale:
LANG=
LANGUAGE=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=
locale -a:
C
C.UTF-8
POSIX
environment:
autopep8:
Traceback (most recent call last):
  File "/usr/bin/autopep8", line 11, in <module>
    load_entry_point('autopep8==1.3.4', 'console_scripts', 'autopep8')()
  File "/usr/lib/python2.7/dist-packages/autopep8.py", line 3905, in main
    fix_code(sys.stdin.read(), args, encoding=encoding))
  File "/usr/lib/python2.7/dist-packages/autopep8.py", line 3093, in fix_code
    source = source.decode(encoding or get_encoding())
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 35: ordinal not in range(128)

That is, PYTHONIOENCODING is not set and the locale is POSIX. It turns out that setting LC_ALL=C.UTF-8 also fixes the error.
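
The link between the POSIX locale and the `'ascii'` codec in the traceback can be sketched as follows. This assumes a glibc system, where the POSIX/C locale reports its character set as `ANSI_X3.4-1968`; Python treats that name as an alias for ASCII. `decode_stdin` is a hypothetical stand-in for the `source.decode(encoding or get_encoding())` call in the traceback, and the sample line is made up (only the © character comes from the thread).

```python
import codecs

# Codeset name reported for the POSIX/C locale on glibc (assumption about the
# platform) and the one a UTF-8 locale such as C.UTF-8 reports.
posix_codec = codecs.lookup("ANSI_X3.4-1968").name
utf8_codec = codecs.lookup("UTF-8").name


def decode_stdin(raw, locale_encoding):
    """Hypothetical stand-in for decoding stdin bytes with the locale's codec."""
    return raw.decode(locale_encoding)


# Made-up source line containing the copyright sign, stored as UTF-8 bytes.
raw = "# made-up header with a © sign\n".encode("utf-8")

results = {}
for name in (posix_codec, utf8_codec):
    try:
        decode_stdin(raw, name)
        results[name] = "ok"
    except UnicodeDecodeError as exc:
        # Report the offending byte, as the traceback does.
        results[name] = "fails on byte 0x%02x" % raw[exc.start]

print(results)
```

The © character is stored as the bytes 0xc2 0xa9, which matches the byte named in the traceback above; under a UTF-8 locale the same input decodes without error.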