UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcb in position 24: invalid continuation byte

nick-youngblut commented 4 years ago

The stderr from fqtools seems to be a problem:

ret = script_runner.run('fqtools', '-h')
assert ret.stdout == 'OK'

Generates:

.# Running console script: fqtools -h
# Script return code: 1
# Script stdout:

# Script stderr:
Traceback (most recent call last):
  File "/ebio/abt3_projects/software/dev/miniconda3_dev/envs/MGSIM/lib/python3.6/site-packages/py/_path/common.py", line 171, in read
    return f.read()
  File "/ebio/abt3_projects/software/dev/miniconda3_dev/envs/MGSIM/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcb in position 24: invalid continuation byte

Running fqtools -h outside of pytest works just fine on that fastq file.

I'm using pytest-console-scripts 0.2.0 and fqtools 2.0 on Ubuntu 18.04.4

It seems pretty easy to generate a UnicodeDecodeError: 'utf-8' error with script_runner.run(). For instance, just running bash commands such as script_runner.run('pwd') will cause the same error.

kvas-it commented 4 years ago

Hey! Thanks for reporting this.

script_runner.run() is not really intended for running things that are not Python scripts, but still this might indicate some more general problem, so I will take a look.

nick-youngblut commented 4 years ago

Thanks for the quick response! In this case, I'm running fqtools to validate fastq files generated by my python package. The error only seems to occur if I don't use --script-launch-mode=subprocess

kvas-it commented 4 years ago

You can just use subprocess.run() and related functions to do it. The benefit of pytest-console-scripts compared to subprocess is that it can run Python scripts inside of the same Python process (and that speeds the tests up quite a bit), but when you're running an external tool, that is not written in Python, that won't work anyway.

In any case, thanks for reporting. It seems like you might have hit some bug that also affects running Python scripts.
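For reference, the subprocess route suggested above could look roughly like the sketch below. The helper name `run_tool` is hypothetical, and `sys.executable` stands in for an external binary such as fqtools so that the example runs anywhere. Passing `errors="replace"` is an assumption about what is wanted here: it keeps output that is not valid UTF-8 from raising UnicodeDecodeError while it is being captured.

```python
import subprocess
import sys


def run_tool(*args, timeout=60):
    """Run an external command and capture its output as text.

    Hypothetical helper, not part of pytest-console-scripts.
    errors="replace" substitutes undecodable bytes instead of raising
    UnicodeDecodeError when the tool prints non-UTF-8 output.
    """
    return subprocess.run(
        list(args),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        encoding="utf-8",
        errors="replace",
        timeout=timeout,
        check=False,  # inspect returncode in the test instead of raising
    )


def test_external_tool_runs():
    # sys.executable is a stand-in for the real external tool.
    ret = run_tool(sys.executable, "-c", "print('OK')")
    assert ret.returncode == 0
    assert ret.stdout.strip() == "OK"
```

In a real test the first argument would be the tool's name plus its own options, and the assertions would check whatever that tool reports.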

nick-youngblut commented 4 years ago

Good point. I'll just use subprocess

nathanpainchaud commented 4 years ago

I can confirm that I get a similar error when running Python scripts using inprocess for script_launch_mode. I want to use inprocess because I've got a big Conda environment that I don't want to have to recreate each time I test. From my understanding (and what I managed to test despite the bug), this mode uses the Python interpreter that was active when pytest was launched.

I managed to reproduce the issue with a very simple test, that goes like this:

def test_pytest_console_scripts(script_runner) -> None:
    ret = script_runner.run("python", "tests/unit/blank.py")
    assert ret.success

where blank.py is just a blank file (in UTF-8 encoding).

Running the above test, I get the following error:

tests/unit/tooling_test.py::test_pytest_console_scripts[inprocess] FAILED [100%]# Running console script: python tests/unit/blank.py
# Script return code: 1
# Script stdout:

# Script stderr:
Traceback (most recent call last):
  File "$HOME/opt/miniconda3/envs/CAV/lib/python3.8/site-packages/py/_path/common.py", line 177, in read
    return f.read()
  File "$HOME/opt/miniconda3/envs/CAV/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbc in position 25: invalid start byte

I'm configuring pytest using a pyproject.toml file that looks like this:

[tool.pytest.ini_options]
required_plugins = ["pytest-console-scripts"]
testpaths = ["tests"]
# Used by pytest-console-scripts plugin
# Ensures that scripts are launched using the current interpreter
# where all the dependencies of the project should be installed
script_launch_mode = "inprocess"

Any ideas on how to work around this issue? Should I just manually call my scripts (although I don't know of an alternative that has something equivalent to inprocess) until this issue is solved? Thanks in advance for the help!
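As a possible stopgap for the question above, the standard library's runpy module can execute a Python source file inside the current interpreter. The sketch below is only an illustration of that idea: `run_script_in_process` is a hypothetical helper, it captures stdout only, and it is not how pytest-console-scripts implements its inprocess mode.

```python
import contextlib
import io
import runpy


def run_script_in_process(path):
    """Execute a Python source file in the current interpreter.

    Returns (exit_code, captured_stdout). Hypothetical helper, not part of
    pytest-console-scripts.
    """
    stdout = io.StringIO()
    exit_code = 0
    with contextlib.redirect_stdout(stdout):
        try:
            # run_name="__main__" makes `if __name__ == "__main__":` blocks run.
            runpy.run_path(path, run_name="__main__")
        except SystemExit as exc:
            # Scripts often finish via sys.exit(); map that to an exit code.
            if exc.code is None:
                exit_code = 0
            elif isinstance(exc.code, int):
                exit_code = exc.code
            else:
                exit_code = 1
    return exit_code, stdout.getvalue()
```

A test could then call `run_script_in_process("tests/unit/blank.py")` and assert that the returned exit code is 0. Because the script shares the interpreter with pytest, any state it changes (imported modules, environment variables, the working directory) leaks into the rest of the test session.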

nathanpainchaud commented 4 years ago

I managed to fix the error by changing the test from:

def test_pytest_console_scripts(script_runner) -> None:
    ret = script_runner.run("python", "tests/unit/blank.py")
    assert ret.success

to the following (basically calling the Python script directly instead of python the_script.py):

def test_pytest_console_scripts(script_runner) -> None:
    ret = script_runner.run("tests/unit/blank.py")
    assert ret.success

Still, I wonder if this is expected behavior, or whether calling python and then the script should work as well (although it would certainly not be the recommended way of doing things).

kvas-it commented 4 years ago

Hi @nathanpainchaud,

Thanks for reporting this and for the small reproducing example. Running python with pytest-console-scripts is not an intended usage. Calling the script file directly is also not intended (or at least I haven't used it this way :shrug:) but it does work, albeit only in inprocess mode (it might also work in subprocess mode if you make the script executable, but you probably don't need this anyway because it would be slow).

Originally the scope of pytest-console-scripts was limited to running console scripts declared with the console_scripts entry point in setup.py. However, it does seem that your usage (running a module that is not a declared console script) would be reasonably easy to implement and I imagine some people would like to have this possibility. I'm open to implementing this and might do it once I find some time to fix the test suite failures (unfortunately quite a bit of hackery was required to make it possible to quickly put together a virtualenv with pytest, pytest-console-scripts and a small test module in it -- this is required for the tests of pytest-console-scripts -- and it doesn't work with the latest setuptools changes).

I hope this clears it up a bit. Cheers, Vasily

kvas-it commented 4 years ago

After landing #35, non-executable Python scripts work in both inprocess and subprocess modes. Running binaries doesn't work in inprocess mode and might cause weird errors (because trying to read those binaries as Python source code rightfully crashes). I think all legitimate use cases that were breaking with regard to this ticket are now working, so I'm closing it.

Let me know if you disagree and think that there's something else to fix here and we can re-open or create another ticket.
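To illustrate why reading a binary as source text fails, the sketch below decodes hand-made byte strings as UTF-8. The bytes are invented for the demonstration and are not taken from fqtools or from a python executable; they only place the same offending byte values (0xcb and 0xbc) at the positions named in the two tracebacks in this thread.

```python
def try_decode(data):
    """Return the decoded text, or the UnicodeDecodeError that was raised."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc


# 0xcb announces a two-byte character, but 0x00 is not a continuation byte.
bad_continuation = try_decode(b"A" * 24 + b"\xcb\x00")

# 0xbc is only valid as a continuation byte, so it cannot start a character.
bad_start = try_decode(b"A" * 25 + b"\xbc")

# Plain ASCII is valid UTF-8 and decodes without trouble.
fine = try_decode(b"print('hello')\n")

print(bad_continuation)
print(bad_start)
print(repr(fine))
```

Executable files are full of such byte sequences, which would explain why the failure shows up within the first few dozen bytes in both reports.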

kvas-it / pytest-console-scripts

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcb in position 24: invalid continuation byte #33