conda / conda-lock

Lightweight lockfile for conda environments
https://conda.github.io/conda-lock/

UnicodeDecodeError exception in _invoke_conda #502

Open yonil7 opened 1 year ago

yonil7 commented 1 year ago

This line raises an exception when p.stderr is not UTF-8 or contains non-UTF-8 bytes.

Using conda-lock 2.2.0

Full error message:

INFO:root:
INFO:root:  Using cached gpustat-1.1.1.tar.gz (98 kB)
INFO:root:
INFO:root:  Installing build dependencies: started
INFO:root:
INFO:root:  Installing build dependencies: finished with status 'done'
INFO:root:
INFO:root:  Getting requirements to build wheel: started
INFO:root:
INFO:root:  Getting requirements to build wheel: finished with status 'error'
INFO:root:
INFO:root:
Traceback (most recent call last):
  File "C:\miniconda3\Scripts\conda-lock-script.py", line 9, in <module>
    sys.exit(main())
  File "C:\miniconda3\lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "C:\miniconda3\lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "C:\miniconda3\lib\site-packages\click\core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\miniconda3\lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\miniconda3\lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "C:\miniconda3\lib\site-packages\click\decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "C:\miniconda3\lib\site-packages\conda_lock\conda_lock.py", line 1443, in install
    install_func(file=lockfile)
  File "C:\miniconda3\lib\site-packages\conda_lock\conda_lock.py", line 231, in do_conda_install
    _conda(["run"], ["pip", "install", "--no-deps", "-r", str(requirements_path)])
  File "C:\miniconda3\lib\site-packages\conda_lock\invoke_conda.py", line 124, in _invoke_conda
    for line in p.stderr:
  File "C:\miniconda3\lib\codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd7 in position 47: invalid continuation byte

maresb commented 1 year ago

Thanks for the report. Could you please share more details about how to reproduce this? I'm wondering what could lead to non-unicode output.

yonil7 commented 1 year ago

The error seems to occur because the pip installation of gpustat-1.1.1 fails (Getting requirements to build wheel: finished with status 'error'). gpustat-1.1.1 is not listed in my env.yaml (the conda-lock input file), but if I add gpustat-1.1.1 to this file as an explicit dependency (from conda-forge), there is no error. In any case, I think it's safer not to assume the p.stderr stream is Unicode-encoded, and not to decode it without handling a possible exception.
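For illustration, the same class of failure can be reproduced without conda or pip. This is a minimal sketch: the byte string below is made up for the example and is not the actual pip output from this report.

```python
# Illustrative bytes only; NOT the real stderr content from the failing install.
# 0xd7 starts a two-byte UTF-8 sequence, but the next byte (a space) is not a
# valid continuation byte, so strict decoding fails.
raw = b"error: \xd7 oops"

failure = None
try:
    raw.decode("utf-8")  # strict error handling is the default
except UnicodeDecodeError as exc:
    failure = str(exc)

# With errors="replace", the undecodable byte becomes U+FFFD instead of raising.
lenient = raw.decode("utf-8", errors="replace")

print(failure)
print(lenient)
```

A pipe opened in text mode with UTF-8 and default (strict) error handling decodes the same way, which is consistent with the exception being raised while iterating over p.stderr.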

maresb commented 1 year ago

Could you please share your environment.yml? I still don't know how to reproduce this.

yonil7 commented 1 year ago

Unfortunately I can't. But as I said, it's safer in any case for the code not to assume the p.stderr stream is always Unicode.

maresb commented 1 year ago

Ok, then I'll go out on a limb and invent a fictional scenario in which pip is running some setup.py script which is outputting non-utf8 characters. I'm inclined to say that in 2023, this is a bug with your setup.py script, and not the responsibility of conda-lock. That said, in my imaginary scenario you're personally completely fed up with this garbage legacy setup.py script which probably does all sorts of other horrible things. However, you and your organization are stuck with it for the foreseeable future.

So what would you like to have happen here? Perhaps the easiest solution would be to set errors="replace" in subprocess.Popen?
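A rough sketch of what that could look like, assuming the pipes are opened in text mode with UTF-8. This is not the actual conda-lock code; the child command is a hypothetical stand-in that just writes an invalid byte to stderr.

```python
import subprocess
import sys

# Hypothetical child process: writes a byte sequence that is not valid UTF-8
# to stderr. It stands in for a misbehaving build script.
child_code = r"import sys; sys.stderr.buffer.write(b'bad byte: \xd7 here\n')"

with subprocess.Popen(
    [sys.executable, "-c", child_code],
    stderr=subprocess.PIPE,
    encoding="utf-8",
    errors="replace",  # undecodable bytes become U+FFFD instead of raising
) as p:
    # Iterating no longer raises UnicodeDecodeError on the bad byte.
    stderr_lines = [line.rstrip("\n") for line in p.stderr]
    p.wait()

print(stderr_lines)
```

Both `encoding` and `errors` are documented `subprocess.Popen` parameters; they are passed on to the text wrapper around the pipe, so the change would be limited to the call site.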

yonil7 commented 1 year ago

I agree this is probably a bug in gpustat's setup.py (not sure), and I agree that no package's setup.py should output non-UTF-8 characters to its stderr/stdout streams. Replacing these unexpected non-UTF-8 characters as you suggested (using errors="replace") seems like a great solution for making conda-lock more bulletproof.

maresb commented 1 year ago

Great! Would you be able to submit a PR for this?

Rather than silently continuing when a replacement occurs, it would be nice to detect it (perhaps by simply checking for the replacement character?) and emit a warning.
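One possible shape for that check, as a sketch only. The function name, logger name, and warning text are hypothetical and not part of conda-lock.

```python
import logging

REPLACEMENT_CHAR = "\ufffd"  # what errors="replace" substitutes for bad bytes
logger = logging.getLogger("decode_example")  # hypothetical logger name


def log_stderr_line(line: str) -> bool:
    """Log one decoded stderr line and warn if it contains U+FFFD.

    Returns True if a warning was emitted, False otherwise.
    """
    logger.info(line.rstrip("\n"))
    if REPLACEMENT_CHAR in line:
        logger.warning(
            "stderr contained bytes that could not be decoded as UTF-8; "
            "they were replaced with U+FFFD"
        )
        return True
    return False


warned = [log_stderr_line(s) for s in ("all good\n", "bad byte: \ufffd here\n")]
print(warned)
```

One caveat with this approach: a process that legitimately prints U+FFFD would also trigger the warning, so it is a heuristic rather than an exact test.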