conda / conda-package-streaming

An efficient library to read from new and old format .conda and .tar.bz2 conda packages.
https://conda.github.io/conda-package-streaming/

Extracting packages may fail with non-ASCII characters in file name #93

Closed marcoesters closed 2 months ago

marcoesters commented 3 months ago

What happened?

Initially reported here: https://github.com/ContinuumIO/anaconda-issues/issues/13407

On Unix systems, if no environment variables are set to determine the encoding, conda-package-streaming may fail to extract packages whose file names contain non-ASCII characters. See the linked issue for details.

Additional Context

This came up during the installation of the sphinx package with the Anaconda 2024.06 installer on Ubuntu 24.04.

sphinx-7.3.7-py312h5eee18b_0.conda, and possibly other versions, contains a test file with an umlaut in its name. If the default encoding is ASCII, extracting the package fails.

Instead of relying on the default encoding, we should pass encoding="utf-8" to TarFile and expose the encoding parameter in our API.
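
To illustrate the effect of the encoding argument, here is a minimal sketch using only the standard library tarfile module. It is not the conda-package-streaming code. The helper names (build_archive, member_names), the member path, and the choice of GNU tar format are assumptions made for the demonstration. It writes a small in-memory archive with an umlaut in a member name, then reads the names back with two different encodings.

```python
import io
import tarfile


def build_archive(name: str) -> bytes:
    """Build an in-memory tar archive holding one small file called `name`.

    GNU format is an assumption for this demo: it stores the member name as
    raw bytes in the header, so the reader's `encoding` decides how the name
    is decoded.
    """
    buf = io.BytesIO()
    with tarfile.open(
        fileobj=buf, mode="w", format=tarfile.GNU_FORMAT, encoding="utf-8"
    ) as tf:
        data = b"hello"
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def member_names(archive: bytes, encoding: str) -> list:
    """Return the member names as decoded with the given encoding."""
    with tarfile.open(
        fileobj=io.BytesIO(archive), mode="r", encoding=encoding
    ) as tf:
        return tf.getnames()


# Hypothetical member path containing an umlaut.
archive = build_archive("tests/umlaut_ä.txt")

# Explicit UTF-8 recovers the original name.
utf8_names = member_names(archive, "utf-8")

# With ASCII, the non-ASCII bytes cannot be decoded normally; tarfile's
# default error handler ("surrogateescape") maps them to lone surrogates,
# so the resulting name differs from the original.
ascii_names = member_names(archive, "ascii")
```

Passing the encoding explicitly makes the decoded names independent of the process locale, which is the behavior proposed above.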