conda / conda-build

Commands and tools for building conda packages
https://docs.conda.io/projects/conda-build/

Prefix replacement with ripgrep might fail on long paths on Windows #4357

Open jaimergp opened 2 years ago

jaimergp commented 2 years ago

Comes from https://github.com/conda-forge/staged-recipes/issues/17519#issue-1101991932

Actual Behavior

Prefix replacement fails on Windows when ripgrep is used and the paths are too long. The exception is not caught as it should be, because the subprocess is never launched:

INFO:conda_build.build:Packaging openfisca-france
Packaging openfisca-france
INFO conda_build.build:build(2289): Packaging openfisca-france
INFO:conda_build.build:Packaging openfisca-france-102.0.0-pyh6c4a22f_0
Packaging openfisca-france-102.0.0-pyh6c4a22f_0
INFO conda_build.build:bundle_conda(1529): Packaging openfisca-france-102.0.0-pyh6c4a22f_0
number of files: 2713
Fixing permissions
Packaged license file/s.
Traceback (most recent call last):
  File "D:\a\1\s\.ci_support\build_all.py", line 198, in <module>
    build_all(os.path.join(root_dir, "recipes"), args.arch)
  File "D:\a\1\s\.ci_support\build_all.py", line 101, in build_all
    build_folders(recipes_dir, folders, arch, channel_urls)
  File "D:\a\1\s\.ci_support\build_all.py", line 157, in build_folders
    conda_build.api.build([recipe], config=get_config(arch, channel_urls))
  File "C:\Miniconda\lib\site-packages\conda_build\api.py", line 186, in build
    return build_tree(
  File "C:\Miniconda\lib\site-packages\conda_build\build.py", line 3083, in build_tree
    packages_from_this = build(metadata, stats,
  File "C:\Miniconda\lib\site-packages\conda_build\build.py", line 2366, in build
    newly_built_packages = bundlers[pkg_type](output_d, m, env, stats)
  File "C:\Miniconda\lib\site-packages\conda_build\build.py", line 1672, in bundle_conda
    output['checksums'] = create_info_files(metadata, replacements, files, prefix=metadata.config.host_prefix)
  File "C:\Miniconda\lib\site-packages\conda_build\build.py", line 1285, in create_info_files
    files_with_prefix = get_files_with_prefix(m, replacements, files, prefix)
  File "C:\Miniconda\lib\site-packages\conda_build\build.py", line 949, in get_files_with_prefix
    pfx_matches = have_regex_files([f[2] for f in files_with_prefix], prefix=prefix,
  File "C:\Miniconda\lib\site-packages\conda_build\build.py", line 526, in have_regex_files
    match_records_rg = regex_files_rg(files, prefix, tag,
  File "C:\Miniconda\lib\site-packages\conda_build\build.py", line 324, in regex_files_rg
    raise e
  File "C:\Miniconda\lib\site-packages\conda_build\build.py", line 317, in regex_files_rg
    matches = subprocess.check_output(args, shell=False).rstrip(b'\n').split(b'\n')
  File "C:\Miniconda\lib\subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "C:\Miniconda\lib\subprocess.py", line 505, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\Miniconda\lib\subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Miniconda\lib\subprocess.py", line 1420, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 206] The filename or extension is too long
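
The traceback above ends in a FileNotFoundError raised by CreateProcess, not in a subprocess.CalledProcessError, which is consistent with the subprocess never having started. The following is a minimal sketch of that distinction, not conda-build's actual code; `run_tool` and the executable name are hypothetical and exist only for illustration.

```python
import subprocess
import sys


def run_tool(args):
    """Run a command and classify the outcome (hypothetical helper)."""
    try:
        out = subprocess.check_output(args, shell=False)
        return ("ok", out)
    except subprocess.CalledProcessError as exc:
        # The process started and exited with a non-zero status.
        return ("nonzero-exit", exc.returncode)
    except OSError as exc:
        # The process never started; FileNotFoundError is a subclass of OSError.
        return ("not-launched", type(exc).__name__)


# A child that starts and then exits with status 1.
print(run_tool([sys.executable, "-c", "import sys; sys.exit(1)"]))
# An executable that does not exist, so no process is ever created.
print(run_tool(["this-executable-should-not-exist-4357"]))
```

A handler that only expects CalledProcessError would therefore not see the launch failure; catching OSError as well is one way to route it to a fallback.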

Expected Behavior

Long paths should be handled correctly or, if that is impossible, the code should fall back safely to the Python implementation. Also, right now, this performance trick is all-or-nothing: there is no logic to tolerate errors on individual files in the ripgrep approach.
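
As a rough illustration of the fallback described above, here is a sketch under stated assumptions rather than a proposed patch. `files_matching`, `python_search` and the injected `fast_search` callable are all hypothetical names; `fast_search` stands in for whatever invokes rg.

```python
import re


def python_search(path, pattern):
    """Pure-Python fallback: True if the bytes pattern occurs in the file."""
    with open(path, "rb") as fh:
        return re.search(pattern, fh.read()) is not None


def files_matching(paths, pattern, fast_search):
    """Return the paths whose contents match `pattern`.

    `fast_search(paths, pattern)` is assumed to wrap an external tool such
    as rg and to return the subset of matching paths. If the tool cannot be
    launched (OSError), each file is checked in Python instead.
    """
    try:
        return list(fast_search(paths, pattern))
    except OSError:
        # The fast path failed as a whole; degrade to the slow path
        # instead of aborting the build.
        return [p for p in paths if python_search(p, pattern)]
```

A finer-grained variant could split `paths` into batches and fall back only for the batch that fails, which would keep most of the speed-up.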

Uninstalling ripgrep or mangling the rg path disables this code path and makes everything work, at lower performance.

Steps to Reproduce

Use this recipe.

Full CI logs: full_log.txt

Output of conda info
     active environment : base
    active env location : C:\Miniconda
            shell level : 1
       user config file : C:\Users\VssAdministrator\.condarc
 populated config files : C:\Miniconda\.condarc
                          C:\Users\VssAdministrator\.condarc
          conda version : 4.11.0
    conda-build version : 3.21.7
         python version : 3.9.9.final.0
       virtual packages : __win=0=0
                          __archspec=1=x86_64
       base environment : C:\Miniconda  (writable)
      conda av data dir : C:\Miniconda\etc\conda
  conda av metadata url : None
           channel URLs : https://conda.anaconda.org/conda-forge/win-64
                          https://conda.anaconda.org/conda-forge/noarch
          package cache : C:\Miniconda\pkgs
                          C:\Users\VssAdministrator\.conda\pkgs
                          C:\Users\VssAdministrator\AppData\Local\conda\conda\pkgs
       envs directories : C:\Miniconda\envs
                          C:\Users\VssAdministrator\.conda\envs
                          C:\Users\VssAdministrator\AppData\Local\conda\conda\envs
               platform : win-64
             user-agent : conda/4.11.0 requests/2.27.1 CPython/3.9.9 Windows/10 Windows/10.0.17763
          administrator : True
             netrc file : None
           offline mode : False

==> C:\Miniconda\.condarc <==
aggressive_update_packages:
  - ca-certificates
  - certifi
channels:
  - conda-forge
show_channel_urls: True

==> C:\Users\VssAdministrator\.condarc <==
add_pip_as_python_dependency: False
auto_update_conda: False
channel_priority: strict
channels:
  - conda-forge
show_channel_urls: True

==> envvars <==
bld_path: C:\\bld\\

jakirkham commented 2 years ago

There's also a more general issue with Windows path lengths. Please see issue ( https://github.com/conda/conda/issues/7203 ).

bollwyvl commented 11 months ago

This is still an issue today, and can be seen on this PR, which fails to ripgrep (at least) this chestnut:

C:/bld/verapdf_1697427442234/_h_env/vpdf/profiles/veraPDF-validation-profiles-integration/PDF_A/2u/6.2 Graphics/6.2.11 Fonts/6.2.11.7 Unicode character maps/6.2.11.7.2 Level A and Level U conformance/verapdf-profile-6-2-11-7-2-t02.xml

With this error, now exposed more fully with conda-forge's more-verbose logging setup:

Resource usage statistics from building verapdf:
   Process count: 6
   CPU time: Sys=0:00:21.8, User=0:01:30.0
   Memory: 539.1M
   Disk usage: 70.4M
   Time elapsed: 0:03:39.0

INFO:conda_build.build:Packaging verapdf
Packaging verapdf
INFO:conda_build.build:Packaging verapdf-1.25.73-h57928b3_0
Packaging verapdf-1.25.73-h57928b3_0
number of files: 538
Fixing permissions
Packaged license file/s.
Traceback (most recent call last):
  File "C:\Miniforge\Scripts\conda-mambabuild-script.py", line 9, in <module>
    sys.exit(main())
  File "C:\Miniforge\lib\site-packages\boa\cli\mambabuild.py", line 256, in main
    call_conda_build(action, config)
  File "C:\Miniforge\lib\site-packages\boa\cli\mambabuild.py", line 228, in call_conda_build
    result = api.build(
  File "C:\Miniforge\lib\site-packages\conda_build\api.py", line 253, in build
    return build_tree(
  File "C:\Miniforge\lib\site-packages\conda_build\build.py", line 3799, in build_tree
    packages_from_this = build(
  File "C:\Miniforge\lib\site-packages\conda_build\build.py", line 2875, in build
    newly_built_packages = bundlers[pkg_type](output_d, m, env, stats)
  File "C:\Miniforge\lib\site-packages\conda_build\build.py", line 2017, in bundle_conda
    output["checksums"] = create_info_files(
  File "C:\Miniforge\lib\site-packages\conda_build\build.py", line 1553, in create_info_files
    files_with_prefix = get_files_with_prefix(m, replacements, files, prefix)
  File "C:\Miniforge\lib\site-packages\conda_build\build.py", line 1171, in get_files_with_prefix
    pfx_matches = have_regex_files(
  File "C:\Miniforge\lib\site-packages\conda_build\build.py", line 653, in have_regex_files
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Miniforge\lib\subprocess.py", line 1456, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 206] The filename or extension is too long

Presumably, when running on Windows, regex_files_rg could prepend the extended-length prefix \\?\ to any paths, as is done in the test tooling of a number of other libraries, such as pytest:

import sys
from pathlib import Path


# Quoted from pytest. get_extended_length_path_str is a companion pytest
# helper that is not reproduced here.
def ensure_extended_length_path(path: Path) -> Path:
    """Get the extended-length version of a path (Windows).

    On Windows, by default, the maximum length of a path (MAX_PATH) is 260
    characters, and operations on paths longer than that fail. But it is possible
    to overcome this by converting the path to "extended-length" form before
    performing the operation:
    https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file#maximum-path-length-limitation

    On Windows, this function returns the extended-length absolute version of path.
    On other platforms it returns path unchanged.
    """
    if sys.platform.startswith("win32"):
        path = path.resolve()
        path = Path(get_extended_length_path_str(str(path)))
    return path
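
The string-level conversion that the quoted function delegates to is not shown. A sketch of what such a helper might look like follows; `to_extended_length` and `extend_paths` are hypothetical names, not pytest's or conda-build's API, and the input is assumed to be an absolute path that already uses backslashes.

```python
import sys

# Extended-length prefixes, written as escaped Python string literals.
LONG_PREFIX = "\\\\?\\"
UNC_LONG_PREFIX = "\\\\?\\UNC\\"


def to_extended_length(path: str) -> str:
    """Return `path` in Windows extended-length form (pure string operation)."""
    if path.startswith(LONG_PREFIX):
        # Already extended-length; this also covers the UNC variant.
        return path
    if path.startswith("\\\\"):
        # UNC path: drop the two leading backslashes and add the UNC prefix.
        return UNC_LONG_PREFIX + path[2:]
    return LONG_PREFIX + path


def extend_paths(paths, platform=sys.platform):
    """Apply the prefix to every path on Windows; return them unchanged elsewhere."""
    if not platform.startswith("win"):
        return list(paths)
    return [to_extended_length(p) for p in paths]
```

Note that Windows does not normalize forward slashes under this prefix, so a path written with `/` separators, like the one quoted earlier, would need converting to backslashes first. Whether this alone would avoid WinError 206 here is untested.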

As there are many different ways in which subprocess.(run|call|check_output|Popen) is invoked, this could require quite a lot of changes to handle uniformly, and would still break in a number of downstream cases where users do not or cannot apply the registry hack, or forget to use this rather arcane construction.

jakirkham commented 1 month ago

So ripgrep did add a fix ( https://github.com/BurntSushi/ripgrep/commit/db6bb21a629d5b1ec1bfe89c693b280497c9eedc ) as part of the 14.0.0 release last year.

It might be worth checking whether those who still see this are using a recent version of ripgrep. If so, we may need a new issue upstream.
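
One way to check this in a failing build is to compare the reported version against 14.0.0. This is a sketch that assumes the output of `rg --version` begins with `ripgrep X.Y.Z`; the function names are hypothetical.

```python
import re


def parse_rg_version(version_output: str):
    """Extract (major, minor, patch) from the start of `rg --version` output."""
    match = re.match(r"ripgrep (\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_at_least_14(version_output: str) -> bool:
    """True if the reported ripgrep version is 14.0.0 or newer."""
    version = parse_rg_version(version_output)
    return version is not None and version >= (14, 0, 0)
```

The input string could come from `subprocess.run(["rg", "--version"], capture_output=True, text=True).stdout` in the build environment.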