thoth-station / integration-tests

Integration tests for the Thoth project to make sure deployment works as expected
GNU General Public License v3.0

No torchvision successfully solved #259

Closed fridex closed 2 years ago

fridex commented 2 years ago

Describe the bug

It looks like no torchvision release from https://download.pytorch.org/whl/cpu was solved successfully:

2022-02-22 14:56:37,895  22 INFO     thoth.adviser.prescription.v1.unit:949: thoth.TorchvisionCPUIndex: Using torchvision releases from Torch CPU Python Package index
2022-02-22 14:56:37,903  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.11.2+cpu', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:37,931  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.11.2', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:37,939  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.11.1+cpu', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:37,948  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.11.1', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:37,959  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.11.0+cpu', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:37,969  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.11.0', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:37,980  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.10.1+cpu', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:37,992  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.10.1', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,004  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.10.0+cpu', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,028  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.10.0', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,040  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.9.1+cpu', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,051  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.9.1', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,062  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.9.0+cpu', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,072  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.9.0', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,081  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.8.2+cpu', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,093  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.8.2', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,102  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.8.1+cpu', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,111  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.8.1', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,121  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.8.0', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,132  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.7.0+cpu', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,141  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.7.0', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,153  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.6.1+cpu', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,163  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.6.1', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,174  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.6.0+cpu', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,184  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.6.0', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,196  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.5.0+cpu', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,204  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.5.0', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,213  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.4.2+cpu', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,222  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.4.2', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,230  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.4.1.post2', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,237  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.4.1+cpu', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,245  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.4.1', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,253  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.4.0+cpu', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,261  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.4.0', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,272  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.3.0', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error
2022-02-22 14:56:38,279  22 WARNING  thoth.adviser.sieves.solved:127: Removing package ('torchvision', '0.1.6', 'https://download.pytorch.org/whl/cpu') due to installation time error in the software environment - see https://thoth-station.ninja/j/install_error

To Reproduce

Steps to reproduce the behavior:

  1. See integration-tests for https://github.com/thoth-station/ps-cv

Expected behavior

Integration tests should succeed.

fridex commented 2 years ago

Using UBI 8 Python 3.8:

https://github.com/thoth-station/ps-cv/blob/2925d451b4ee17b4126e0fd61e5ce252303ce291/.thoth.yaml#L14-L19

fridex commented 2 years ago

Tried to debug the root cause:

podman run -it --rm --entrypoint bash quay.io/thoth-station/solver-rhel-8-py38:v1.11.0
 thoth-solver --verbose python -r "torchvision" -i https://download.pytorch.org/whl/cpu --no-transitive -d https://pulp.operate-first.cloud/pypi/gym-donkeycar/simple,https://pypi.org/simple,https://download.pytorch.org/whl/cpu

Ends with the following error:

Traceback (most recent call last):
  File "/opt/app-root/bin/thoth-solver", line 8, in <module>
    sys.exit(cli())
  File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/opt/app-root/lib64/python3.8/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/solver/cli.py", line 161, in python
    result = resolve_python(
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/solver/python/python.py", line 447, in resolve
    solver_result = _do_resolve_index(
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/solver/python/python.py", line 342, in _do_resolve_index
    _fill_hashes(source, package_name, package_version, extracted_metadata)
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/solver/python/python.py", line 224, in _fill_hashes
    package_hashes = source.get_package_hashes(package_name, package_version)
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/python/source.py", line 428, in get_package_hashes
    artifacts = self.get_package_artifacts(package_name, package_version)
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/python/source.py", line 356, in get_package_artifacts
    to_return.append(Artifact(artifact_name, artifact_url, verify_ssl=self.verify_ssl))
  File "<attrs generated init thoth.python.artifact.Artifact>", line 8, in __init__
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/python/artifact.py", line 48, in __attrs_post_init__
    self.sha = self._calculate_sha()
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/python/artifact.py", line 96, in _calculate_sha
    self._download_if_necessary()
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/python/artifact.py", line 52, in _download_if_necessary
    self._download_artifact()
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/python/artifact.py", line 62, in _download_artifact
    response.raise_for_status()
  File "/opt/app-root/lib64/python3.8/site-packages/requests/models.py", line 960, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://download.pytorch.org/whl/cpu/torchvision/torchvision-0.6.1%2Bcpu-cp35-cp35m-linux_x86_64.whl

fridex commented 2 years ago

It looks like the torchvision index does not provide links following the package name in the path:

https://download.pytorch.org/whl/cpu/torchvision-0.6.1%2Bcpu-cp35-cp35m-linux_x86_64.whl -- works
https://download.pytorch.org/whl/cpu/torchvision/torchvision-0.6.1%2Bcpu-cp35-cp35m-linux_x86_64.whl -- does not work

Note that the working URL has no torchvision/ directory component in its path.

fridex commented 2 years ago

PEP 503 does not explicitly say where artifacts must be located, but it contains the following sentence:

URLs may be either absolute or relative as long as they point to the correct location.

Since the actual artifact location is not specified, constructing the artifact URL from the index URL, package name, and artifact name is not correct. We will need to fix this on our side.
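A minimal sketch of the alternative, assuming the index serves a PEP 503 style HTML page per package: take each artifact URL from the anchor's href and resolve it against the page URL instead of building it from the artifact name. The class and function names and the sample page content are hypothetical and only mirror the layout seen above; this is not the code from thoth-python.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _AnchorCollector(HTMLParser):
    """Collect (text, href) pairs for every <a> element on a simple index page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def artifact_urls(page_url, page_html):
    """Map artifact file names to absolute URLs taken from the page's hrefs."""
    parser = _AnchorCollector()
    parser.feed(page_html)
    parser.close()
    # urljoin resolves relative hrefs against the page URL and keeps absolute ones.
    return {name: urljoin(page_url, href) for name, href in parser.links}


# Hypothetical page content, shaped like the layout observed in this issue.
PAGE_URL = "https://download.pytorch.org/whl/cpu/torchvision/"
PAGE_HTML = (
    '<a href="/whl/cpu/torchvision-0.6.1%2Bcpu-cp35-cp35m-linux_x86_64.whl">'
    "torchvision-0.6.1+cpu-cp35-cp35m-linux_x86_64.whl</a>"
)

urls = artifact_urls(PAGE_URL, PAGE_HTML)
print(urls["torchvision-0.6.1+cpu-cp35-cp35m-linux_x86_64.whl"])
```

With this approach the index decides where artifacts live, so a flat layout like the one above and a per-package directory layout are both handled without special cases.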

fridex commented 2 years ago

The fix in https://github.com/thoth-station/python/pull/457 now correctly checks the torch index and parses the packages present there.

It looks like there is another issue. Checking the refresh job:

2022-02-25 08:00:09,929   1 INFO     thoth.graph_refresh_job:211: Published message for solver 'solver-rhel-8-py38' for package 'torchvision' in version '0.11.3+cpu' from index 'https://download.pytorch.org/whl/cpu'

It correctly schedules solving 'torchvision' in version '0.11.3+cpu' from index 'https://download.pytorch.org/whl/cpu', but the scheduled solver run fails:

{
          "command": "/opt/app-root/src/solver-venv/bin/python3 -m pip install --force-reinstall --no-cache-dir --no-deps torchvision===0.11.3 --index-url \"https://download.pytorch.org/whl/cpu\"  --trusted-host download.pytorch.org",
          "message": "Command exited with non-zero status code (1): ERROR: Could not find a version that satisfies the requirement torchvision===0.11.3 (from versions: 0.1.6, 0.2.0, 0.5.0+cpu, 0.6.0+cpu, 0.6.1+cpu, 0.7.0+cpu, 0.8.0, 0.8.1+cpu, 0.8.2+cpu, 0.9.0+cpu, 0.9.1+cpu, 0.10.0+cpu, 0.10.1+cpu, 0.11.0+cpu, 0.11.1+cpu, 0.11.2+cpu, 0.11.3+cpu)\nERROR: No matching distribution found for torchvision===0.11.3\n",
          "return_code": 1,
          "stderr": "ERROR: Could not find a version that satisfies the requirement torchvision===0.11.3 (from versions: 0.1.6, 0.2.0, 0.5.0+cpu, 0.6.0+cpu, 0.6.1+cpu, 0.7.0+cpu, 0.8.0, 0.8.1+cpu, 0.8.2+cpu, 0.9.0+cpu, 0.9.1+cpu, 0.10.0+cpu, 0.10.1+cpu, 0.11.0+cpu, 0.11.1+cpu, 0.11.2+cpu, 0.11.3+cpu)\nERROR: No matching distribution found for torchvision===0.11.3\n",
          "stdout": "Looking in indexes: https://download.pytorch.org/whl/cpu\n",
          "timeout": 60
        }

Note the missing +cpu suffix, which is somehow lost along the way. I'm not sure I tracked the right solver run, but this is worth debugging.

EDIT: solver document solver-rhel-8-py38-220225000014-42610d0611204b8d
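To illustrate why dropping the +cpu suffix makes the install fail: under PEP 440 the `===` operator is a plain string comparison, so `0.11.3` never matches `0.11.3+cpu`. The sketch below uses only the standard library; the helper names are hypothetical and it does not reproduce thoth-solver's actual logic.

```python
def split_local_version(version):
    """Split a version string into (public, local); local is None if absent."""
    public, sep, local = version.partition("+")
    return public, (local if sep else None)


def arbitrary_equality_matches(candidate, requested):
    """Mimic '===' matching as a plain string comparison."""
    return candidate == requested


def pin_requirement(name, version):
    """Build an exact pin that keeps the local version label intact."""
    return f"{name}==={version}"


# A few of the versions pip reported as available on the CPU index.
available = ["0.11.1+cpu", "0.11.2+cpu", "0.11.3+cpu"]

# The pin without the local label matches nothing.
print([v for v in available if arbitrary_equality_matches(v, "0.11.3")])

# The pin with the full version string matches the intended release.
print([v for v in available if arbitrary_equality_matches(v, "0.11.3+cpu")])

print(split_local_version("0.11.3+cpu"))
print(pin_requirement("torchvision", "0.11.3+cpu"))
```

Passing the full version string through unchanged, including any local label, avoids the mismatch regardless of where the suffix is currently being stripped.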

fridex commented 2 years ago

Still spotted these issues:

2022-02-28 12:07:57,061  19 INFO     thoth.common:366: Logging to rsyslog endpoint is turned off
2022-02-28 12:07:57,156  19 INFO     thoth.solver:66: Thoth Dependency Solver v1.11.1
2022-02-28 12:08:01,164  19 INFO     thoth.solver.python.python:263: Resolving package 'torchaudio' with version specifier '===0.8.0' from 'https://download.pytorch.org/whl/cpu'
2022-02-28 12:08:01,560  19 INFO     thoth.solver.python.python:288: Adding package 'torchaudio' in version '0.8.0' for solving
2022-02-28 12:08:01,561  19 INFO     thoth.solver.python.python:295: Using index 'https://download.pytorch.org/whl/cpu' to discover package 'torchaudio' in version '0.8.0'
2022-02-28 12:08:27,259  19 CRITICAL root:105: Traceback (most recent call last):
  File "thoth/solver/cli.py", line 184, in <module>
    cli()
  File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/opt/app-root/lib64/python3.8/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "thoth/solver/cli.py", line 161, in python
    result = resolve_python(
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/solver/python/python.py", line 450, in resolve
    solver_result = _do_resolve_index(
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/solver/python/python.py", line 345, in _do_resolve_index
    _fill_hashes(source, package_name, package_version, extracted_metadata)
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/solver/python/python.py", line 224, in _fill_hashes
    package_hashes = source.get_package_hashes(package_name, package_version)
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/python/source.py", line 431, in get_package_hashes
    artifacts = self.get_package_artifacts(package_name, package_version)
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/python/source.py", line 359, in get_package_artifacts
    to_return.append(Artifact(artifact_name, artifact_url, verify_ssl=self.verify_ssl))
  File "<attrs generated init thoth.python.artifact.Artifact>", line 8, in __init__
    self.__attrs_post_init__()
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/python/artifact.py", line 48, in __attrs_post_init__
    self.sha = self._calculate_sha()
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/python/artifact.py", line 96, in _calculate_sha
    self._download_if_necessary()
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/python/artifact.py", line 52, in _download_if_necessary
    self._download_artifact()
  File "/opt/app-root/lib64/python3.8/site-packages/thoth/python/artifact.py", line 62, in _download_artifact
    response.raise_for_status()
  File "/opt/app-root/lib64/python3.8/site-packages/requests/models.py", line 960, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://download.pytorch.org/whl/cpu/torchaudio/torchaudio-0.8.0-cp36-cp36m-linux_x86_64.whl

It looks like we need to release a new thoth-solver that includes https://github.com/thoth-station/python/pull/457

fridex commented 2 years ago

The linked fix addressed the issue. Let's keep this open until the next integration-tests report which should confirm the releases hosted on the index are available (I tested manually, it looks like it works as expected).

fridex commented 2 years ago

/lifecycle active
/assign @fridex
/priority critical-urgent
/sig stack-guidance

codificat commented 2 years ago

/triage accepted

fridex commented 2 years ago

I no longer see this issue in the integration-tests report. I also manually verified that the resolver resolves application dependencies when torchvision is requested; see adviser-220307173904-ea0270530025dec1 in the stage environment (it resolves correctly across indexes).

/close

sesheta commented 2 years ago

@fridex: Closing this issue.
