chatnoir-eu / web-content-extraction-benchmark

Web Content Extraction Benchmark
Apache License 2.0

Fail at running the project #3

Closed LeDilam closed 4 months ago

LeDilam commented 4 months ago

I tried running the project with both installation methods, and it always fails, most probably because of dependency or version issues. I also had dependency issues with Poetry itself (outside of the project), which I managed to solve. When I run: poetry install && poetry shell I get:

  RuntimeError

  The lock file is not compatible with the current version of Poetry.
  Upgrade Poetry to be able to read the lock file or, alternatively, regenerate the lock file with the `poetry lock` command.

  at /usr/lib/python3/dist-packages/poetry/packages/locker.py:481 in _get_lock_data
      477│                 "Upgrade Poetry to ensure the lock file is read properly or, alternatively, "
      478│                 "regenerate the lock file with the `poetry lock` command."
      479│             )
      480│         elif not lock_version_allowed:
    → 481│             raise RuntimeError(
      482│                 "The lock file is not compatible with the current version of Poetry.\n"
      483│                 "Upgrade Poetry to be able to read the lock file or, alternatively, "
      484│                 "regenerate the lock file with the `poetry lock` command."
      485│             )

When I try:

python3 -m venv venv
source venv/bin/activate
pip install .

For the last command I get:

Processing /home/dim/Documents/Serieux/Stages/M2/Stage/Outils/Database/web-content-extraction-benchmark
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Preparing metadata (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [14 lines of output]
      Traceback (most recent call last):
        File "/home/dim/Documents/Serieux/Stages/M2/Stage/Outils/Database/web-content-extraction-benchmark/venv/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 363, in <module>
          main()
        File "/home/dim/Documents/Serieux/Stages/M2/Stage/Outils/Database/web-content-extraction-benchmark/venv/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 345, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/home/dim/Documents/Serieux/Stages/M2/Stage/Outils/Database/web-content-extraction-benchmark/venv/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 164, in prepare_metadata_for_build_wheel
          return hook(metadata_directory, config_settings)
        File "/tmp/pip-build-env-ctqbwd3c/overlay/lib/python3.10/site-packages/poetry/core/masonry/api.py", line 42, in prepare_metadata_for_build_wheel
          poetry = Factory().create_poetry(Path(".").resolve(), with_groups=False)
        File "/tmp/pip-build-env-ctqbwd3c/overlay/lib/python3.10/site-packages/poetry/core/factory.py", line 60, in create_poetry
          raise RuntimeError("The Poetry configuration is invalid:\n" + message)
      RuntimeError: The Poetry configuration is invalid:
        - data.repository must be uri

      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

I tried removing the lock file, but that did not fix the issue. I also tried different paths for the repository variable in the pyproject.toml file.

I don't know how to solve the issue. Thank you for your help.
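
The "data.repository must be uri" message in the pip output suggests that the repository field in pyproject.toml is expected to be a URL rather than a filesystem path. A minimal sketch for checking this before running pip install is shown here. It only approximates the rule and is not Poetry's own validator. The function names and the example.org URL are hypothetical, and the sketch assumes Python 3.11+ for tomllib (on Python 3.10 the third-party tomli package provides the same interface).

```python
from urllib.parse import urlparse

try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport with the same API


def looks_like_uri(value: str) -> bool:
    """Rough check (an approximation): a URI needs a scheme such as https."""
    parsed = urlparse(value)
    return bool(parsed.scheme) and bool(parsed.netloc or parsed.path)


def check_repository_field(pyproject_text: str) -> str:
    """Report whether [tool.poetry].repository looks like a URI."""
    data = tomllib.loads(pyproject_text)
    repository = data.get("tool", {}).get("poetry", {}).get("repository")
    if repository is None:
        return "no repository field set"
    if isinstance(repository, str) and looks_like_uri(repository):
        return "ok"
    return f"not a URI: {repository!r}"


if __name__ == "__main__":
    # Hypothetical snippet: a local path in place of a URL.
    sample = '[tool.poetry]\nname = "demo"\nrepository = "/home/user/project"\n'
    print(check_repository_field(sample))
```

To check a real checkout, the contents of its pyproject.toml can be read as text and passed to check_repository_field.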

LeDilam commented 4 months ago

The issue is fixed, although I don't know exactly what was causing the problem. I uninstalled Poetry (with sudo apt remove python3-poetry), reverted all local changes to the project (one of those changes was an important part of the problem), commented out Dragnet and ExtractNet between lines 54 and 66 in pyproject.toml, and ran pip install inside the venv. After that it worked.