explosion / weasel

🦦 weasel: A small and easy workflow system

Error when using a file in a git repository as an asset in spacy projects config #66

Closed: b2m closed this issue 7 months ago

b2m commented 1 year ago

Using a file in a git repository as an asset in a spacy project configuration fails because the code expects only directories.

How to reproduce the behaviour

Use a file in a git repository as an asset in the project.yml, as described in the documentation:

path: Path of the file or directory to download, relative to the repo root.

Here is a code example trying to use spacy's citation file as an asset:

assets:
  - dest: "spacy-citation.cff"
    git:
      repo: "https://github.com/explosion/spaCy"
      branch: "master"
      path: "CITATION.cff"

Then run spacy project assets, which fails with the following (reduced) stack trace:

Traceback (most recent call last):
 ...
  File "/venv/lib/python3.8/site-packages/spacy/cli/project/assets.py", line 39, in project_assets_cli
    project_assets(
  File "/venv/lib/python3.8/site-packages/spacy/cli/project/assets.py", line 105, in project_assets
    git_checkout(
  File "/venv/lib/python3.8/site-packages/spacy/cli/_util.py", line 407, in git_checkout
    shutil.copytree(str(source_path), str(dest))
  File "/usr/lib/python3.8/shutil.py", line 555, in copytree
    with os.scandir(src) as itr:
NotADirectoryError: [Errno 20] Not a directory: '/tmp/tmpw_23425/CITATION.cff'

The reason for the error is that shutil.copytree expects a directory and not a single file.
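One way to handle both cases is to check whether the checked-out path is a directory before copying. The snippet below is only a minimal sketch: copy_asset is a hypothetical helper name, not a function in spaCy or weasel, and this is not necessarily how the project fixed the problem.

```python
import shutil
from pathlib import Path


def copy_asset(source_path: Path, dest: Path) -> None:
    """Copy a checked-out asset to dest (hypothetical helper, for illustration only)."""
    if source_path.is_dir():
        # Directories can still be copied recursively with copytree.
        shutil.copytree(str(source_path), str(dest))
    else:
        # A single file needs a plain file copy; make sure the parent exists first.
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(str(source_path), str(dest))
```

With a branch like this, the CITATION.cff asset from the example above would go through shutil.copy2 instead of shutil.copytree, so the NotADirectoryError would not be raised.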

danieldk commented 1 year ago

Thank you for reporting this issue! I can reproduce it and am currently working on a fix.

adrianeboyd commented 1 year ago

We would need to port https://github.com/explosion/spaCy/pull/12181 to weasel.