nextstrain / nextclade_data

Datasets for https://github.com/nextstrain/nextclade
https://clades.nextstrain.org

`./scripts/rebuild` fails when run with new dataset as "FileNotFoundError: [Errno 2] .../nextclade_data/data_temp/nextstrain__ebola__zaire__unreleased.zip'" #183

Closed: corneliusroemer closed this issue 6 months ago

corneliusroemer commented 6 months ago

I am trying to build `data_output` locally using `./scripts/rebuild`.

I installed `repro_zipfile` with pip.

But there seems to be an issue when the output zip file doesn't exist yet:

❯ ./scripts/rebuild --input-dir ./data --output-dir ./data_output --allow-dirty && serve -l 3000 --cors ./data_output
INFO: :Adding '.dataset_order' entries to 'collection.json' for the following datasets: 'nextstrain/ebola/zaire'. Please reorder them manually as needed. This order is used when displaying datasets of the collection in the user interface.
Traceback (most recent call last):
  File "/Users/corneliusromer/code/nextclade_data/./scripts/rebuild", line 513, in <module>
    main()
  File "/Users/corneliusromer/code/nextclade_data/./scripts/rebuild", line 248, in main
    collection, release_infos_for_dataset, refs = process_one_collection(
                                                  ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/corneliusromer/code/nextclade_data/./scripts/rebuild", line 339, in process_one_collection
    release_infos = prepare_dataset_release_infos(args, datasets, datasets_from_index_json, collection_dir, tag,
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/corneliusromer/code/nextclade_data/./scripts/rebuild", line 404, in prepare_dataset_release_infos
    create_dataset_package(args, dataset_new, path, tag, dataset_dir)
  File "/Users/corneliusromer/code/nextclade_data/./scripts/rebuild", line 496, in create_dataset_package
    make_zip(out_dir, zip_filename)
  File "/Users/corneliusromer/code/nextclade_data/scripts/lib/fs.py", line 80, in make_zip
    with ReproducibleZipFile(output_zip, "w") as z:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniforge/base/envs/py11/lib/python3.11/zipfile.py", line 1286, in __init__
    self.fp = io.open(file, filemode)
              ^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/Users/corneliusromer/code/nextclade_data/data_temp/nextstrain__ebola__zaire__unreleased.zip'

corneliusroemer commented 6 months ago

This fixes it locally for me: https://github.com/nextstrain/nextclade_data/pull/184/commits/1c312a300b63704a84c18f004861dcba2fb116c0#diff-d5f4f5bd26e03df0bb4c2793947f14867760abf4ead859f75ac14f54e3f12149R81
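
For readers who cannot open the linked commit: the traceback shows `io.open` failing in write mode. Write mode creates a missing file by itself, so this error usually means the parent directory (here `data_temp/`) does not exist. Below is a minimal sketch of that kind of fix, assuming a missing directory really is the cause. It uses the standard-library `zipfile.ZipFile` in place of `ReproducibleZipFile`, and the body of `make_zip` is a hypothetical stand-in, not the actual code in `scripts/lib/fs.py`.

```python
import zipfile
from pathlib import Path


def make_zip(input_dir, output_zip):
    """Zip the files under input_dir into output_zip.

    Hypothetical stand-in for make_zip in scripts/lib/fs.py; the real
    function uses ReproducibleZipFile and may differ in other details.
    """
    input_dir = Path(input_dir)
    output_zip = Path(output_zip)

    # Assumed fix: create the parent directory (e.g. data_temp/) first.
    # Opening a file with mode "w" creates the file itself, but raises
    # FileNotFoundError if any directory in its path is missing.
    output_zip.parent.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(output_zip, "w") as z:
        # Sort entries so the archive order does not depend on the filesystem.
        for path in sorted(input_dir.rglob("*")):
            if path.is_file():
                z.write(path, arcname=path.relative_to(input_dir).as_posix())
    return output_zip
```

With this guard, a call such as `make_zip("data/some_dataset", "data_temp/some_dataset.zip")` (both paths illustrative) should succeed on a fresh checkout where `data_temp/` has not been created yet.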