cthoyt / zenodo-client

A tool for automated uploading and version management of scientific data to Zenodo
MIT License

Check that all filepaths exist before going through the upload #14

Open sgbaird opened 1 year ago

sgbaird commented 1 year ago

e.g.,

res = ensure_zenodo(
    key,
    data=data,
    paths=[
        f"../../data/processed/{task_name_underscore}/sobol_regression.csv",
        f"../../data/processed/{task_name_underscore}/model_metadata.json",
        f"../../models/{task_name_underscore}/surrogate_models.pkl",
        f"../../models/{task_name_underscore}/cv/cross_validation_models_0.pkl",
        f"../../models/{task_name_underscore}/cv/cross_validation_models_1.pkl",
        f"../../models/{task_name_underscore}/cv/cross_validation_models_2.pkl",
        f"../../models/{task_name_underscore}/cv/cross_validation_models_3.pkl",
        f"../../models/{task_name_underscore}/cv/cross_validation_models_4.pkl",
    ],
    sandbox=sandbox,  # remove this when you're ready to upload to real Zenodo
    access_token=access_token,
)

I'm often uploading large files, so it can take ~5-10 minutes before a missing file finally triggers the error.

sgbaird commented 1 year ago

Maybe something like the following (haven't tested):


from pathlib import Path

for fpath in fpaths:
    # Path.exists is an instance method, so wrap the string in Path first
    if not Path(fpath).exists():
        raise FileNotFoundError(fpath)

cthoyt commented 1 year ago

PR welcome! You can add your logic to the following code, between lines 204 and 205: do an existence check for the file and either raise an error or just continue.

https://github.com/cthoyt/zenodo-client/blob/a97de70673d459095e2b0bfd8569ffa009a5d236/src/zenodo_client/api.py#L200-L214

cthoyt commented 1 year ago

@sgbaird any chance you can take a look at this? Should be as easy as putting

if not path.is_file():
    raise FileNotFoundError(path)

like you suggested, between lines 204 and 205.
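
For anyone picking this up, here is a minimal sketch of the check discussed above, written as a standalone helper that runs before any upload begins. The names `find_missing_files` and `check_paths_exist` are hypothetical and not part of zenodo-client. Unlike the one-file-at-a-time snippets above, this version is assumed to be called once with the full list of paths, so it can report every missing file in a single error instead of stopping at the first.

```python
from pathlib import Path
from typing import Iterable, List, Union


def find_missing_files(paths: Iterable[Union[str, Path]]) -> List[Path]:
    """Return every entry in ``paths`` that is not an existing regular file."""
    # Hypothetical helper, not part of zenodo-client.
    return [Path(p) for p in paths if not Path(p).is_file()]


def check_paths_exist(paths: Iterable[Union[str, Path]]) -> None:
    """Raise FileNotFoundError naming all missing files at once."""
    # Hypothetical helper, not part of zenodo-client.
    missing = find_missing_files(paths)
    if missing:
        listing = ", ".join(str(p) for p in missing)
        raise FileNotFoundError(f"{len(missing)} file(s) not found: {listing}")
```

Until something like this lands in the library, the same check can be run on the caller's side by passing the `paths` list to `check_paths_exist` just before calling `ensure_zenodo`, so a typo in a path fails immediately rather than after several minutes of uploading.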