UAL-RE / ReBACH

Python-based tool to enable data preservation to a cloud-hosted storage solution
MIT License

Bug: Bag gets uploaded despite failed folder validation #89

Closed zoidy closed 10 months ago

zoidy commented 10 months ago

Description of the bug

Sometimes, a bag is created and uploaded despite the preservation package folder validation failing.

Steps To Reproduce

On staging, process article ID 7873476. Log output:

2023-10-27 19:16:44,205:INFO: ------- Processing article 7873476 version 2.
2023-10-27 19:16:44,205:INFO: Pre-processing script finished successfully.
2023-10-27 19:16:44,226:INFO: Checking if /opt/redata/mnt/preservation_staging/7873476_XXXXXXXX exists.
2023-10-27 19:16:44,267:INFO: Exists and is not empty, checking contents.
2023-10-27 19:16:44,329:INFO: Comparing Figshare file hashes against existing local files.
2023-10-27 19:16:44,371:INFO: /830051024_salaries-ipeds.csv file exists (hash match).
2023-10-27 19:16:44,391:ERROR: /opt/redata/mnt/preservation_staging/7873476_v02_XXXXXXX/v02/DATA/830054620_salaries-ipeds.csv does not exist.
2023-10-27 19:16:44,392:ERROR: Validation failed, deleting /opt/redata/mnt/preservation_staging/7873476_v02_XXXXXXX.
2023-10-27 19:16:44,532:INFO: /opt/redata/mnt/preservation_staging/7873476_v02_XXXXXXX deleted due to failed validations.
2023-10-27 19:16:44,533:INFO: Checking required files exist in associated curation folder /opt/redata/mnt/curation_testing/4.Published/.
2023-10-27 19:16:44,533:INFO: Curation files exist. Continuing execution.
2023-10-27 19:16:44,534:INFO: Checking and creating empty directories in preservation storage.
2023-10-27 19:16:44,902:INFO: Copying curation UAL_RDM files to preservation UAL_RDM folder.
2023-10-27 19:16:46,005:INFO: Copied curation files to preservation folder.
2023-10-27 19:16:46,005:INFO: Saving json in metadata folder for each version.
2023-10-27 19:16:46,172:INFO: Config file: bagger/config/default.toml
2023-10-27 19:16:46,172:INFO: Overriding bagger log file location logs with /opt/redata/mnt/logs from ReBACH Env file
2023-10-27 19:16:46,173:INFO: Processing preservation package '7873476_v02_XXXXXXX'
19:16:47 -     INFO: Job succeeded: 7873476_v02_XXXXXXXX.tar
2023-10-27 19:16:47,984:INFO: Status: SUCCESS.
2023-10-27 19:16:47,984:INFO: Exit code: 0.
2023-10-27 19:16:47,984:INFO: Preservation package '7873476_v02_XXXXXX' processed successfully

After the validation error, the preservation package folder is deleted, so the files should be redownloaded. However, no attempt is made to redownload them. The result is an incorrect bag with missing files, even though the log reports a successful bag upload.
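The expected behavior could be sketched roughly as follows. This is a minimal illustration, not ReBACH's actual code: the function names (`md5_of`, `folder_is_valid`, `prepare_package`), the `download` callback, and the use of MD5 as the hash are all assumptions made for the example. The point is that bagging should be gated on a validation that runs after any redownload, rather than continuing once the folder has been deleted.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Callable, Dict


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def folder_is_valid(data_dir: Path, expected: Dict[str, str]) -> bool:
    """True only if every expected file exists and its hash matches."""
    for name, expected_md5 in expected.items():
        candidate = data_dir / name
        if not candidate.is_file() or md5_of(candidate) != expected_md5:
            return False
    return True


def prepare_package(
    data_dir: Path,
    expected: Dict[str, str],
    download: Callable[[str, Path], None],  # hypothetical download hook
) -> bool:
    """Return True only when data_dir is complete and safe to bag."""
    if data_dir.is_dir() and folder_is_valid(data_dir, expected):
        return True

    # Validation failed (or the folder is missing): discard and fetch again.
    shutil.rmtree(data_dir, ignore_errors=True)
    data_dir.mkdir(parents=True)
    for name in expected:
        download(name, data_dir / name)

    # Validate again after the redownload; the caller must skip
    # bagging and upload when this returns False.
    return folder_is_valid(data_dir, expected)
```

A caller would then create and upload the bag only when `prepare_package(...)` returns `True`, and log an error and skip the article version otherwise.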