materialsproject / emmet

Be a master builder of databases of material properties. Avoid the Kragle.
https://materialsproject.github.io/emmet/

Cli backup expansion #891

Closed tsmathis closed 9 months ago

tsmathis commented 9 months ago

This PR expands on the backup function of the emmet cli by:

  1. Modifying the logic for verifying the validity of block launchers that have already been archived
    • the default behavior of the --check flag remains unchanged: the entire .tar archive for a block launcher is checked as a single unit, and the check hard fails if verification fails
    • there is now an additional --exhaustive flag that checks the validity of each individual launcher in a block's .tar archive and adds only the launchers that pass verification to a "safe to remove" list, which is then cleaned up (if --clean is active). This "soft fail" method 1) prevents cleaning of any launchers that fail verification, while 2) allowing verification and cleaning to continue for launchers that do pass. Common failures have been due to local file inconsistencies on cfs, e.g., a launcher was cleaned previously, or a user moved or modified files after the initial archiving.

Example output from the new --exhaustive check when paired with --clean:

[Screenshot 2023-11-15 at 12 01 00 PM]

  2. Adding an additional --tar flag that compresses whatever remains of a block launcher on disk after archiving, checking, and cleaning. --tar is only invoked if --clean is also active.
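
For readers who want a feel for the "soft fail" idea behind --exhaustive, here is a minimal sketch. It is not the code in this PR: the function name `exhaustive_check`, the `block_*/launcher_*` archive layout, and the pass criterion (every archived file still exists on disk with the same size) are all assumptions made for illustration.

```python
import tarfile
from collections import defaultdict
from pathlib import Path, PurePosixPath
from typing import Dict, List, Tuple


def exhaustive_check(archive: Path, root: Path) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: return (safe_to_remove, failed) launcher paths.

    Assumes archive member names are relative to `root` and that launcher
    directories are named `launcher_*`; neither is confirmed by the PR.
    """
    # Group the archived regular files by their outermost launcher directory.
    members_by_launcher: Dict[str, List[tarfile.TarInfo]] = defaultdict(list)
    with tarfile.open(archive) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            parts = PurePosixPath(member.name).parts
            idx = next(
                (i for i, p in enumerate(parts) if p.startswith("launcher_")),
                None,
            )
            if idx is None:
                continue  # file is not inside any launcher directory
            members_by_launcher["/".join(parts[: idx + 1])].append(member)

    safe_to_remove: List[str] = []
    failed: List[str] = []
    for launcher, members in sorted(members_by_launcher.items()):
        # Assumed criterion: each archived file exists on disk with equal size.
        passed = all(
            (root / m.name).is_file() and (root / m.name).stat().st_size == m.size
            for m in members
        )
        # Soft fail: a bad launcher is only set aside; the loop keeps going.
        (safe_to_remove if passed else failed).append(launcher)
    return safe_to_remove, failed
```

Only the launchers in the first returned list would be candidates for cleaning; those in the second stay on disk, and one failure does not stop the remaining launchers from being verified.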

This PR also adds new tasks functionality that supplements the backup functionality of the cli: survey

survey is mainly intended as a discovery tool that avoids manual searching, for example when we are given a user's directory to archive and have no idea about its contents (for MP, e.g., a NERSC user who left years ago and left us nothing to go on).

Small example output from survey:

[Screenshot 2023-11-15 at 11 44 14 AM]

codecov-commenter commented 9 months ago

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (1264493) 78.91% compared to head (ea2f21b) 91.29%. Report is 33 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| emmet-core/emmet/core/vasp/calculation.py | 80.00% | 2 Missing :warning: |

Additional details and impacted files

```diff
@@             Coverage Diff             @@
##             main     #891       +/-   ##
===========================================
+ Coverage   78.91%   91.29%   +12.37%
===========================================
  Files          75      138       +63
  Lines        4217    12779     +8562
===========================================
+ Hits         3328    11667     +8339
- Misses        889     1112      +223
```
