radiocosmology / alpenhorn

Alpenhorn is a service for managing an archive of scientific data.
MIT License

Rewrite 13/14: Lustre I/O #156

Closed ketiltrout closed 1 year ago

ketiltrout commented 1 year ago

This PR adds the Lustre I/O classes.

LFS wrapper

In alpenhorn.io.lfs.py, I've made a wrapper around calls to the lfs(1) binary. It is not a complete wrapper for that command and is only able to run these commands:

LustreQuota I/O

This is a very simple stepping-stone I/O module on the way to Lustre HSM I/O. It's exactly the same as DefaultIO, except that it uses "lfs quota" to determine free space. This behaviour re-creates the special casing we have in place in alpenhorn-1 for the "cedar_online" node.
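As a rough illustration of the free-space idea above, here is a minimal sketch of turning quota figures into a free-space number. It is not the code in this PR: the function name `free_space_bytes` is hypothetical, and it assumes the usage and limit have already been extracted from the "lfs quota" output as kibibyte counts, with a limit of zero meaning no quota is set.

```python
from typing import Optional

KIB = 1024  # assumption: quota figures are given in kibibytes


def free_space_bytes(limit_kib: int, used_kib: int) -> Optional[int]:
    """Return the space left under the quota, in bytes.

    Returns None when no limit is set (assumed here to be a limit of
    zero), since free space cannot be derived from the quota then.
    """
    if limit_kib <= 0:
        return None
    # Clamp at zero: usage can exceed the limit when over quota.
    return max(limit_kib - used_kib, 0) * KIB
```

Running the lfs command and parsing its output are deliberately left out of the sketch.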

Lustre HSM I/O

The LustreHSM I/O is where all the HSM management occurs. In general, this is done via updated versions of the I/O methods from DefaultIO. It will also automatically release files from disk (see release_files) as needed to keep enough headroom available on the HSM disk so that I/O can continue.
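The headroom idea can be sketched roughly as follows. This is only an illustration, not the logic of release_files: the `FileOnDisk` class and the `files_to_release` function are hypothetical, and the least-recently-accessed-first ordering is an assumed policy.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FileOnDisk:
    """Hypothetical record of a file currently held on the HSM disk."""

    path: str
    size: int  # bytes
    last_access: float  # seconds since the epoch


def files_to_release(
    files: Iterable[FileOnDisk], free_bytes: int, headroom_bytes: int
) -> List[FileOnDisk]:
    """Choose files to release until the headroom would be restored.

    Assumed policy: release the least recently accessed files first.
    """
    shortfall = headroom_bytes - free_bytes
    chosen: List[FileOnDisk] = []
    for f in sorted(files, key=lambda item: item.last_access):
        if shortfall <= 0:
            break  # enough space would be freed already
        chosen.append(f)
        shortfall -= f.size
    return chosen
```

If the disk already has at least the requested headroom free, the function returns an empty list and nothing is released.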

The LustreHSM Group needs two nodes in it: a "LustreHSM" node, used as the "primary node", and a node of any non-LustreHSM type, used as the "secondary node" for small files. The group I/O config may specify the size threshold for small files.
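The routing between the two nodes could look something like the sketch below. The function name `choose_node` and the string return values are hypothetical, and treating a file exactly at the threshold as "small" is an assumption made for the example.

```python
def choose_node(file_size: int, threshold: int) -> str:
    """Pick which node of the group should receive a file.

    Files no larger than the threshold go to the secondary node;
    everything else goes to the primary (LustreHSM) node.
    Assumption: the threshold itself counts as "small".
    """
    return "secondary" if file_size <= threshold else "primary"
```

For example, with a threshold of 1000 bytes, a 500-byte file would go to the secondary node and a 5000-byte file to the primary node.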

ketiltrout commented 1 year ago

I've rewritten some docstrings.

It's good to have something that can test the file-latent parts of the I/O layer (i.e. ready/not ready). There's nothing actually Cedar or CHIME specific here anymore except maybe the name. Would it be better to rename this from Nearline to LustreHSM?

ketiltrout commented 1 year ago

I've renamed:

ljgray commented 1 year ago

Changes look good. I noticed one small typo, line 163 in auto_import, in ddb9e40 to 3bfbecb which I must have missed when reviewing PR 12/14. It just says copy.last_upate instead of copy.last_update.

ketiltrout commented 1 year ago

I think I added that after you reviewed it. More important, though, is why the test suite didn't catch it.

ketiltrout commented 1 year ago

Fixed, and the test suite has been updated to catch this kind of error.

ljgray commented 1 year ago

Looks great!

ketiltrout commented 1 year ago

I've obsessed a tiny bit over the output from lfs quota and ended up both fixing the parsing of that output and making the following changes:

Diff: https://github.com/radiocosmology/alpenhorn/compare/5bfa602..26696a292c3de6d9516d03cc03932ca54d79a467