HERA-Team / librarian

The HERA Librarian.
BSD 2-Clause "Simplified" License

Add "verify store" feature #14

Open pkgw opened 8 years ago

pkgw commented 8 years ago

I expect that we're going to want to be able to have the Librarian look through a store and validate its contents: that all of the FileInstances that it knows about are actually there, and (perhaps) that there aren't any files there that are not known FileInstances.
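
To make the two checks concrete, here is a rough sketch of what the comparison could look like. This is not existing Librarian code: `verify_store` and its arguments are hypothetical, and it assumes the known FileInstances can be listed as paths relative to the store root. Directory instances (e.g. UV data) are treated as opaque units rather than walked into.

```python
import os


def _rel(rel_dir, name):
    """Join a walk-relative directory and an entry name into a store-relative path."""
    return name if rel_dir == "." else os.path.join(rel_dir, name)


def verify_store(store_root, known_paths):
    """Hypothetical check of a store against the instances the database knows about.

    known_paths: iterable of store-relative paths (assumed to come from the
    FileInstance table). Returns (missing, unexpected), both sorted lists.
    """
    known = set(known_paths)

    # Check 1: every known instance should exist on disk.
    missing = sorted(
        p for p in known if not os.path.lexists(os.path.join(store_root, p))
    )

    # Check 2: nothing on disk should be unknown to the database.
    unexpected = []
    for dirpath, dirnames, filenames in os.walk(store_root):
        rel_dir = os.path.relpath(dirpath, store_root)
        # Do not descend into directories that are themselves known instances.
        dirnames[:] = [d for d in dirnames if _rel(rel_dir, d) not in known]
        # Note: only stray files are reported; empty unknown directories are not.
        unexpected.extend(
            _rel(rel_dir, f) for f in filenames if _rel(rel_dir, f) not in known
        )

    return missing, sorted(unexpected)
```

Only existence is checked here, so this should be cheap compared with anything that reads file contents.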

You could also imagine a feature to verify the MD5 sums of file instances, but that will be insanely CPU-heavy to calculate, so it should only be done for single instances at a time. If we were feeling fancy, we could validate the MD5 sums of flat files as we streamed them out to clients via the /stream/ HTTP handlers, although this won't work with directories (i.e. UV data) without some serious deep magic.
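
For the flat-file case, the streaming idea might look something like the sketch below. The names `stream_and_verify` and `ChecksumMismatch` are made up for illustration and are not part of the current /stream/ handlers; the only assumption is that the handler can be fed from a generator of byte chunks and that the expected MD5 is available as a hex string.

```python
import hashlib


class ChecksumMismatch(Exception):
    """Raised when the streamed bytes do not hash to the recorded MD5."""


def stream_and_verify(fileobj, expected_md5, chunk_size=1 << 20):
    """Yield chunks from a binary file object, hashing them along the way.

    Hypothetical helper: once the file is exhausted, the accumulated MD5 is
    compared with expected_md5 (hex string) and a mismatch raises.
    """
    digest = hashlib.md5()
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)  # reuse the read we are already doing
        yield chunk
    if digest.hexdigest() != expected_md5:
        raise ChecksumMismatch(
            "expected %s, got %s" % (expected_md5, digest.hexdigest())
        )
```

One caveat: by the time a mismatch is detected, the bytes have already gone out to the client, so this would mostly be useful for flagging a bad instance rather than for preventing a bad download.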

pkgw commented 7 years ago

Also, you could imagine teaching the Librarian to poke around attempting to verify files when the system load is low. Most Librarian usage is very bursty, so it spends a lot of time sitting around doing nothing. Thus far, though, this has not felt like an urgent feature.
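
A possible shape for the "is the system idle?" test, purely as a sketch: `system_is_idle` and its threshold are hypothetical, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def system_is_idle(max_load_per_cpu=0.25):
    """Return True if the 1-minute load average per CPU is below the threshold.

    Hypothetical helper; the 0.25 default is an arbitrary placeholder, not a
    tuned value.
    """
    load1, _, _ = os.getloadavg()  # Unix only
    return load1 / (os.cpu_count() or 1) < max_load_per_cpu
```

A background task could call this before verifying a single instance and back off whenever it returns False.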

pkgw commented 7 years ago

From my efforts to quasi-sync data from qmaster to NRAO, it looks as if we have some examples of the Librarian and the disk getting out of sync. For instance, file zen.2457471.19477.yx.uvc.autos.png is supposed to have an instance on pot1, but it doesn't.