opensciencegrid / xcache

Settings and configurations for an XRootD Caching Proxy
https://opensciencegrid.org/docs/data/stashcache/install-cache/
Apache License 2.0

Xcache consistency check #83

Closed ddavila0 closed 4 years ago

edquist commented 4 years ago

> /var/lib seems like a strange place to store the DB. /var/cache/ seems more appropriate. Thoughts @matyasselmeci @edquist ?

/var/lib/<PACKAGE> is actually the normal place for (persistent) DBs to live. /var/cache is for temporary stuff that can get blown away (eg on reboot) without anybody getting upset.

brianhlin commented 4 years ago

> /var/lib/<PACKAGE> is actually the normal place for (persistent) DBs to live. /var/cache is for temporary stuff that can get blown away (eg on reboot) without anybody getting upset.

Doesn't the rpmdb live under /var/cache? I wouldn't imagine that this DB is more sacred than the rpmdb but @ddavila0 would be able to comment

matyasselmeci commented 4 years ago

No, rpmdb lives in /var/lib/rpm. The yum cache lives in /var/cache.

edquist commented 4 years ago

> Doesn't the rpmdb live under /var/cache? I wouldn't imagine that this DB is more sacred than the rpmdb but @ddavila0 would be able to comment

/var/cache/yum is a cache of the repodata for external repos, which yum uses until they expire to avoid having to download the repodata from all the configured repos for every single yum command. But the point is, yum is able to regenerate this cache every time you invoke yum, if for instance you blew away the cache after each time you ran yum.

The rpm database, on the other hand, lives under /var/lib/rpm, and if it got wiped out, the system would have no way of reconstructing that database to know which packages were installed, etc.

FHS spec for /var/cache: https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch05s05.html

edquist commented 4 years ago

The point of all this is that the distinction between /var/lib and /var/cache is not that databases can only go in one and not the other. What matters is the purpose of the database. If it is a temporary cache that can be blown away and regenerated every time the program is run, then /var/cache/<PACKAGE> is an appropriate location. If the contents of the database need to persist between invocations of a program, then it belongs under /var/lib/<PACKAGE>.
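To make the rule of thumb concrete, here is a minimal sketch of that decision in Python. The function name `state_dir` and the `cheap_to_regenerate` flag are hypothetical, made up for illustration; nothing like this exists in the package.

```python
from pathlib import PurePosixPath


def state_dir(package: str, cheap_to_regenerate: bool) -> PurePosixPath:
    """Pick a parent directory for a program's database (illustrative only).

    Data that can be rebuilt on every run goes under /var/cache;
    data that must survive between runs goes under /var/lib.
    """
    base = "/var/cache" if cheap_to_regenerate else "/var/lib"
    return PurePosixPath(base) / package
```

With this sketch, the yum repodata cache maps to /var/cache/yum and the rpm database to /var/lib/rpm, which matches where they actually live.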

ddavila0 commented 4 years ago

About the DB, I think it would be better to keep it in /var/lib/. Even though the data in the DB can be restored, doing so could take weeks. After the first execution, the state of all files in the cache is stored there, and subsequent executions only analyze files that have changed.
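For readers following along, the incremental idea described above can be sketched roughly like this. This is not the actual xcache-consistency-check code. The table name `file_state`, the function `changed_files`, and the use of size plus mtime as the "state" of a file are all assumptions for illustration, and the sketch is written for Python 3 even though the package is installed with pip2.

```python
import os
import sqlite3


def changed_files(db_path, cache_root):
    """Return files under cache_root whose size or mtime differ from the last run.

    The state recorded by earlier runs lives in a SQLite DB at db_path
    (hypothetical schema), so losing that DB forces a full re-scan.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_state "
        "(path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER)"
    )
    changed = []
    for dirpath, _dirs, files in os.walk(cache_root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            row = conn.execute(
                "SELECT size, mtime_ns FROM file_state WHERE path = ?", (path,)
            ).fetchone()
            # New file (row is None) or recorded state no longer matches.
            if row != (st.st_size, st.st_mtime_ns):
                changed.append(path)
                conn.execute(
                    "INSERT OR REPLACE INTO file_state VALUES (?, ?, ?)",
                    (path, st.st_size, st.st_mtime_ns),
                )
    conn.commit()
    conn.close()
    return sorted(changed)
```

On the first run every file is reported as changed; later runs report only new or modified files. That is why wiping the DB is expensive even though nothing is lost for good.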

matyasselmeci commented 4 years ago

I don't think you should commit binary files to the Git repo. Here's what I suggest instead:

or, since you're doing that for multiple .whl files, you can decrease the repetition by doing

for whl in %{SOURCE1} %{SOURCE2} %{SOURCE3} ...; do
    pip2 install -I --no-deps "$whl" --root %{buildroot}/usr/lib/xcache-consistency-check
done

Does that make sense?

ddavila0 commented 4 years ago

@matyasselmeci I removed the wheel files from the repo and added them as Sources in the rpm, but now travis cannot find them and fails. I guess I could add them back to the repo and mount the python-deps into the container so that travis can find them, but then we would be back at the initial issue of having binaries in the repo. Is there any way to work around this?

matyasselmeci commented 4 years ago

@brianhlin, do your changes still need addressing, or can we merge this?

ddavila0 commented 4 years ago

@matyasselmeci, in the last meeting @brianhlin said that if you were ok with it we should merge it, didn't he?