KIT-CMS / Excalibur

Analysis repository for Z+Jet studies

Caching of existing files #49

Open dhaitz opened 8 years ago

dhaitz commented 8 years ago

When lumi/pileup files are already present in the repository, they are still queried because they are not in the local cache. Is this behaviour on purpose? IMHO, if someone else has checked in the files, it would be convenient to simply use them.

maxfischer2781 commented 8 years ago

This is unintended. The intention is for files to be considered regardless of where they come from, unless they are outdated. The cache metadata is stored in a git-friendly way on purpose.

I'm guessing this is caused by storing dependencies as absolute paths. Note that the cache will report why it needs regeneration if its log level is set low enough. Running excalibur.py with --log-level cache:info core:info conf:info should allow debugging this.

dhaitz commented 8 years ago

Hi Max, thanks for the quick reply! I've found that the cache does not check for the presence of the actually needed (lumi/pileup) files, but for that of the cache (.pkl) file. Also, because of https://github.com/artus-analysis/Excalibur/blob/master/cfg/python/configutils.py#L199 the .pkl lumi files are stored in git in data/json/, while the pileup .pkl files are in cache/. I'm not sure whether the .pkl files are supposed to be in git ... Personally, I'd put them in cache/ and have the caching mechanism look for the actual pu/lumi files.

maxfischer2781 commented 8 years ago

The .pkl files must be in git as well. They contain the metadata of the cache. When invoked, the cache must check the current state of its dependencies against their old state. This old state is stored in the .pkl. Implicitly, the cache does check the JSON: both original and derived JSONs are dependencies (e.g. here and here). If the .pkl does not exist, the cache will break at this point already, before checking the JSON.
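To illustrate the comparison described above, here is a minimal sketch. It is not the actual Excalibur cache code: the function names (snapshot, save_metadata, needs_regeneration) and the layout of the pickled metadata are assumptions made for this example. Only the idea of comparing (size, mtime) pairs is taken from the stat_file helper in this thread.

```python
import os
import pickle


def stat_file(path):
    """Return (size, mtime) for path, or (-1, -1) if it does not exist."""
    try:
        file_stat = os.stat(path)
        return file_stat.st_size, file_stat.st_mtime
    except OSError:
        return -1, -1


def snapshot(dependencies):
    """Record the current state of every dependency file."""
    return {path: stat_file(path) for path in dependencies}


def save_metadata(pkl_path, dependencies):
    """Store the dependency state next to the cached result (hypothetical layout)."""
    with open(pkl_path, "wb") as handle:
        pickle.dump(snapshot(dependencies), handle)


def needs_regeneration(pkl_path, dependencies):
    """Return a reason string if the cache is stale, else None."""
    try:
        with open(pkl_path, "rb") as handle:
            old_state = pickle.load(handle)
    except (OSError, pickle.UnpicklingError, EOFError):
        # without the metadata there is no old state to compare against
        return "no metadata: %s" % pkl_path
    for path, state in snapshot(dependencies).items():
        if state == (-1, -1):
            return "missing dependency: %s" % path
        if old_state.get(path) != state:
            return "changed dependency: %s" % path
    return None
```

In this sketch a missing .pkl is reported before any dependency is looked at, which mirrors the behaviour described above: without the stored old state there is nothing to compare the JSON files against.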

I agree that having some things in cache/ and others in data/ is not clean, especially when it comes to figuring out what to commit and what not. I'll write up how this could be resolved soon.

maxfischer2781 commented 8 years ago

If somebody has the time (*hint*hint) there's an alternative: have a global and a local cache mode. The main problem is that putting cache/generated files into git is very messy. Both git and the cache do a kind of VCS, but the former is persistent while the latter is volatile. What is actually needed is a way to share caches.

The functionality for this is already there, but it needs cleanup and consolidation. Required changes:

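import os

# sketch only: cache_dir, dependency_files and dependency_folders are assumed
# to be defined elsewhere in cached_query (not shown here)
def cached_query(*args, **kwargs):
    try:
        # use the shared cache location if EXCALIBURCACHE is set
        cache_dir = os.path.join(os.environ['EXCALIBURCACHE'], cache_dir)
    except KeyError:  # raised by os.environ if the variable is not set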
        # local cache
        cache_dir = os.path.join(getPath(), "cache", cache_dir)
        # convert *true* relative paths
        dependency_files = [get_relsubpath(dpath) for dpath in dependency_files]
        dependency_folders = [get_relsubpath(dpath) for dpath in dependency_folders]
    else:  # this triggers if no exception is raised
        # global cache
        # purge relative paths
        dependency_files = [os.path.abspath(dpath) for dpath in dependency_files]
        dependency_folders = [os.path.abspath(dpath) for dpath in dependency_folders]
    # define this once cache_dir is set
    def stat_file(file_path):
        """Get a comparable representation of file validity"""
        try:
            file_stat = os.stat(file_path % {'cache_dir': cache_dir})  # resolve cache_dir dynamically
            return file_stat.st_size, file_stat.st_mtime
        except OSError:
            return -1, -1
    # run the rest of cached_query
    # ...

I guess that should do the trick, but there's only so much coffee in one day...
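For anyone who wants to try the idea in isolation, the global/local switch can be sketched as a standalone function. This is only an illustration under assumptions: resolve_cache, its parameters and the injected environ mapping are made up for this example, repo_root stands in for getPath(), and os.path.relpath stands in for get_relsubpath, whose real behaviour may differ. Only the EXCALIBURCACHE variable and the cache/ subdirectory come from the snippet above.

```python
import os


def resolve_cache(cache_name, dependencies, repo_root, environ=None):
    """Pick the cache directory and normalise the dependency paths.

    Hypothetical helper: repo_root replaces getPath(), and environ can be
    injected for testing (defaults to os.environ).
    """
    environ = os.environ if environ is None else environ
    try:
        cache_root = environ["EXCALIBURCACHE"]
    except KeyError:
        # local cache inside the repository: store dependencies relative
        # to the repository root so the metadata holds in any checkout
        cache_dir = os.path.join(repo_root, "cache", cache_name)
        deps = [
            os.path.relpath(os.path.abspath(path), repo_root)
            for path in dependencies
        ]
    else:
        # global (shared) cache: relative paths would be ambiguous there,
        # so store absolute paths instead
        cache_dir = os.path.join(cache_root, cache_name)
        deps = [os.path.abspath(path) for path in dependencies]
    return cache_dir, deps
```

Keeping the environment lookup in one place means the rest of the cache code never needs to know which mode it is running in. It just receives a cache directory and a list of dependency paths.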

dhaitz commented 8 years ago

For now, I stuck to a simple fix :) https://github.com/artus-analysis/Excalibur/commit/6b88c754 More advanced solutions may follow after the holiday.

maxfischer2781 commented 8 years ago

Since both Dominik and I are no longer at EKP, please let me know if help is required on this. I don't have a working test environment anymore.