sphuber / aiida-shell

AiiDA plugin that makes running shell commands easy.
MIT License

Control caching #81

Closed giovannipizzi closed 9 months ago

giovannipizzi commented 9 months ago

I have a use case where I turned on caching by default. I run the same ShellJob twice; it checks for the existence of something on disk (say, a file, but it could equally be a package in a venv). Between the two runs, I create that file (or install that package).

Since the inputs of the check are the same from AiiDA's point of view, the second run is taken from the cache (it took me some time to realize this :-) great that we now have the tickbox in verdi process list!)
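To make the failure mode concrete, here is a minimal toy model in plain Python. It is not AiiDA's caching implementation: the `_cache` dictionary, `input_hash` and `run_check` are hypothetical names invented for this sketch. It only illustrates that a cache keyed on the inputs alone cannot notice that the state on disk changed between two runs.

```python
import hashlib
import json
import os
import tempfile

# Toy stand-in for a cache: maps a hash of the inputs to a stored result.
_cache = {}


def input_hash(command, arguments):
    """Hash only the declared inputs, not the state of the filesystem."""
    payload = json.dumps({"command": command, "arguments": arguments}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def run_check(path, use_cache=True):
    """Return (exists, from_cache) for a file-existence check."""
    key = input_hash("check-exists", [path])
    if use_cache and key in _cache:
        return _cache[key], True  # served from cache, nothing is executed
    result = os.path.exists(path)
    _cache[key] = result
    return result, False


with tempfile.TemporaryDirectory() as tmp:
    marker = os.path.join(tmp, "marker.txt")
    first = run_check(marker)  # file is absent, check really runs
    open(marker, "w").close()  # the state on disk changes between runs
    second = run_check(marker)  # identical inputs, so the stale answer is reused
    third = run_check(marker, use_cache=False)  # cache bypassed, fresh answer
```

In this sketch `second` still reports the file as missing because its inputs hash to the same key as `first`, while `third` sees the file once the cache is bypassed.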

I now decided to run everything inside a with disable_caching() block. However, I'd rather have some way (e.g. something in metadata options, or some specific input of a ShellJob) that allows me to enable or disable caching for a specific job.
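A minimal sketch of the context-manager workaround, assuming aiida-core 2.x with aiida-shell installed and a default profile already configured. The helper name `run_uncached` is made up for this example; `disable_caching` and `launch_shell_job` are the existing entry points.

```python
def run_uncached(command, arguments):
    """Run a shell command through aiida-shell with caching switched off."""
    # Imported lazily so that this module can be loaded without AiiDA installed.
    from aiida import load_profile
    from aiida.manage.caching import disable_caching
    from aiida_shell import launch_shell_job

    load_profile()  # assumes a default profile has been set up beforehand

    # Caching is disabled for every process launched inside this block,
    # not just for this one job.
    with disable_caching():
        results, node = launch_shell_job(command, arguments=arguments)
    return results, node
```

A call could look like `results, node = run_uncached("ls", ["some_file"])`, where `some_file` is a placeholder. The drawback is the one described above: the decision lives in the calling code around the job rather than in the inputs of the job itself.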

I think this is important: since a ShellJob is very general, only the person who runs it knows whether that specific run is cacheable. Is this already possible, and if so, how (and should an example be added to the docs)? If not, can it be added?

sphuber commented 9 months ago

That makes sense, but I think that is actually a question for aiida-core. This feature has been requested before, see https://github.com/aiidateam/aiida-core/issues/5102. As you can see in my final comment there, I think it makes sense and I would be willing to add the implementation. Essentially, I think we should add an option to the CalcJob class, like disable_cache, that, when set to True, disables the cache for that calculation regardless of any caching configuration or local caching context managers.
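The proposed precedence can be sketched as a small toy model. This is not aiida-core code: `CachingState` and its methods are hypothetical names, and the only rule taken from the comment above is that a per-calculation flag wins over both the configuration and any active context manager.

```python
from contextlib import contextmanager


class CachingState:
    """Toy model of the layers that decide whether a cached result may be used."""

    def __init__(self, enabled_by_config):
        self.enabled_by_config = enabled_by_config  # profile-wide setting
        self.override = None  # None means no context manager is active

    @contextmanager
    def _set(self, value):
        previous = self.override
        self.override = value
        try:
            yield
        finally:
            self.override = previous  # restore on exit, even after an error

    def enable(self):
        return self._set(True)

    def disable(self):
        return self._set(False)

    def should_use_cache(self, disable_cache=False):
        # 1. The per-calculation flag wins over everything else.
        if disable_cache:
            return False
        # 2. Otherwise an active context manager decides.
        if self.override is not None:
            return self.override
        # 3. Otherwise fall back to the configuration.
        return self.enabled_by_config
```

With this ordering, a job marked `disable_cache=True` is never served from the cache, even inside an enabling context manager, which is what lets the person launching a ShellJob mark one specific run as non-cacheable.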

bilke commented 7 months ago

Dear @sphuber, I tried the following:

results, node = launch_shell_job(
    "some_tool",
    arguments="some_arg",
    submit=True,
    parser=some_parser,
    metadata={
        "options": {
            "computer": computer,
        },
        "disable_cache": True,  # <==
    },
)

But I get the following error:

ValueError: Error occurred validating port 'inputs.metadata': Unexpected ports {'disable_cache': True}, for a non dynamic namespace

How can I use the disable_cache metadata with launch_shell_job()? Thanks a lot!

sphuber commented 7 months ago

This is because that feature hasn't been released yet. You would have to install the main branch of the aiida-core repo. Note that this is typically not recommended for production databases, as there can be intermediate changes that are not guaranteed to be backwards compatible once the actual release is made. This is especially the case when it comes to caching and database migrations. There are quite a few changes to the caching mechanism on the current main branch that will be released soon with v2.6. So if you move to main, the disable_cache option will be available, but your old calculations may no longer be valid cache sources.

bilke commented 7 months ago

Ah I see... thanks a lot!