Closed by tomdemeyere 5 months ago
Thank you for the report, @tomdemeyere! I haven't seen this behavior before, and it's not immediately obvious to me yet what the root cause is.
I tried a minimal example as a sanity check:
```python
import parsl
from ase.build import bulk
from quacc import job
from quacc.recipes.emt.slabs import bulk_to_slabs_flow

parsl.load()

atoms = bulk("Cu")
future = bulk_to_slabs_flow(
    atoms, job_decorators={"relax_job": job(), "static_job": job()}
)
future.result()
```
This, thankfully, works as expected. I can probably help debug further with a minimal (local) example, but at the moment the cause of the issue evades me.
I have not tried the multi-executor behavior in Parsl before, so that might be an avenue to double-check.
Ah! Your example works on my machine as well.
However, changing `bulk_to_slabs_flow` to `grid_phonon_flow` still fails for me (same error as before). Can you try to reproduce it?
If you provide a minimal reproducible example, I'd be happy to run it and try to sort out the root cause. 👍
```python
import parsl
from ase.build import bulk
from quacc import job
from quacc.recipes.espresso.phonons import grid_phonon_flow

parsl.load()

atoms = bulk("Cu")
future = grid_phonon_flow(
    atoms,
    job_params={
        "relax_job": {"occupations": "smearing", "degauss": 0.01, "smearing": "cold"}
    },
    job_decorators={"relax_job": job(), "ph_init_job": job()},
)
future.result()
```
Should throw something like: `TypeError: 'AppFuture' object is not callable`
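For intuition, the error pattern above can be reproduced in plain Python: calling a future as if it were a function raises a `TypeError`. The following is a hypothetical stdlib mimic of a futures-based `@job` decorator, not quacc's actual implementation:

```python
from concurrent.futures import Future

def job(fn=None, **kwargs):
    # Mimic of a futures-based @job decorator: job() with no function
    # returns a decorator, and the decorated function returns a Future
    # instead of a plain value.
    if fn is None:
        return lambda f: job(f, **kwargs)

    def wrapper(*args, **kw):
        fut = Future()
        fut.set_result(fn(*args, **kw))
        return fut

    return wrapper

@job()
def relax(x):
    return x * 2

future = relax(3)
print(future.result())  # 6

# If flow-internal code expects a callable but receives a future,
# invoking it raises exactly this kind of error:
try:
    future()
except TypeError as exc:
    print(exc)  # 'Future' object is not callable
```

So the error suggests that somewhere inside the flow, an already-submitted future is being passed where an undecorated callable is expected.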
(Also, it seems that there is a wrong type hint in `_grid_phonon_subflow`?)

```python
@subflow
def _grid_phonon_subflow(
    ph_input_data: dict | None,
    ph_init_job_results: str | Path,  # <<< that is probably not correct?
    ph_job: Job,
    nblocks: int = 1,
) -> list[RunSchema]:
```
@tomdemeyere: I think I fixed it? Could you please try again with main and let me know how it goes?
I am going to close this for now since I think it's good. If not, please re-open and ping me!
Sorry for the delay; I have plenty to do these days.
The `TypeError: 'AppFuture' object is not callable` seems to be fixed, although I still get `parsl.errors.NoDataFlowKernelError: Must first load config`.
Having a quick look, it seems that some jobs are not launching on the correct executor. Parsl says:

```
Task 50 will be sent to executor pw
Task 50 submitted for App _ph_recover_job
```

But I did define the executors:
```python
grid_phonon_decorators = {
    "relax_job": job(executors=["pw"]),
    "ph_init_job": job(executors=["ph_init"]),
    "ph_job": job(executors=["ph"]),
    "ph_recover_job": job(executors=["ph_recover"]),
}
```
I will find time to send a minimal example, although this is simply the previous one I sent, but with specified executors.
Thanks for the follow-up! I'll reopen this given that comment. I will try to look at this when I have time as well.
@tomdemeyere: Regarding #1182, I would like to ensure that it is indeed the ideal way to fix this issue, and I am not yet convinced it is.
If you can prepare a complete, minimal reproducible example that I can run locally with Parsl, that would be ideal. The example you shared requires setting some executors, and I don't quite know how to do that offhand. I would like to investigate this issue further. It's clear it's related to the fact that you have modified the `phonon_job`'s decorator instead of the `_ph_recover_job`'s decorator, but the only way I can be sure that I am going down the right path is to have as minimal an example as possible to reproduce things on my local Linux machine. Thanks!
To be clear, I don't think this is really a Parsl decorator issue or a `customize_funcs` issue. If you tried to modify the `_ph_recover_job`'s parameters (if it had any to modify), you'd likely run into a similar error, because you are not modifying the proper function. So, I want to make sure we fix the root cause rather than apply a bandaid solution. The `@subflow` approach you took "works," but in a very roundabout way.
With ThreadPoolExecutor, it seems that Parsl does not print which executor is used for which task; you might be able to see it by placing a smart print statement inside Quacc?
```python
import parsl
from ase.build import bulk
from parsl.config import Config
from parsl.executors import ThreadPoolExecutor
from quacc import job
from quacc.recipes.espresso.phonons import grid_phonon_flow

CONFIG = Config(
    executors=[
        ThreadPoolExecutor(label="pw", max_threads=1),
        ThreadPoolExecutor(label="ph_init", max_threads=1),
        ThreadPoolExecutor(label="ph", max_threads=1),
        ThreadPoolExecutor(label="ph_recover", max_threads=1),
    ],
)

pw_input_data = {
    "control": {"forc_conv_thr": 0.0001},
    "system": {
        "occupations": "smearing",
        "degauss": 0.01,
        "smearing": "cold",
    },
    "electrons": {"conv_thr": 1e-08, "mixing_mode": "TF", "mixing_beta": 0.7},
    "ions": {},
    "cell": {"press_conv_thr": 0.1},
}

ph_input_data = {
    "inputph": {
        "tr2_ph": 1.0e-8,
        "alpha_mix(1)": 0.1,
        "nmix_ph": 8,
        "ldisp": True,
        "nq1": 1,
        "nq2": 1,
        "nq3": 1,
    },
}

grid_phonon_params = {
    "relax_job": {
        "input_data": pw_input_data,
        "kspacing": 1.0,
        "relax_cell": False,
    },
    "ph_job": {
        "input_data": ph_input_data,
    },
}

parsl.load(CONFIG)

grid_phonon_decorators = {
    "relax_job": job(executors=["pw"]),
    "ph_init_job": job(executors=["ph_init"]),
    "ph_job": job(executors=["ph"]),
    "ph_recover_job": job(executors=["ph_recover"]),
}

future = grid_phonon_flow(
    bulk("Si", cubic=True),
    job_params=grid_phonon_params,
    job_decorators=grid_phonon_decorators,
)
future.result()
```
Thanks, @tomdemeyere. It is still a bit difficult for me to debug with this because running the code as-is yields a successful output. I'm happy to debug this if a minimal example is provided that clearly raises an error that would be resolved with a patch.
Strictly speaking, there is no error anymore; Parsl just uses the wrong executor... let me think about it.
What would you like to report?
I find myself unable to run `quacc.recipes.espresso.phonons.grid_phonon_flow` with custom decorators. When I do, Parsl will throw:

```
parsl.errors.NoDataFlowKernelError: Must first load config
```

Other basic jobs that I redecorate manually with `redecorate` work fine. Any thoughts here?