ecmwf / pyflow

A high level Python interface to ecFlow allowing the creation of ecFlow suites in a modular and "pythonic" way
https://pyflow-workflow-generator.readthedocs.io/en/latest/
Apache License 2.0

[question] Why is the task script wrapped in a `nopp/end` block? #14

Closed kinow closed 6 months ago

kinow commented 1 year ago

Hi,

Today I tried to run a PyFlow-generated suite that uses variables like ECF_HOME, but noticed that their values were not being substituted. Then I saw in ecflow_ui that PyFlow had placed the script after a %nopp.

I couldn't find anything about it in the docs, only in the notebooks, where it says:

The script proper is placed within a %nopp / %end pair. As such, explicit access to ecFlow pre-processing is not available in the script object.

Is there a workaround if my script depends on variables like ECF_HOME, ECF_TRYNO, or user created variables like CHUNK, WORKFLOW_GRAPH, etc?

Cheers, Bruno

corentincarton commented 1 year ago

Hi @kinow, indeed by design in pyflow, you can't use ecflow variable syntax in the user defined script (i.e. the part that is not generated by pyflow). The goal is to be able to test parts of scripts in isolation on the command line, which you can't do with ecflow variables. You can still use ecflow variables, but you need to use them as regular shell variables in the script ($CHUNK, $WORKFLOW_GRAPH, etc.). Pyflow will scan the script for shell variables and try to match them with ecflow variables. If an ecflow variable matches the shell variable, pyflow will add the following before the %nopp region:

export CHUNK="%CHUNK%"
export WORKFLOW_GRAPH="%WORKFLOW_GRAPH%"

Note that we still don't support ecflow generated variables (such as $ECF_DATE, $ECF_HOST, $FAMILY, etc.) but we will in the future.
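
The scan-and-export behaviour described above can be illustrated with a minimal sketch. This is not pyflow's actual implementation: the helper name `wrap_script`, the regular expression, and the set of known variables are all assumptions made for illustration. It only shows the idea of matching shell variables against known ecFlow variables and emitting the exports before the %nopp region.

```python
import re

# Hypothetical helper for illustration only; pyflow's real code may differ.
# Matches $NAME and ${NAME} style shell variable references.
SHELL_VAR = re.compile(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?")


def wrap_script(script, ecflow_variables):
    """Wrap a script in %nopp/%end, preceded by one export line for each
    shell variable that matches a known ecFlow variable."""
    seen = []
    for name in SHELL_VAR.findall(script):
        if name in ecflow_variables and name not in seen:
            seen.append(name)
    exports = ['export {0}="%{0}%"'.format(name) for name in seen]
    return "\n".join(exports + ["%nopp", script.rstrip("\n"), "%end"])


script = 'echo "chunk $CHUNK of ${WORKFLOW_GRAPH} on $ECF_HOST"'
# ECF_HOST is deliberately not in the known set, mirroring the note that
# generated variables are not supported yet.
wrapped = wrap_script(script, {"CHUNK", "WORKFLOW_GRAPH"})
print(wrapped)
```

In this sketch `$ECF_HOST` gets no export line, so inside the %nopp region it would simply be an unset shell variable.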

Cheers, Corentin

kinow commented 1 year ago

Hi @corentincarton,

> Hi @kinow, indeed by design in pyflow, you can't use ecflow variable syntax in the user defined script (i.e. the part that is not generated by pyflow). The goal is to be able to test parts of scripts in isolation on the command line, which you can't do with ecflow variables.

:+1:

> You can still use ecflow variables, but you need to use them as regular shell variables in the script ($CHUNK, $WORKFLOW_GRAPH, etc.). Pyflow will scan the script for shell variables and try to match them with ecflow variables. If an ecflow variable matches the shell variable, pyflow will add the following before the %nopp region:

Ah, got it. So anything I specify in the Task('name', VAR1, VAR2, ...) can be accessed as $VAR1, $VAR2 and Pyflow will take care of assigning the value in the script.

> Note that we still don't support ecflow generated variables (such as $ECF_DATE, $ECF_HOST, $FAMILY, etc.) but we will in the future.

Hmmm, at the moment one variable that I need is ECF_HOME, as I am trying to run an Autosubmit workflow on ecFlow, and it uses an Autosubmit variable (ROOTDIR).

But I think the current working directory ($PWD) when my script is executed is the same place as ECF_HOME. That might work for now, I think.

Thanks a lot for your reply! -Bruno

corentincarton commented 1 year ago

Yes, for the moment it's best to try to replicate the generated variables. We just need to go through the list of generated ecflow variables and add them to the supported variables in pyflow. I'll try to do that in the next release.

colonesej commented 1 year ago

A workaround is to create a dummy variable on your node whose value is a reference to the generated ecFlow variable, for example:

pyflow.Task('t1', variables={'MYHOME': '%ECF_HOME%'})

pyflow will create

export MYHOME=%MYHOME%

%nopp
echo $MYHOME

and ecflow will have a variable

MYHOME=%ECF_HOME%

This is actually a good thing: if you change ECF_HOME in your server configuration, your script will still work.
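
To see why the alias ends up holding the real path, here is a small Python sketch of nested `%VAR%` substitution. It is a simulation, not ecFlow code: the `substitute` function, the pass limit, and the `/path/to/ecf_home` value are assumptions for illustration, and it assumes the pre-processor resolves references repeatedly, which is what the workaround relies on.

```python
import re

# Matches %NAME% tokens, the ecFlow variable syntax used in this thread.
PERCENT_VAR = re.compile(r"%([A-Za-z_][A-Za-z0-9_]*)%")


def substitute(text, variables, max_passes=10):
    """Replace %NAME% tokens repeatedly until the text stops changing.

    Unknown names are left untouched; max_passes guards against cycles.
    """
    for _ in range(max_passes):
        replaced = PERCENT_VAR.sub(
            lambda m: variables.get(m.group(1), m.group(0)), text
        )
        if replaced == text:
            break
        text = replaced
    return text


# Hypothetical values: ECF_HOME stands in for the server-side variable,
# MYHOME is the user-defined alias from the workaround.
variables = {"ECF_HOME": "/path/to/ecf_home", "MYHOME": "%ECF_HOME%"}
line = substitute("export MYHOME=%MYHOME%", variables)
print(line)
```

The first pass turns `%MYHOME%` into `%ECF_HOME%` and the second pass turns that into the path, so the shell script only ever sees a plain `$MYHOME`.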